Re: Numpy-discussion Digest, Vol 19, Issue 44


Re: Numpy-discussion Digest, Vol 19, Issue 44

Joe Harrington
> Absolutely.  Let's please standardize on:
> import numpy as np
> import scipy as sp

I hope we do NOT standardize on these abbreviations.  While a few may
have discussed it at a sprint, it hasn't seen broad discussion and
there are reasons to prefer the other practice (numpy as N, scipy as
S, pylab as P).  My reasons for saying this go back to my reasons for
disliking lots of hierarchical namespaces at all: if we must have
namespaces, let's minimize the visual and typing impact by making them
short and visually distinct from the function names (by capitalizing
them).

What concerns me about the discussion is that we are still not
thinking like communications and thought-process experts, we are
thinking like categorizers and accountants.  The arguments we are
raising don't have to do, positively or negatively, with the difficult
acts of communicating with a computer and with other readers of our
code.  Those are the sole purposes of computer languages.

Namespaces add characters to code that have a high redundancy factor.
This means they pollute code, make it slow and inaccurate to read, and
make learning harder.  Lines get longer and may wrap if they contain
several calls.  It is harder while visually scanning code to
distinguish the function name if it's adjacent to a bunch of other
text, particularly if that text appears commonly in the nearby code.
It therefore becomes harder to spot bugs.  Mathematical code becomes
less and less like the math expressions we write on paper when doing
derivations, making it harder to interpret and verify.  You have to
memorize which subpackage each function is in, which is hard to do for
those functions that could naturally go in two subpackages.  While
many math function names are obvious, subpackage names are not.  Is it
.stat or .stats or .statistics?  .rand or .random?  .fin or
.financial?  Some functions have this problem, but *every* namespace
name has it in spades.

The arguments people are raising are arguments related to how
emotionally satisfying it is to have a place for everything and
everything in its place, and to know you know everything there is to
know.  While we like both those things, as scientists, engineers, and
mathematicians, they are almost irrelevant to coding.  There is simply
no reduction in readability, writeability, or debugability if you
don't have namespace prefixes on everything, and knowing you know
everything is easily accomplished now with the online categorized
function list.  We can incorporate that functionality into the doc
reading apparatus ("help", currently) by using keywords in ReST
comments in the docstrings and providing a way for "help" and its
friends to list the keywords and what functions are connected to them.
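
The keyword idea above could be sketched roughly as follows. This is only an illustration: the ":keywords:" docstring field and the keyword_index helper are hypothetical names made up for this example, not an existing numpy or "help" feature, and the code is written for a current Python 3.

```python
import re
from collections import defaultdict


def keyword_index(funcs):
    """Map each keyword to the names of the functions that declare it.

    Assumes a hypothetical ':keywords: a, b, c' line in each docstring.
    """
    index = defaultdict(list)
    for func in funcs:
        doc = func.__doc__ or ""
        match = re.search(r"^\s*:keywords:\s*(.+)$", doc, re.MULTILINE)
        if match:
            for keyword in match.group(1).split(","):
                index[keyword.strip()].append(func.__name__)
    return dict(index)


def mean(x):
    """Return the arithmetic mean of a sequence.

    :keywords: statistics, averages
    """
    return sum(x) / len(x)


def median(x):
    """Return the median of a sequence.

    :keywords: statistics
    """
    s = sorted(x)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


# Build the index once; a help-like tool could then list functions by keyword.
index = keyword_index([mean, median])
```

With an index like this, a flat list of function names can still be browsed by category without the category becoming part of every call.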

What nobody has said is "if we have lots of namespaces, my code will
look prettier" or "if we have lots of namespaces, normal people will
learn faster" or "if we have lots of namespaces, my code will be
easier to verify and debug".  I don't believe any of these statements
to be true.  Do you?

Similarly, nobody has said, "if we have lots of namespaces, I'll be a
faster coder".  There is a *very* high obnoxiousness factor in typing
redundant stuff at an interpreter.  It's already annoying to type
N.sin instead of sin, but N.T.sin?  Or worse, np.tg.sin?  Now the
prefix has twice the characters of the function itself!  Most IDL
users *hate* that you have to type "print, " in order to inspect the
contents of a variable.  Yet, with multiple layers of namespaces we'd
have lots more than seven extra characters on most lines of code, and
unlike the IDL mess you'd have to *think* to recall what the right
extra characters were for each function call, unlike just telling your
hands to run the "print, " finger macro once again.

The reasons we all like Python relate to how quick and easy it is to
emit code from our fingertips that is similar to what we are thinking
in our brains, compared to other languages.  The brain doesn't declare
variables, nor run loops over arrays.  Neither does Python.  When we
average up the rows of a 2D array and subtract that average from the
image, we don't first imagine making a new 2D array by repeating the
averaged row, and neither does Python, it just broadcasts behind the
scenes.  I could go on, and so could all of you.  Python feels more
like thought than other languages.
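
As a small illustration of the broadcasting point, here is a sketch in current Python 3 with NumPy. The array contents and variable names are made up for the example; the explicit repetition is shown only to confirm that it gives the same result.

```python
import numpy as np

# A small stand-in "image": 3 rows by 4 columns.
image = np.arange(12, dtype=float).reshape(3, 4)

# Average the rows together: one value per column, shape (4,).
row_mean = image.mean(axis=0)

# Broadcasting subtracts the averaged row from every row of the image.
centered = image - row_mean

# The explicit version builds the repeated 2D array first.
explicit = image - np.tile(row_mean, (3, 1))
same = np.array_equal(centered, explicit)
```

The broadcast form never names the intermediate repeated array, which is the sense in which the code stays close to the thought.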

But now we are talking about breaking this commitment to lightness of
code text, learnability, readability, and debugability by adding layer
upon layer of prefixes to all the functions we write.

There is a vital place for namespaces.  Using import *, or not having
namespaces at all, has unpredictable consequences, especially in the
future when someone may add a function with a name identical to one
you are using to one of the packages you import, breaking existing
code.  Namespaces make it possible for two developers who are not in
communication to produce different packages that contain the same
names, and not worry.  This is critical in open source, so we live
with it or we go back to declaring our functions, as in C.  We can
reduce the impact by sticking with short, distinctive abbreviations
(capital N rather than lowercase np) and by not going hierarchical.
Where we need multiple packages, we should have them at the top level,
and not hierarchical.  I'll go so far as to suggest that if scipy must
have multiple packages within it, we could have them each be their own
top-level package, and drop the "scipy." (or "S.", or "sp.") prefix
entirely.  They can still be tested as a unit and released together if
we want that.  There is no problem with doing it this way that good
documentation does not fix.  I'd rather flatten scipy, however,
because the main reason to have namespaces is still satisfied that
way.  Of course, we should break the docs down as it's currently
packaged, for easier learning and management.  We just don't have to
instantiate that into the language itself.
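
A minimal sketch of the name-collision problem, using two standard-library modules that both define sqrt. The star imports are run through exec into a scratch dictionary only so that the example does not pollute the namespace it runs in; the variable names are illustrative.

```python
import cmath
import math

# With namespaces, both functions coexist and the caller picks one.
real_root = math.sqrt(4)        # float result
complex_root = cmath.sqrt(-4)   # complex result

# With star imports, whichever module is imported last silently wins.
scratch = {}
exec("from math import *\nfrom cmath import *", scratch)
star_sqrt = scratch["sqrt"]
last_import_wins = star_sqrt is cmath.sqrt
```

Swapping the order of the two star imports changes which sqrt the rest of the code sees, without any error or warning.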

What worries me is that the EXPERIENCE of reading and writing code in
Python is not much being raised in this discussion, when it should be
the *key* topic of any argument about the direction of the language.
So, in closing, I'd like to exhort everyone to try harder to think
like a sociologist, psychologist, and linguist in addition to thinking
like a computer scientist, physicist, or mathematician.  A computer
language is a means for communicating with a computer, and with others
who may use the code later.  We use languages like Python over the
much-faster assembly for a single reason: We spend too much time
coding, and it is faster and more accurate for the author and reader
to produce and consume code in Python than in assembly - or any other
language.

Let our guiding principle be to make this ever more true.

--jh--
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: Numpy-discussion Digest, Vol 19, Issue 44

Robert Kern-2
Please do not respond to digest messages. If you want to respond to
messages, subscribe to receive messages individually. Respond to just
the messages you are interested in and keep the Subject lines intact.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

Re: Numpy-discussion Digest, Vol 19, Issue 44

Jarrod Millman
In reply to this post by Joe Harrington
On Wed, Apr 9, 2008 at 11:21 PM, Joe Harrington <[hidden email]> wrote:

> > Absolutely.  Let's please standardize on:
>  > import numpy as np
>  > import scipy as sp
>
>  I hope we do NOT standardize on these abbreviations.  While a few may
>  have discussed it at a sprint, it hasn't seen broad discussion and
>  there are reasons to prefer the other practice (numpy as N, scipy as
>  S, pylab as P).  My reasons for saying this go back to my reasons for
>  disliking lots of hierarchical namespaces at all: if we must have
>  namespaces, let's minimize the visual and typing impact by making them
>  short and visually distinct from the function names (by capitalizing
>  them).

Using single capital letters was discussed and dismissed.  The
standard abbreviations should be at least two letters and they should
follow the Python naming convention for packages (i.e., all
lowercase).  The single upper case letter actually uses two keys
anyway.

Following the convention used by the NumPy C-API and as suggested by
the camelcase spelling, it was agreed to abbreviate numpy as np.
After that we agreed to follow this pattern and name scipy sp.  We
also spoke with John Hunter and some of the matplotlib developers and
agreed that pylab would be abbreviated as plt.  So in summary:

import numpy as np
import scipy as sp
import pylab as plt

While I don't want to shut anyone out of the discussion, I hope we can
just agree to use these standards.  There is quite a bit of work to do
and too few hands and too little time.

Thanks,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

Re: Numpy-discussion Digest, Vol 19, Issue 44

Gael Varoquaux
In reply to this post by Joe Harrington
On Thu, Apr 10, 2008 at 02:21:05AM -0400, Joe Harrington wrote:
> I'll go so far as to suggest that if scipy must
> have multiple packages within it, we could have them each be their own
> top-level package, and drop the "scipy." (or "S.", or "sp.") prefix
> entirely.  

Sounds like C-type namespaces done with reserved prefixes. History has
shown this is a dead end; don't waste your time on it.

Gaël

Re: Numpy-discussion Digest, Vol 19, Issue 44

Matthew Brett
Hi Joe,

I do see your point - and agree that typing and cruft make code harder
to write and read, to some extent.  But I think the point is - and I'm
increasingly finding this - that a judicious use of namespaces and
'from' statements makes the code much easier to read, as in

import numpy as np
from numpy import dot, any, arange, reshape

and so on.  And, personally, I tend to prefer to use lower case for
modules, and the occasional upper case for loop variables and the
like:

for L in lines:
    print L

type thing.  Upper case also draws the eye to the capital letter, so

print N.sin(a)

pulls the eye to the N, so you have to disengage and remind yourself
that it's the sin(a) that is important, whereas:

print np.sin(a)

less so - in my view.  So, as a previous user of 'import numpy as N',
I prefer 'import numpy as np'

Best,

Matthew

Re: Numpy-discussion Digest, Vol 19, Issue 44

Sebastian Haase
On Thu, Apr 10, 2008 at 11:05 AM, Matthew Brett <[hidden email]> wrote:

> Hi Joe,
>
>  I do see your point - and agree that typing and cruft make code harder
>  to write and read, to some extent.  But I think the point is - and I'm
>  increasingly finding this - that a judicious use of namespaces and
>  'from' statements makes the code much easier to read, as in
>
>  import numpy as np
>  from numpy import dot, any, arange, reshape
>
>  and so on.  And, personally, I tend to prefer to use lower case for
>  modules, and the occasional upper case for loop variables and the
>  like:
>
>  for L in lines:
>     print L
>
>  type thing.  Upper case also draws the eye to the capital letter, so
>
>  print N.sin(a)
>
>  pulls the eye to the N, so you have to disengage and remind yourself
>  that it's the sin(a) that is important, whereas:
>
>  print np.sin(a)
>
>  less so - in my view.  So, as a previous user of 'import numpy as N',
>  I prefer 'import numpy as np'
>
>  Best,
>
>  Matthew
>
I hope I won't be excluded from further discussions if I would prefer
to stick with my "single capital" approach for my "every day modules".
I already have a default, "import numpy as N" (and some others of my
own modules) for more than 6 years. All over my code.
I've just gotten really used to it .....
At least Chris Barker seems to like the capital "N" also ;-)

Most important:
            please don't use "from numpy import *"
I like being reminded of the (495 or so) names coming from there ....
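
One way to see what a star import from numpy would bring in is sketched below, assuming a current Python 3 and NumPy. The exact count and the exact set of clashing names depend on the installed NumPy version, so no numbers are given here.

```python
import builtins

import numpy as np

# Names that "from numpy import *" would bind in the importing module.
exported = set(np.__all__)
n_names = len(exported)

# Of those, the ones that would replace Python builtins of the same name.
shadowed = sorted(exported & set(dir(builtins)))
```

Printing shadowed shows which builtins would be replaced; the NumPy versions do not always accept the same arguments as the builtins they replace.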

Cheers,
Sebastian Haase

Re: Numpy-discussion Digest, Vol 19, Issue 44

Hans Meine-4
In reply to this post by Matthew Brett
Am Donnerstag, 10. April 2008 11:05:35 schrieb Matthew Brett:

> type thing.  Upper case also draws the eye to the capital letter, so
>
> print N.sin(a)
>
> pulls the eye to the N, so you have to disengage and remind yourself
> that it's the sin(a) that is important, whereas:
>
> print np.sin(a)
>
> less so - in my view.  So, as a previous user of 'import numpy as N',
> I prefer 'import numpy as np'

+1

--
Ciao, /  /
     /--/
    /  / ANS

Re: Numpy-discussion Digest, Vol 19, Issue 44

Jarrod Millman
In reply to this post by Sebastian Haase
On Thu, Apr 10, 2008 at 2:31 AM, Sebastian Haase <[hidden email]> wrote:
>  I hope I won't be excluded from further discussions if I would prefer
>  to stick with my "single capital" approach for my "every day modules".
>  I already have a default, "import numpy as N" (and some others of my
>  own modules) for more than 6 years. All over my code.
>  I've just gotten really used to it .....
>  At least Chris Barker seems to like the capital "N" also ;-)

I would really like to see everyone starting to follow the same
conventions for naming, documentation, testing, writing extension
code, etc. as much as possible.  When writing new code please use the
agreed upon standard.  No need to spend your time changing your code
now, but maybe you could start developing the habit of using the
standards.  I am sorry that we can't all agree on these standards, but
some of it will obviously just come down to personal tastes--which
means that someone will have to compromise.

>  Most important:
>             please don't use "from numpy import *"
>  I like being reminded of the (495 or so) names coming from there ....

+1, Absolutely!!

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

Re: numpy namespaces

Chris Barker - NOAA Federal
In reply to this post by Joe Harrington

[renaming thread due to digest reply...]

Joe Harrington wrote:

lots of good points...

> What concerns me about the discussion is that we are still not
> thinking like communications and thought-process experts, we are
> thinking like categorizers and accountants.

Well, yes and no. I'm thinking like myself -- measuring usability by my
experience, which, I acknowledge, doesn't necessarily represent anyone
else, but I am very much in the target of numpy users.

> Namespaces add characters to code that have a high redundancy factor.

I don't think it's high redundancy -- where a name comes from is
relevant, particularly when reading code. I don't do it, but let's face
it, lots of folks using code without namespaces preface names with a
prefix for the package it came from, and may even prefix it with a code
for the type of the variable (Hungarian notation, isn't it?). Anyway,
the point is, this is done a lot because people find it useful to have
extra info in the names.

> Lines get longer and may wrap if they contain several calls.

Which is why we want "N" or "np" rather than, say, "Numeric".

> Mathematical code becomes
> less and less like the math expressions we write on paper when doing
> derivations, making it harder to interpret and verify.

This is true, and I suppose why a number of folks do:

from numpy import sin, cos, ......

However, I find there really is a small fraction of my code that's a
math expression.

>  Is it .stat or .stats or .statistics?  .rand or .random?  .fin or
> .financial?  Some functions have this problem, but *every* namespace
> name has it in spades.

I suppose so, but it doesn't feel onerous to me -- I do remember spending
lots of time looking up names in Matlab -- only one namespace, but lots
of names -- as long as we have a way to search across namespaces, I'll
be happy.

> There is simply
> no reduction in readability, writeability, or debugability if you
> don't have namespace prefixes on everything,

I don't think it's that simple. I know for a fact that when _I_ read old
flat namespace code, I spend a lot of time trying to figure out where
the heck a given function came from.

> We can incorporate that functionality into the doc
> reading apparatus

The doc reading apparatus can make up for many of the limitations you
described too. At least we all seem to agree that a powerful doc reading
apparatus is key to usability.

> What nobody has said is "if we have lots of namespaces, my code will
> look prettier"

I'll grant you that, but pretty isn't really my most important criterion.

> or "if we have lots of namespaces, normal people will
> learn faster" or "if we have lots of namespaces, my code will be
> easier to verify and debug".

OK:

If we have more (which is not necessarily lots) of namespaces, my code
will be easier to write, debug and verify.

If we have more namespaces, it will be easier to teach newbies, and,
critically, easier for newbies to become programmers.

> Similarly, nobody has said, "if we have lots of namespaces, I'll be a
> faster coder".  There is a *very* high obnoxiousness factor in typing
> redundant stuff at an interpreter.  It's already annoying to type
> N.sin instead of sin, but N.T.sin?

Maybe we need to be more specific -- are you arguing for "from pylab
import *", or are you arguing for a small number of namespaces:

import numpy as np
import pylab as plt

'cause there can certainly be too much of a good thing!

Anyway, Here is my personal experience: when I started using python some
years ago, I used both:

from Numeric import *
and
from wxPython import *

Now I (and both communities, for the most part) do neither. I am
definitely much happier reading and writing code with the namespaces.

> Most IDL users *hate* that you have to type "print, "...

This is a key issue -- for interactive use, the amount of typing does
become an issue, and flat namespaces do have a real advantage there.
However, I was a major Matlab user for years, and Python user for years
now, and while I do play and test things out at the interactive prompt,
I almost always regret it when I write more than 3 lines at the prompt,
rather than putting them in a file, so it's hard for me to see
interactive use as a priority.

And there is still a place for something like pylab, that dumps a whole
bunch of stuff into a single namespace for interactive use -- I just
won't use it.

> What worries me is that the EXPERIENCE of reading and writing code in
> Python

Frankly, while we aren't, on the whole linguists (come to think of it,
isn't Larry Wall a linguist? -- and look where that led!), neither are
we primarily computer scientists -- I think most of us are basing our
opinions on little else than OUR EXPERIENCE.

A lot of software usability suffers from the fact that the folks writing
the program aren't the target users -- that's not the case here --
we are the target users! Yes, newbies and
less-interested-in-programming-than-us scientists are another target,
but they are not so different. We're not trying to write AppleScript here.

Sebastian Haase wrote:
> I've just gotten really used to it .....
> At least Chris Barker seems to like the capital "N" also ;-)

For what it's worth, yes, that's what I settled on, but I think
standardization is far more important than my minor preference -- "np"
it is for me from now on.


-Chris



--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]

Re: Numpy-discussion Digest, Vol 19, Issue, 44

James Turner-4
In reply to this post by Robert Kern-2
Hi Robert et al.,

> Please do not respond to digest messages. If you want to respond to
> messages, subscribe to receive messages individually. Respond to just
> the messages you are interested in and keep the Subject lines intact.

Just a suggestion, which I hope doesn't annoy anyone :-).

I receive the NumPy and SciPy digests because the amount of traffic
on the lists is so high and my time to keep up with them is so limited
(there is probably a way to filter emails in Thunderbird that would
work, but I still have to figure that out). I found a way to reply to
posts from the digest that is fiddly but avoids the problem that
Robert points out, for occasional posters...

  1. Press reply
  2. Copy the subject line from the original email you want to reply
     to and paste it over the subject line in the new email window.
  3. Copy the "Message-ID" string from the original email you are
     replying to and paste it into both a "References:" line and a
     "In-Reply-To:" line in the message header (where you specify
     the recipients etc.).
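
For anyone scripting replies rather than using a mail client, the three steps above can be sketched with Python's standard email package. The build_reply helper, the example subject, and the example Message-ID are hypothetical; adding "Re: " when it is missing is a common convention assumed here, not something the steps require.

```python
from email.message import EmailMessage


def build_reply(original_subject, original_message_id, body):
    """Build a reply that threads under the original post.

    original_message_id is the Message-ID header value of the post being
    answered, including the angle brackets.
    """
    reply = EmailMessage()
    # Step 2: reuse the original subject line, not the digest's subject.
    if original_subject.lower().startswith("re:"):
        reply["Subject"] = original_subject
    else:
        reply["Subject"] = "Re: " + original_subject
    # Step 3: point both threading headers at the original message.
    reply["In-Reply-To"] = original_message_id
    reply["References"] = original_message_id
    reply.set_content(body)
    return reply


reply = build_reply(
    "numpy namespaces",
    "<example-id@example.org>",
    "Replying to one post from the digest.",
)
```

Mail clients and list archives generally use these two headers, rather than the subject text, to decide where a message sits in a thread.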

I hope I got that right, since it's been a while... In order to get
"References:" and "In-Reply-To:" in the pull-down boxes in Thunderbird,
I had to add an entry in the prefs.js file of my profile as follows:

   user_pref("mail.compose.other.header", "References,In-Reply-To");

Sorry to perpetuate the subject line, but I did keep it intact! Hope
that helps someone (and keeps everyone happy).

Cheers,

James.


Re: Numpy-discussion Digest, Vol 19, Issue, 44

Travis Oliphant-5

James,

Thank you for that very helpful set of suggestions.  I think the main
thing is to keep the subject heading so that we can parse which
conversation the response is targeted to.  

Also, not including the entire digest in the response is useful as well.

We don't want to push occasional posters away (indeed often the most
useful comments will come from them because they may not be caught up in
the history of any particular decision and provide a fresh perspective).

Please post :-)  just use the relevant subject line and filter out non
relevant content in the response.

Best regards,

-Travis O.




> Just a suggestion, which I hope doesn't annoy anyone :-).
>
> I receive the NumPy and SciPy digests because the amount of traffic
> on the lists is so high and my time to keep up with them is so limited
> (there is probably a way to filter emails in Thunderbird that would
> work, but I still have to figure that out). I found a way to reply to
> posts from the digest that is fiddly but avoids the problem that
> Robert points out, for occasional posters...
>
>   1. Press reply
>   2. Copy the subject line from the original email you want to reply
>      to and paste it over the subject line in the new email window.
>   3. Copy the "Message-ID" string from the original email you are
>      replying to and paste it into both a "References:" line and a
>      "In-Reply-To:" line in the message header (where you specify
>      the recipients etc.).
>
> I hope I got that right, since it's been a while... In order to get
> "References:" and "In-Reply-To:" in the pull-down boxes in Thunderbird,
> I had to add an entry in the prefs.js file of my profile as follows:
>
>    user_pref("mail.compose.other.header", "References,In-Reply-To");
>
> Sorry to perpetuate the subject line, but I did keep it intact! Hope
> that helps someone (and keeps everyone happy).
>  


> Cheers,
>
> James.
>


Re: Numpy-discussion Digest, Vol 19, Issue, 44

Steve Lianoglou-6
In reply to this post by James Turner-4
Hi,

> I receive the NumPy and SciPy digests because the amount of traffic
> on the lists is so high and my time to keep up with them is so limited
> (there is probably a way to filter emails in Thunderbird that would
> work, but I still have to figure that out).

Why not just create a filter that (i) marks the message as read, and  
(ii) files it into a numpy folder. You can then just look in there  
whenever you feel like you have the time/inclination and the high  
traffic should go pretty much unnoticed, no?

It's what I do, anyway ... oh, and also having the mails sent to a  
list-only email address helps, too.

-steve
