[Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

Neil Crighton neilcrighton@gmail....
Thu Apr 10 09:07:13 CDT 2008


Thanks Joe for the excellent post. It mirrors my experience with
Python and Numpy very eloquently, and I think it presents a good
argument against the excessive use of namespaces. I'm not so worried
about N. vs np. though - I use the same method Matthew Brett suggests.
If I'm going to use, say, sin and cos a lot in a script such that all
the np. prefixes would make the code hard to read, I'll use:

import numpy as np
from numpy import sin, cos

To those people who have invoked 'Namespaces are one honking great idea
-- let's do more of those!', I'll cancel that with 'Flat is better than
nested' :)  I certainly wouldn't argue that using namespaces to
separate categories of functions is always a bad thing, but I think it
should only be done as a last resort.
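
As a minimal sketch of the mixed import style above (the array theta and the variable names are only illustrative assumptions, not taken from any real script), the bare names and the prefixed names refer to the same functions, so the two spellings can be mixed freely:

```python
import numpy as np
from numpy import sin, cos

# Illustrative input: five angles between 0 and pi.
theta = np.linspace(0.0, np.pi, 5)

# Fully prefixed form: every call carries the np. prefix.
x_prefixed = np.cos(theta) ** 2 + np.sin(theta) ** 2

# Bare-name form: closer to the expression as written on paper.
x_bare = cos(theta) ** 2 + sin(theta) ** 2

# The bare name is just another reference to the same function object.
same_function = sin is np.sin
```

Everything used only occasionally keeps its np. prefix, so a reader can still see where those names come from.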

Neil

On 10/04/2008, Joe Harrington <jh@physics.ucf.edu> wrote:
>  > Absolutely.  Let's please standardize on:
>  > import numpy as np
>  > import scipy as sp
>
>  I hope we do NOT standardize on these abbreviations.  While a few may
>  have discussed it at a sprint, it hasn't seen broad discussion and
>  there are reasons to prefer the other practice (numpy as N, scipy as
>  S, pylab as P).  My reasons for saying this go back to my reasons for
>  disliking lots of hierarchical namespaces at all: if we must have
>  namespaces, let's minimize the visual and typing impact by making them
>  short and visually distinct from the function names (by capitalizing
>  them).
>
>  What concerns me about the discussion is that we are still not
>  thinking like communications and thought-process experts, we are
>  thinking like categorizers and accountants.  The arguments we are
>  raising don't have to do, positively or negatively, with the difficult
>  acts of communicating with a computer and with other readers of our
>  code.  Those are the sole purposes of computer languages.
>
>  Namespaces add characters to code that have a high redundancy factor.
>  This means they pollute code, make it slow and inaccurate to read, and
>  make learning harder.  Lines get longer and may wrap if they contain
>  several calls.  It is harder while visually scanning code to
>  distinguish the function name if it's adjacent to a bunch of other
>  text, particularly if that text appears commonly in the nearby code.
>  It therefore becomes harder to spot bugs.  Mathematical code becomes
>  less and less like the math expressions we write on paper when doing
>  derivations, making it harder to interpret and verify.  You have to
>  memorize which subpackage each function is in, which is hard to do for
>  those functions that could naturally go in two subpackages.  While
>  many math function names are obvious, subpackage names are not.  Is it
>  .stat or .stats or .statistics?  .rand or .random?  .fin or
>  .financial?  Some functions have this problem, but *every* namespace
>  name has it in spades.
>
>  The arguments people are raising are arguments related to how
>  emotionally satisfying it is to have a place for everything and
>  everything in its place, and to know you know everything there is to
>  know.  While we like both those things, as scientists, engineers, and
>  mathematicians, they are almost irrelevant to coding.  There is simply
>  no reduction in readability, writability, or debuggability if you
>  don't have namespace prefixes on everything, and knowing you know
>  everything is easily accomplished now with the online categorized
>  function list.  We can incorporate that functionality into the doc
>  reading apparatus ("help", currently) by using keywords in ReST
>  comments in the docstrings and providing a way for "help" and its
>  friends to list the keywords and what functions are connected to them.
>
>  What nobody has said is "if we have lots of namespaces, my code will
>  look prettier" or "if we have lots of namespaces, normal people will
>  learn faster" or "if we have lots of namespaces, my code will be
>  easier to verify and debug".  I don't believe any of these statements
>  to be true.  Do you?
>
>  Similarly, nobody has said, "if we have lots of namespaces, I'll be a
>  faster coder".  There is a *very* high obnoxiousness factor in typing
>  redundant stuff at an interpreter.  It's already annoying to type
>  N.sin instead of sin, but N.T.sin?  Or worse, np.tg.sin?  Now the
>  prefix has twice the characters of the function itself!  Most IDL
>  users *hate* that you have to type "print, " in order to inspect the
>  contents of a variable.  Yet, with multiple layers of namespaces we'd
>  have lots more than seven extra characters on most lines of code, and,
>  unlike the IDL mess, you'd have to *think* to recall what the right
>  extra characters were for each function call, rather than just telling
>  your hands to run the "print, " finger macro once again.
>
>  The reasons we all like Python relate to how quick and easy it is to
>  emit code from our fingertips that is similar to what we are thinking
>  in our brains, compared to other languages.  The brain doesn't declare
>  variables, nor run loops over arrays.  Neither does Python.  When we
>  average up the rows of a 2D array and subtract that average from the
>  image, we don't first imagine making a new 2D array by repeating the
>  averaged row, and neither does Python; it just broadcasts behind the
>  scenes.  I could go on, and so could all of you.  Python feels more
>  like thought than other languages.
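
A minimal sketch of the row-average example above, assuming "average up the rows" means averaging over the row axis to get a single averaged row (the array image and the other names are made up for illustration). The broadcast form never builds the repeated-row array; the explicit form is shown only for comparison:

```python
import numpy as np

# Illustrative 3x4 "image"; any 2D float array would do.
image = np.arange(12, dtype=float).reshape(3, 4)

# Average over the row axis: one averaged row of shape (4,).
mean_row = image.mean(axis=0)

# Broadcasting: the (4,) row is applied to every row of the (3, 4) image.
centered = image - mean_row

# What we would otherwise write by hand: repeat the averaged row into a
# full 2D array first, then subtract.
tiled = np.tile(mean_row, (image.shape[0], 1))
centered_explicit = image - tiled
```

Both forms give the same result; the broadcast one simply reads closer to the sentence describing it.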
>
>  But now we are talking about breaking this commitment to lightness of
>  code text, learnability, readability, and debuggability by adding layer
>  upon layer of prefixes to all the functions we write.
>
>  There is a vital place for namespaces.  Using import *, or not having
>  namespaces at all, has unpredictable consequences, especially in the
>  future when someone may add, to one of the packages you import, a
>  function with a name identical to one you are using, breaking existing
>  code.  Namespaces make it possible for two developers who are not in
>  communication to produce different packages that contain the same
>  names, and not worry.  This is critical in open source, so we live
>  with it or we go back to declaring our functions, as in C.  We can
>  reduce the impact by sticking with short, distinctive abbreviations
>  (capital N rather than lowercase np) and by not going hierarchical.
>  Where we need multiple packages, we should have them at the top level,
>  and not hierarchical.  I'll go so far as to suggest that if scipy must
>  have multiple packages within it, we could have them each be their own
>  top-level package, and drop the "scipy." (or "S.", or "sp.") prefix
>  entirely.  They can still be tested as a unit and released together if
>  we want that.  There is no problem with doing it this way that good
>  documentation does not fix.  I'd rather flatten scipy, however,
>  because the main reason to have namespaces is still satisfied that
>  way.  Of course, we should break the docs down as it's currently
>  packaged, for easier learning and management.  We just don't have to
>  instantiate that into the language itself.
>
>  What worries me is that the EXPERIENCE of reading and writing code in
>  Python is not being raised much in this discussion, when it should be
>  the *key* topic of any argument about the direction of the language.
>  So, in closing, I'd like to exhort everyone to try harder to think
>  like a sociologist, psychologist, and linguist in addition to thinking
>  like a computer scientist, physicist, or mathematician.  A computer
>  language is a means for communicating with a computer, and with others
>  who may use the code later.  We use languages like Python over the
>  much-faster assembly for a single reason: We spend too much time
>  coding, and it is faster and more accurate for the author and reader
>  to produce and consume code in Python than in assembly - or any other
>  language.
>
>  Let our guiding principle be to make this ever more true.
>
>  --jh--
>
>

