[Numpy-discussion] "import numpy" is slow

Robert Kern robert.kern@gmail....
Thu Jul 31 15:02:54 CDT 2008

On Thu, Jul 31, 2008 at 05:43, Andrew Dalke <dalke@dalkescientific.com> wrote:
> On Jul 31, 2008, at 12:03 PM, Robert Kern wrote:

>> But you still can't remove them since they are being used inside
>> numerictypes. That's why I labeled them "internal utility functions"
>> instead of leaving them with minimal docstrings such that you would
>> have to guess.
> My proposal is to replace that code with a table mapping
> the type name to the uppercase/lowercase/capitalized forms,
> thus eliminating the (small) amount of time needed to
> import string.
> It makes adding new types slightly more difficult.
> I know it's a tradeoff.

Probably not a bad one. Write up the patch, and then we'll see how
much it affects the import time.
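
For concreteness, the kind of table being proposed might look roughly
like the sketch below. The table name, the helper names, and the entries
are all hypothetical placeholders chosen for illustration; they are not
the actual contents of numerictypes.

```python
# Hypothetical sketch: precomputed case forms, so that no case-conversion
# helper (and therefore no "import string") is needed at import time.
# The entries below are illustrative, not numpy's real type list.
_CASE_FORMS = {
    # name: (lowercase, uppercase, capitalized)
    "int": ("int", "INT", "Int"),
    "uint": ("uint", "UINT", "Uint"),
    "float": ("float", "FLOAT", "Float"),
    "complex": ("complex", "COMPLEX", "Complex"),
}


def lower_form(name):
    """Return the lowercase form of a known type name."""
    return _CASE_FORMS[name][0]


def upper_form(name):
    """Return the uppercase form of a known type name."""
    return _CASE_FORMS[name][1]


def capitalized_form(name):
    """Return the capitalized form of a known type name."""
    return _CASE_FORMS[name][2]
```

The tradeoff mentioned above shows up directly: a new type needs a new
row in the table, and a name missing from it raises KeyError instead of
being converted on the fly.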

I would much rather discuss concrete changes like this than rehash the
justifications of old decisions. Regardless of the merits of those
decisions (and I agreed with your position at the time), that
conversation is pointless and irrelevant. The decisions were made, and
now we have a user base to whom we have promised not to break their
code so egregiously again. The relevant conversation is what changes
we can make now.

Some general guidelines:

1) Everything exposed by "from numpy import *" still needs to work.
  a) The layout of everything under numpy.core is an implementation detail.
  b) _underscored functions and explicitly labeled internal functions
can probably be modified.
  c) Ask about specific functions when in doubt.

2) The improvement in import times should be substantial. Feel free to
bundle up the optimizations for consideration.

3) Moving imports from module-level down into the functions where they
are used is generally okay if we get a reasonable win from it. The
local imports should be commented, explaining that they are made local
in order to improve the import times.

4) __import__ hacks are off the table.

5) Proxy objects ... I would really like to avoid proxy objects. They
have caused fragility in the past.

6) I'm not a fan of having environment variables control the way numpy
gets imported, but I'm willing to consider it. For example, I might go
for having proxy objects for linalg et al. *only* if a particular
environment variable were set. But there had better be a very large
improvement in import times.
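
As an illustration of guideline 3, a commented local import could look
something like the sketch below. The function is a hypothetical
stand-in, and the standard-library fractions module merely plays the
role of a dependency that is expensive to import and rarely needed; none
of this is taken from numpy itself.

```python
def format_ratio(numerator, denominator):
    """Return the reduced ratio as a string, e.g. "1/2"."""
    # Imported locally rather than at module level to improve import
    # time: only this rarely called helper needs the module.
    from fractions import Fraction

    return str(Fraction(numerator, denominator))
```

The cost of the import is then paid on the first call to the function
rather than when the enclosing module is imported; later calls find the
module already cached in sys.modules.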

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
