[Numpy-discussion] "import numpy" is slow

Stéfan van der Walt stefan@sun.ac...
Wed Jul 30 15:59:32 CDT 2008


2008/7/30 Andrew Dalke <dalke@dalkescientific.com>:
> Based on those results I've been digging into the code trying to
> figure out why numpy imports so many files, and at the same time I've
> been trying to guess at the use case Robert Kern regards as typical
> when he wrote:
>
>     Your use case isn't so typical and so suffers on the import
>     time end of the balance

I.e. most people don't start up NumPy all the time -- they import
NumPy, and then do some calculations, which typically take longer than
the import time.
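
To put a number on that balance, one rough approach is to time a fresh
interpreter with and without the import.  This is only a sketch: the
helper names `time_fresh_start` and `import_overhead` are made up for
illustration, and the results will vary with disk cache and machine load.

```python
import subprocess
import sys
import time


def time_fresh_start(code, repeats=5):
    """Best wall-clock time (seconds) to start a new interpreter and run `code`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_overhead(module, repeats=5):
    """Estimated cost of importing `module`, net of bare interpreter startup."""
    baseline = time_fresh_start("pass", repeats)
    with_import = time_fresh_start("import " + module, repeats)
    return with_import - baseline
```

Calling `import_overhead("numpy")` (with numpy installed) gives an
estimate that can be compared against the run time of the calculation
that follows the import.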

> and trying to figure out what code would break if those modules
> weren't all eagerly imported and were instead written as most other
> Python modules are written.

For a benefit of 0.03s, I don't think it's worth it.

> I have two thoughts for why mega-importing might be useful:
>
>   - interactive users get to do tab complete and see everything
>        (eg, "import numpy" means "numpy.fft.ifft" works, without
>         having to do "import numpy.fft" manually)

NumPy has a very flat namespace, for better or worse, which implies
many imports.  This can't be easily changed without modifying the API.

> Is the numpy recommendation that people should do:
>
>   import numpy
>   numpy.fft.ifft(data)

That's the way many people use it.

> ?  If so, the documentation should be updated to say that "random",
> "ma", "ctypeslib" and several other libraries are included in that
> list.

Thanks for pointing that out, I'll edit the documentation wiki.

> Why is the last so important that it should be in the top-
> level namespace?

It's a single Python file -- does it make much of a difference?

> In my opinion, this assistance is counter to standard practice in
> effectively every other Python package.  I don't see the benefit.

How do you propose we change this?

> BTW, the load and save commands in io do an incorrect check.
>
>     if isinstance(file, type("")):
>         fid = _file(file,"rb")
>     else:
>         fid = file

Thanks, fixed.
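
This message does not show the change that was made.  As a hedged
illustration only, written for current Python 3 and not taken from the
numpy source, a check that accepts every kind of path name could look
like this (`open_if_path` is a made-up name):

```python
import io
import os


def open_if_path(file, mode="rb"):
    """Return (file_object, opened_here); open `file` only if it names a path."""
    # str, bytes and os.PathLike are the path types that open() accepts.
    if isinstance(file, (str, bytes, os.PathLike)):
        return open(file, mode), True
    # Otherwise assume a file-like object was passed in and use it as-is.
    return file, False


# A file-like object is passed through untouched.
buffer = io.BytesIO(b"data")
fid, opened_here = open_if_path(buffer)
```

Returning the `opened_here` flag lets the caller close only the files it
opened itself.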

[snip lots of suggestions]

> Getting rid of these functions, and thus getting rid of the import
> speeds numpy startup time by 3.5%.

I appreciate you taking the time to find these niggles, but we are
short on developer time as it is.  Asking developers to spend their
precious time on a 3.5% improvement in startup time does not make
much sense.  If you provide a patch, on the other hand, it would only
take a matter of seconds to decide whether to apply it or not.
You've already done most of the sleuth work.

> I could probably get another 0.05 seconds if I dug around more, but I
> can't without knowing what use case numpy is trying to achieve.  Why
> are all those ancillary modules (testing, ctypeslib) eagerly loaded
> when there seems no need for that feature?

Need is relative.  You need fast startup time, but most of our users
need quick access to whichever functions they want (and often use them
from an interactive terminal).  I agree that "testing" and "ctypeslib"
do not belong in that category, but they don't seem to do much harm
either.

Regards
Stéfan

