[Numpy-discussion] slow import of numpy modules

Robert Kern robert.kern@gmail....
Wed Jul 2 23:36:31 CDT 2008


On Wed, Jul 2, 2008 at 23:14, David Cournapeau
<cournapeau@cslab.kecl.ntt.co.jp> wrote:
> On Wed, 2008-07-02 at 21:50 -0500, Robert Kern wrote:
>>
>> So ... what were you referring to?
>
> To an earlier email from Matthieu in this thread (or Stefan?).

Neither one has participated in this thread. At least, no such email
has made it to my inbox.

>> There is special purpose code, yes. We used to use it to load proxy
>> objects for scipy subpackages such that "import scipy" would have
>> scipy.stats semi-immediately available. We have stopped using it
>> because of fragility, confusing behavior at the interpreter, py2exe
>> problems, and my general abhorrence of things which mess too deeply
>> with imports. It is not a general-purpose solution for lazily-loading
>> stdlib modules, I don't think.
>
> I was afraid of something like this.
>
>> > Because we could
>> > save between 20 and 40 % of the import time by lazily importing a few modules
>> > (namely urllib, which I guess is not often used, and already takes
>> > around 20-30 ms; inspect and compiler are taking a long time too, but
>> > maybe those are always needed, I have not checked carefully). Maybe this
>> > would be complicated to implement for numpy, though.
>>
>> These imports could easily be pushed down into the handful of
>> functions that need them (with an appropriate comment about why they
>> are down there). There is no need to have complicated machinery
>> involved.
>>
>> Do you have a breakdown of the import costs?
>
> I don't have the precise timings/scripts at the moment, but even by
> using a really crude method:
>        - urllib2 (in numpy.lib._datasource) by itself takes 30 ms out of 180 ms.
> That's an easy 20 % win, since it is not often called.
>        - inspect in numpy.lib.utils: this costs around 25 ms
>
> If I just comment out the above imports, I go from 180 to 120 ms.

I think it's worth moving these imports into the functions, then.
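
A minimal sketch of what pushing an import down into a function could look
like. describe_signature is a hypothetical helper used only for illustration,
not a function from numpy.lib.utils; the point is that inspect is loaded on
the first call rather than when the enclosing module is imported.

```python
def describe_signature(func):
    """Return the function name followed by its signature (hypothetical helper)."""
    # Deferred import: inspect is comparatively slow to load and is only
    # needed here, so importing the enclosing module does not pay for it.
    import inspect

    return func.__name__ + str(inspect.signature(func))
```

After the first call the module is in sys.modules, so later calls only pay
for a cheap lookup.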

> Then, something which takes an awful lot of time is finfo to get floating-point
> limits. This takes like 30-40 ms. I wonder if there are some ways
> to make it faster. After that, there is no obvious spot I remember, but
> I can get them tonight when I go back to my lab.

The finfo limits can all be turned into properties that look up in a cache
first. iinfo already does this.
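
A rough sketch of the cache-first property idea, under stated assumptions:
FloatLimits, its _cache dict, and _compute_eps are hypothetical names, and
this is not how numpy's finfo or iinfo is actually written. The expensive
computation runs only the first time an attribute is read for a given type.

```python
class FloatLimits:
    """Hypothetical stand-in for a finfo-like class (not numpy's code)."""

    # Shared across instances, keyed by (mantissa_bits, attribute name).
    _cache = {}

    def __init__(self, mantissa_bits):
        # Constructing the object is cheap: nothing is computed here.
        self.mantissa_bits = mantissa_bits

    @property
    def eps(self):
        key = (self.mantissa_bits, "eps")
        if key not in self._cache:
            # Only the first access pays for the computation.
            self._cache[key] = self._compute_eps()
        return self._cache[key]

    def _compute_eps(self):
        # Placeholder for the expensive probing of floating-point limits.
        return 2.0 ** -self.mantissa_bits
```

With this layout nothing is computed at import time; the cost moves to the
first attribute access.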

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

