[Numpy-discussion] Memory hungry reduce ops in Numpy
Tue Nov 15 11:07:54 CST 2011
On 11/15/2011 06:02 PM, Warren Weckesser wrote:
> On Tue, Nov 15, 2011 at 10:48 AM, Andreas Müller
> <firstname.lastname@example.org> wrote:
> On 11/15/2011 05:46 PM, Andreas Müller wrote:
>> On 11/15/2011 04:28 PM, Bruce Southey wrote:
>>> On 11/14/2011 10:05 AM, Andreas Müller wrote:
>>>> On 11/14/2011 04:23 PM, David Cournapeau wrote:
>>>>> On Mon, Nov 14, 2011 at 12:46 PM, Andreas Müller
>>>>> <firstname.lastname@example.org> wrote:
>>>>>> Hi everybody.
>>>>>> When I did some normalization using numpy, I noticed that numpy.std uses
>>>>>> more ram than I was expecting.
>>>>>> A quick google search gave me this:
>>>>>> The site claims that std and other reduce operations are implemented
>>>>>> naively with many temporaries.
>>>>>> Is that true? And if so, is there a particular reason for that?
>>>>>> This issue seems quite easy to fix.
>>>>>> In particular the link I gave above provides code.
>>>>> The code provided only implements a few special cases: being more
>>>>> efficient in those cases only is indeed easy.
>>>> I am particularly interested in the std function.
>>>> Is this implemented as a separate function or as an instantiation
>>>> of a general reduce operation?
>>> The 'On-line algorithm'
>>> could save you storage. I would presume if you know cython that
>>> you can probably make it quick as well (to address the loop over
>>> the data).
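
For reference, the on-line approach mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration of Welford's single-pass update, not code from NumPy; the name online_std and its ddof parameter are made up for this example. It only keeps three scalars, so memory use does not grow with the number of samples, at the cost of a Python-level loop (which is where Cython would help).

```python
import math

def online_std(values, ddof=0):
    """Single-pass (Welford) standard deviation over an iterable of numbers.

    Hypothetical helper for illustration; keeps only a count, a running
    mean and a running sum of squared deviations.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        # uses the old mean (in delta) and the updated mean
        m2 += delta * (x - mean)
    if count - ddof <= 0:
        return float("nan")
    return math.sqrt(m2 / (count - ddof))
```

Because it accepts any iterable, the same function works on a generator that streams values from disk.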
>> My question was more along the lines of "why doesn't numpy do the
>> online algorithm".
> To be more precise, even without the online version, just
> computing E(X^2) and E(X)^2 would be good.
> It seems numpy centers the whole dataset. Otherwise I can't
> explain why the memory needed should depend
> on the number of examples.
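
A rough sketch of the E(X^2) - E(X)^2 idea for a 1-D array follows. The function name std_two_moments is hypothetical and this is not how NumPy computes std. For a float64 input it allocates no array-sized temporary, because np.dot accumulates the sum of squares directly. The known drawback is numerical: when the mean is large compared to the spread, the subtraction can lose most of its significant digits.

```python
import numpy as np

def std_two_moments(x):
    """Population std of a 1-D array via sqrt(E[X^2] - E[X]^2).

    Hypothetical helper for illustration only.
    """
    # No copy is made if x is already a float64 array; other dtypes are converted.
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    mean = x.sum() / n
    mean_sq = np.dot(x, x) / n  # sum of squares without an intermediate x**2 array
    # Clamp at zero: rounding can make the difference slightly negative.
    var = max(mean_sq - mean * mean, 0.0)
    return float(np.sqrt(var))
```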
> Yes, that is what it is doing. See line 63 in the function _var(),
> which is called by _std():
Thanks for the clarification. I thought the function was somewhere in
the C code, don't know why.
I'll see if I can reformulate the function.
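
One possible reformulation, as a sketch only: keep the two-pass (mean, then centered squares) scheme, but center the data one block of rows at a time, so the temporary is bounded by the block size instead of the size of the whole dataset. The name chunked_std and the chunk parameter are assumptions made for this example, not part of NumPy.

```python
import numpy as np

def chunked_std(x, chunk=4096, ddof=0):
    """Two-pass std along axis 0, centering at most `chunk` rows at a time.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x)
    n = x.shape[0]
    mean = x.mean(axis=0, dtype=np.float64)
    sq_sum = np.zeros(x.shape[1:], dtype=np.float64)
    for start in range(0, n, chunk):
        # Temporary holding at most `chunk` rows of centered data.
        block = x[start:start + chunk] - mean
        np.square(block, out=block)  # square in place, no second temporary
        sq_sum += block.sum(axis=0)
    return np.sqrt(sq_sum / (n - ddof))
```

For a (n_samples, n_features) array this should agree with x.std(axis=0) up to rounding, while peak extra memory is about chunk * n_features values.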