[Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?
Tue Sep 18 15:18:25 CDT 2012
On 18 Sep 2012 18:40, "Benjamin Root" <email@example.com> wrote:
> On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith <firstname.lastname@example.org> wrote:
>> On 7 Sep 2012 14:38, "Benjamin Root" <email@example.com> wrote:
>> > An issue just reported on the matplotlib-users list involved a user
>> > who ran out of memory while attempting to do an imshow() on a large
>> > array. While this wouldn't be totally unexpected, the user's traceback
>> > shows that they ran out of memory before any actual building of the
>> > image occurred. Memory usage sky-rocketed when imshow() attempted to
>> > determine the min and max of the image. The input data was a masked
>> > array, and it appears that the implementation of min() for masked
>> > arrays goes something like this:
>> > obj.filled(inf).min()
>> > The idea is that every masked element is set to the largest possible
>> > value for its dtype in a copy of the array, and then min() is
>> > performed on that copy. I am assuming that max() does the same.
>> > Can this be done differently/more efficiently? If the "filled"
>> > approach has to be done, maybe it would be a good idea to make the
>> > copy in chunks instead of all at once? Ideally, it would be nice to
>> > avoid the copying altogether and utilize some of the special iterators
>> > that Mark Wiebe created last year.
>> I think what you're looking for is where= support for ufunc.reduce. This
>> isn't implemented yet but at least it's straightforward in principle...
>> otherwise I don't know anything better than reimplementing .min() by hand.
> Yes, it was the where= support that I was thinking of. I take it that it
> was pulled out of the 1.7 branch with the rest of the NA stuff?
where= was left in, but it was only implemented for regular vectorized
ufunc operations in the first place. Supporting it in reductions still
needs to be written.
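
For what it's worth, the "copy in chunks" idea and the where= reduction
discussed above can be sketched roughly as follows. This is only an
illustration, not how numpy.ma implements min(): the helper names
chunked_masked_min and where_masked_min and the chunk_size parameter are
made up for the example, and the second helper assumes a NumPy release
recent enough for reductions to accept where= and initial= (1.17 or later,
as far as I know), which was not the case when this thread was written.

```python
import numpy as np


def chunked_masked_min(marr, chunk_size=1_000_000):
    """Minimum of the unmasked elements, copying at most one chunk at a time.

    Hypothetical helper for illustration; not part of numpy.ma.
    """
    # reshape(-1) returns a view for contiguous input; a non-contiguous
    # array would still be copied here.
    data = np.ma.getdata(marr).reshape(-1)
    mask = np.ma.getmask(marr)
    if mask is np.ma.nomask:
        # Nothing is masked, so a plain reduction needs no copy.
        return data.min()
    mask = mask.reshape(-1)

    best = None
    for start in range(0, data.size, chunk_size):
        stop = start + chunk_size
        # Boolean indexing copies only the valid elements of this chunk.
        valid = data[start:stop][~mask[start:stop]]
        if valid.size:
            chunk_min = valid.min()
            if best is None or chunk_min < best:
                best = chunk_min
    # Mirror numpy.ma: an all-masked input has no minimum.
    return np.ma.masked if best is None else best


def where_masked_min(marr):
    """Same result via a where= reduction (assumes NumPy >= 1.17, float data)."""
    data = np.ma.getdata(marr)
    mask = np.ma.getmaskarray(marr)
    # initial= is required with where=; inf is only valid for float dtypes.
    return np.min(data, where=~mask, initial=np.inf)


arr = np.ma.masked_array([5.0, 1.0, 3.0, -2.0, 7.0], mask=[0, 0, 0, 1, 0])
print(chunked_masked_min(arr, chunk_size=2), where_masked_min(arr), arr.min())
```

The chunked version trades a Python-level loop for a bounded temporary, so
peak extra memory is roughly one chunk instead of a full copy of the array.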