[Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?

Benjamin Root ben.root@ou....
Tue Sep 18 12:40:23 CDT 2012


On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith <njs@pobox.com> wrote:

> On 7 Sep 2012 14:38, "Benjamin Root" <ben.root@ou.edu> wrote:
> >
> > An issue just reported on the matplotlib-users list involved a user who
> > ran out of memory while attempting to do an imshow() on a large array.
> > While this wouldn't be totally unexpected, the user's traceback shows that
> > they ran out of memory before any actual building of the image occurred.
> > Memory usage sky-rocketed when imshow() attempted to determine the min and
> > max of the image.  The input data was a masked array, and it appears that
> > the implementation of min() for masked arrays goes something like this
> > (paraphrasing here):
> >
> > obj.filled(inf).min()
> >
> > The idea is that any masked element is set to the largest possible value
> > for its dtype in a copy of the array, and then min() is performed on
> > that copy.  I am assuming that max() does the same thing.
> >
> > Can this be done differently/more efficiently?  If the "filled" approach
> > has to be done, maybe it would be a good idea to make the copy in chunks
> > instead of all at once?  Ideally, it would be nice to avoid the copying
> > altogether and utilize some of the special iterators that Mark Wiebe
> > created last year.
>
> I think what you're looking for is where= support for ufunc.reduce. This
> isn't implemented yet but at least it's straightforward in principle...
> otherwise I don't know anything better than reimplementing .min() by hand.
>
> -n
>
>
Yes, it was the where= support that I was thinking of.  I take it that it
was pulled out of the 1.7 branch with the rest of the NA stuff?
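
For illustration, the two ideas discussed in this thread can be sketched
roughly as follows. This is only a sketch under stated assumptions, not how
numpy.ma implements min(). The function names masked_min_where and
masked_min_chunked and the chunk_size parameter are hypothetical. The first
function assumes a NumPy release in which ufunc.reduce accepts where= and
initial= (1.17 or later, so not available at the time of this thread). The
second assumes the underlying data is contiguous, so that ravel() returns a
view rather than a copy. Both still allocate a boolean array for the inverted
mask, which is typically much smaller than a full filled() copy of the data.

```python
import numpy as np


def masked_min_where(marr):
    """Minimum of the unmasked elements without calling filled().

    Hypothetical helper; assumes NumPy >= 1.17 for where=/initial=.
    """
    data = np.ma.getdata(marr)
    # getmaskarray always returns a full boolean array, even for nomask.
    valid = ~np.ma.getmaskarray(marr)
    if not valid.any():
        return np.ma.masked
    # Start the reduction from the largest value the dtype can hold.
    if np.issubdtype(data.dtype, np.floating):
        initial = np.inf
    else:
        initial = np.iinfo(data.dtype).max
    return np.minimum.reduce(data, axis=None, where=valid, initial=initial)


def masked_min_chunked(marr, chunk_size=1_000_000):
    """Minimum of the unmasked elements, one bounded chunk at a time.

    Hypothetical helper; each temporary holds at most chunk_size elements.
    """
    data = np.ma.getdata(marr).ravel()       # a view if data is contiguous
    mask = np.ma.getmaskarray(marr).ravel()
    result = None
    for start in range(0, data.size, chunk_size):
        stop = start + chunk_size
        valid = ~mask[start:stop]
        if valid.any():
            # Boolean indexing copies only the valid part of this chunk.
            chunk_min = data[start:stop][valid].min()
            result = chunk_min if result is None else np.minimum(result, chunk_min)
    return np.ma.masked if result is None else result
```

Both helpers return np.ma.masked when every element is masked. For an integer
dtype the reduction starts from np.iinfo(dtype).max, which plays the role of
the "largest possible value" fill described above.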

Ben Root