[Numpy-discussion] A crazy masked-array thought

josef.pktd@gmai... josef.pktd@gmai...
Fri Apr 27 10:16:24 CDT 2012


On Fri, Apr 27, 2012 at 10:33 AM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Fri, Apr 27, 2012 at 8:15 AM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>>
>>
>>
>> On Wed, Apr 25, 2012 at 9:58 AM, Richard Hattersley
>> <rhattersley@gmail.com> wrote:
>>>
>>> The masked array discussions have brought up all sorts of interesting
>>> topics - too many to usefully list here - but there's one aspect I haven't
>>> spotted yet. Perhaps that's because it's flat out wrong, or crazy, or just
>>> too awkward to be helpful. But ...
>>>
>>> Shouldn't masked arrays (MA) be a superclass of the plain-old-array
>>> (POA)?
>>>
>>> In the library I'm working on, the introduction of MAs (via numpy.ma)
>>> required us to sweep through the library and make a fair few changes. That's
>>> not the sort of thing one would normally expect from the introduction of a
>>> subclass.
>>>
>>> Putting aside the ABI issue, would it help downstream API compatibility
>>> if the POA were a subclass of the MA? Code that's expecting/casting-to a POA
>>> might continue to work and, where appropriate, could be upgraded in its
>>> own time to accept MAs.
>>>
>>
>> That's a version of the idea that all arrays have masks, just some of them
>> have "missing" masks. That construction was mentioned in the thread but I
>> can see how one might have missed it. I think it is the right way to do
>> things. However, current libraries and such will still need to do some work
>> in order to not do the wrong thing when a "real" mask is present. For
>> instance, check and raise an error if they can't deal with it.
>
>
> To expand a bit more, this is precisely why the current work on making masks
> part of ndarray rather than a subclass was undertaken. There is a flag that
> says whether or not the array is masked, but you will still need to check
> that flag to see if you are working with an unmasked instance of ndarray. At
> the moment the masked version isn't quite completely fused with
> ndarrays-classic since the maskedness needs to be specified in the
> constructors and such, but what you suggest is actually what we are working
> towards.
>
> No matter what is done, current functions and libraries that want to use
> masks are going to have to deal with the existence of both masked and
> unmasked arrays since the existence of a mask can't be ignored without
> risking wrong results.

(In case it's not the wrong thread)

If every ndarray has this maskflag, then it is easy to adjust other
library code.

if myarr.maskflag is not None: raise SorryException

What is expensive is having to do np.isnan(myarr) or
np.isfinite(myarr) everywhere.
https://github.com/scipy/scipy/pull/48
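
A minimal sketch of such a guard, written against the existing numpy.ma
module rather than the proposed per-ndarray flag (myarr.maskflag is only a
name used in this message, not an existing attribute). The helper
require_unmasked and the exception MaskedInputError are hypothetical names
standing in for the check and for SorryException. The point it illustrates
is that testing for the presence of a mask does not scan the data, unlike
np.isnan or np.isfinite.

```python
import numpy as np


class MaskedInputError(TypeError):
    """Hypothetical exception, standing in for SorryException."""


def require_unmasked(arr):
    """Return arr as a plain ndarray, or raise if a mask array is attached.

    Assumption: this uses numpy.ma as it exists today, not the
    ndarray-level mask flag discussed in the thread.
    """
    # getmask returns the np.ma.nomask singleton when no mask array is
    # attached (including for plain ndarrays), so this check does not
    # scan the data.
    if np.ma.getmask(arr) is not np.ma.nomask:
        raise MaskedInputError("this function cannot handle masked input")
    # getdata drops the MaskedArray wrapper and returns the underlying data.
    return np.ma.getdata(arr)
```

Note that the check is deliberately conservative: it rejects any array that
carries a mask array, even if no element happens to be masked.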

As a concept I like the idea: masked arrays are the general class with
generic defaults, and "clean" arrays are a subclass where some methods are
overridden with faster implementations.
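
A toy illustration of that class relationship, assuming nothing about how
NumPy would actually implement it. GeneralArray, CleanArray and total are
hypothetical names made up for this sketch; the general class carries an
optional mask and handles it in its methods, while the "clean" subclass
guarantees there is no mask and overrides the method with a direct path.

```python
import numpy as np


class GeneralArray:
    """Toy 'general' array: data plus an optional mask (True = missing)."""

    def __init__(self, data, mask=None):
        self.data = np.asarray(data, dtype=float)
        self.mask = None if mask is None else np.asarray(mask, dtype=bool)

    def total(self):
        # Generic default: honour the mask when one is present.
        if self.mask is None:
            return self.data.sum()
        return self.data[~self.mask].sum()


class CleanArray(GeneralArray):
    """Toy 'clean' subclass: never has a mask."""

    def __init__(self, data):
        super().__init__(data, mask=None)

    def total(self):
        # Override: no mask check and no boolean indexing needed.
        return self.data.sum()
```

Code written against GeneralArray accepts a CleanArray unchanged, which is
the downstream-compatibility property the original question was after.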

Josef

>
> Chuck