[Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Thu Jun 23 17:06:13 CDT 2011
On Thu, Jun 23, 2011 at 4:48 PM, Gael Varoquaux <
> On Thu, Jun 23, 2011 at 03:53:31PM -0500, Mark Wiebe wrote:
> > concluded that adding masks to the core ndarray appears to be the best way to
> > deal with the problem in general.
> It seems to me that this is going to make the numpy array a way more
> complex object. Although it is currently quite simple, that object already
> has a hard time gaining acceptance beyond the scientific community,
> whereas it should really be used in many other places.
I don't think it's the simplicity or complexity of the object which is
preventing more acceptance, but rather the implementation issues it
currently has. I've done a lot to smooth out the rough edges already. For
example, the cleanup to structured arrays I just merged fixes a number of
things which just worked poorly or couldn't be done before.
> Right now, the numpy array can be seen as an extension of the C array,
> basically a pointer, a data type, and a shape (and strides). This enables
> easy sharing with libraries that have not been written with numpy in
It's not that simple, unfortunately; the C API does not provide a good
interface for this. I think a good C++ API needs to be created, because C
doesn't have enough expressive power to make interoperability simple.
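For readers following the thread, the "pointer, data type, shape and strides" view quoted above can be inspected from Python. This is only a minimal sketch: the array contents and variable names are made up for illustration, and it says nothing about how good the C API is for interoperability.

```python
import numpy as np

# A small 3x4 array of 32-bit integers (illustrative data only).
a = np.arange(12, dtype=np.int32).reshape(3, 4)

# The pieces of the "simple model": buffer address, dtype, shape, strides.
address, read_only = a.__array_interface__["data"]
print(address, a.dtype, a.shape, a.strides)

# A strided view reuses the same buffer; only shape and strides change.
v = a[:, ::2]
v_address, _ = v.__array_interface__["data"]
print(v.shape, v.strides, np.shares_memory(a, v))
```

The view `v` starts at the same address as `a` and steps over every other column by doubling the last stride, which is the kind of metadata an external library has to understand to share the buffer.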
> The limitations of the subclassing approach that you mention do not seem
> fundamental to me. For instance the impossibility of mixing subclasses could
> perhaps be solved using the Mixin Pattern. Ufuncs need work, but I have
> the impression that your proposal is simply to solve the special case of
> masked data in the ufunc by breaking the simple numpy array model.
This doesn't break the numpy array model; that stays exactly as it is. It
does provide a consistent way to deal with masks that align with those
arrays. The subclassing mechanism doesn't provide a way to give a robust
abstraction of the missing value phenomenon, and that is what my proposal is
providing. A simple API call for default behaviors when masks aren't
supported seems reasonable to me.
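As a point of reference, here is a minimal sketch of what "a mask that aligns with an array" means, using the existing subclass-based `numpy.ma` module. This is not the API of the proposal under discussion; the data values and the -999.0 placeholder are assumptions chosen for illustration.

```python
import numpy as np
import numpy.ma as ma

# Raw data with one placeholder value that should be treated as missing.
data = np.array([1.0, 2.0, -999.0, 4.0])

# A boolean mask with the same shape as the data; True marks a missing entry.
mask = np.array([False, False, True, False])

# numpy.ma pairs the two; reductions skip the masked element.
m = ma.masked_array(data, mask=mask)
total = m.sum()
mean = m.mean()

# Elementwise operations carry the mask along with the result.
doubled = m * 2
print(total, mean, doubled)
```

The underlying `data` array is left untouched; only the paired mask decides which elements take part in a computation.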
> By moving a growing amount of functionality into the core, it seems to me
> that you are going to make it more and more complex while losing its
> genericity. Each new feature will need to go in the core and induce a
> high cost. Making inheritance and ufuncs more generic seems to me like a
> better investment.
Putting masks into the core increases the genericity, because it makes that
feature orthogonal to all the other features in a very clean way. The
problems you are raising are potential issues if the implementation is done
poorly, not fundamental issues with the idea.
> My 2 cents,
> PS: I am on the verge of conference travel, so I will not be able to
> participate any further in the discussion.
Thanks for taking the time to respond,