[Numpy-discussion] NEP mask code and the 1.7 release

Ralf Gommers ralf.gommers@googlemail....
Mon Apr 23 12:18:24 CDT 2012

On Mon, Apr 23, 2012 at 12:15 AM, Nathaniel Smith <njs@pobox.com> wrote:

> We need to decide what to do with the NA masking code currently in
> master, vis-a-vis the 1.7 release. While this code is great at what it
> is, we don't actually have consensus yet that it's the best way to
> give our users what they want/need -- or even an appropriate way. So
> we need to figure out how to release 1.7 without committing ourselves
> to supporting this design in the future.
>
> Background: what does the code currently in master do?
> --------------------------------------------
> It adds 3 pointers at the end of the PyArrayObject struct (which is
> better known as the numpy.ndarray object). These new struct members,
> and some accessors for them, are exposed as part of the public API.
> There are also a few additions to the Python-level API (mask= argument
> to np.array, skipna= argument to ufuncs, etc.)
>
> What does this mean for compatibility?
> ------------------------------------------------
> The change in the ndarray struct is not as problematic as it might
> seem, compatibility-wise, since Python objects are almost always
> referred to by pointers. Since the initial part of the struct will
> continue to have the same memory layout, existing source and binary
> code that works with PyArrayObject *pointers* will continue to work
> unchanged.
>
> One place where the actual struct size matters is for any C-level
> ndarray subclasses, which will have their memory layout change, and
> thus will need to be recompiled. (Python-level ndarray subclasses will
> have their memory layout change as well -- e.g., they will have
> different __dictoffset__ values -- but it's unlikely that any existing
> Python code depends on such details.)
>
> What if we want to change our minds later?
> -------------------------------------------------------
> For the same reasons as given above, any new code which avoids
> referencing the new struct fields referring to masks, or using the new
> masking APIs, will continue to work even if the masking is later
> removed.
>
> Any new code which *does* refer to the new masking APIs, or references
> the fields directly, will break if masking is later removed.
> Specifically, source will fail to compile, and existing binaries will
> silently access memory that is past the end of the PyArrayObject
> struct, which will have unpredictable consequences. (Most likely
> segfaults, but no guarantees.) This applies even to code which simply
> tries to check whether a mask is present.
>
> So I think the preconditions for leaving this code as-is for 1.7 are
> that we must agree:
>  * We are willing to require a recompile of any C-level ndarray
> subclasses (do any exist?)

As long as it's only subclasses I think this may be OK. Not 100% sure on
this one though.
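
For illustration, the layout argument in the quoted text can be sketched with Python's ctypes. This is only a model under stated assumptions: the structs below are hypothetical stand-ins, not the real PyArrayObject definition, and the field names (data, nd, flags, mask_a/b/c) and the Sub wrapper are invented for the example. It shows that appending members leaves the offsets of the leading fields unchanged, while a struct that embeds the base by value (as a C-level subclass would) sees its own members move.

```python
import ctypes

# Hypothetical leading fields; the real PyArrayObject has more members.
BASE_FIELDS = [
    ("data", ctypes.c_void_p),
    ("nd", ctypes.c_int),
    ("flags", ctypes.c_int),
]


class OldArray(ctypes.Structure):
    """Stand-in for the struct layout before the mask members were added."""
    _fields_ = list(BASE_FIELDS)


class NewArray(ctypes.Structure):
    """Same leading fields, with three pointers appended at the end."""
    _fields_ = BASE_FIELDS + [
        ("mask_a", ctypes.c_void_p),  # assumed names, for illustration only
        ("mask_b", ctypes.c_void_p),
        ("mask_c", ctypes.c_void_p),
    ]


def prefix_offsets(struct):
    """Return the byte offset of each leading field in the given struct."""
    return {name: getattr(struct, name).offset for name, _ in BASE_FIELDS}


def make_subclass(base):
    """Model a C-level subclass: the base struct is embedded by value,
    followed by the subclass's own member."""
    class Sub(ctypes.Structure):
        _fields_ = [("base", base), ("extra", ctypes.c_double)]
    return Sub


if __name__ == "__main__":
    # Code that only follows a pointer to the leading fields sees no change.
    print("prefix offsets (old):", prefix_offsets(OldArray))
    print("prefix offsets (new):", prefix_offsets(NewArray))
    # The overall size grows, so anything embedding the struct is affected.
    print("sizeof old/new:", ctypes.sizeof(OldArray), ctypes.sizeof(NewArray))
    print("subclass 'extra' offset old/new:",
          make_subclass(OldArray).extra.offset,
          make_subclass(NewArray).extra.offset)
```

In this model, a binary built against the old layout would still look for "extra" at the old offset, which is why such subclasses would need a recompile.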

>  * We are willing to make absolutely no guarantees about future
> compatibility for code which uses APIs marked "experimental"

That is what I understand "experimental" to mean. Could stay, could change
- no guarantees.

>  * We are willing for this breakage to occur in the form of random
> segfaults

This is not OK of course. But it shouldn't apply to the Python API, which I
think is the most important one here.

>  * We are okay with the extra 3 pointers worth of memory overhead on
> each ndarray
>
> Personally I can live with all of these if everyone else can, but I'm
> nervous about reducing our compatibility guarantees like that, and
> we'd probably need, at a minimum, a flashier EXPERIMENTAL sign than we
> currently have. (Maybe we should resurrect the weasels ;-) [1])
>
> [1] http://mail.scipy.org/pipermail/numpy-discussion/2012-March/061204.html


> I'm personally willing to implement either of these changes.

Thank you Nathaniel, that is a very important and helpful statement.
