[Numpy-discussion] NEP mask code and the 1.7 release
Mon Apr 23 15:31:38 CDT 2012
On Mon, Apr 23, 2012 at 9:57 PM, Nathaniel Smith <email@example.com> wrote:
> On Mon, Apr 23, 2012 at 6:18 PM, Ralf Gommers
> <firstname.lastname@example.org> wrote:
> > On Mon, Apr 23, 2012 at 12:15 AM, Nathaniel Smith <email@example.com> wrote:
> >> We need to decide what to do with the NA masking code currently in
> >> master, vis-a-vis the 1.7 release. While this code is great at what it
> >> is, we don't actually have consensus yet that it's the best way to
> >> give our users what they want/need -- or even an appropriate way. So
> >> we need to figure out how to release 1.7 without committing ourselves
> >> to supporting this design in the future.
> >> Background: what does the code currently in master do?
> >> --------------------------------------------
> >> It adds 3 pointers at the end of the PyArrayObject struct (which is
> >> better known as the numpy.ndarray object). These new struct members,
> >> and some accessors for them, are exposed as part of the public API.
> >> There are also a few additions to the Python-level API (mask= argument
> >> to np.array, skipna= argument to ufuncs, etc.)
> >> What does this mean for compatibility?
> >> ------------------------------------------------
> >> The change in the ndarray struct is not as problematic as it might
> >> seem, compatibility-wise, since Python objects are almost always
> >> referred to by pointers. Since the initial part of the struct will
> >> continue to have the same memory layout, existing source and binary
> >> code that works with PyArrayObject *pointers* will continue to work
> >> unchanged.
> >> One place where the actual struct size matters is for any C-level
> >> ndarray subclasses, which will have their memory layout change, and
> >> thus will need to be recompiled. (Python-level ndarray subclasses will
> >> have their memory layout change as well -- e.g., they will have
> >> different __dictoffset__ values -- but it's unlikely that any existing
> >> Python code depends on such details.)
> >> What if we want to change our minds later?
> >> -------------------------------------------------------
> >> For the same reasons as given above, any new code which avoids
> >> referencing the new struct fields referring to masks, or using the new
> >> masking APIs, will continue to work even if the masking is later
> >> removed.
> >> Any new code which *does* refer to the new masking APIs, or references
> >> the fields directly, will break if masking is later removed.
> >> Specifically, source will fail to compile, and existing binaries will
> >> silently access memory that is past the end of the PyArrayObject
> >> struct, which will have unpredictable consequences. (Most likely
> >> segfaults, but no guarantees.) This applies even to code which simply
> >> tries to check whether a mask is present.
> >> So I think the preconditions for leaving this code as-is for 1.7 are
> >> that we must agree:
> >> * We are willing to require a recompile of any C-level ndarray
> >> subclasses (do any exist?)
> > As long as it's only subclasses I think this may be OK. Not 100% sure on
> > this one though.
> >> * We are willing to make absolutely no guarantees about future
> >> compatibility for code which uses APIs marked "experimental"
> > That is what I understand "experimental" to mean. Could stay, could
> > change - no guarantees.
> Earlier you said it meant "some changes are to be expected, but not
> complete removal", which seems different from "absolutely no
> guarantees". So I just wanted to double-check whether you're revising
> that earlier opinion, or...?
Neither staying nor changing is the same as complete removal. But to spell
it out: if we release a feature, I expect it to stay in some form. That
still means we can change APIs (i.e. no compatibility for code written
against the old API), but not remove the concept itself. If we're not even
sure that the concept should stay, why bother releasing it as experimental?
Experimental is for finding out what works well, not for deciding whether
we need some concept at all.
> >> * We are willing for this breakage to occur in the form of random
> >> segfaults
> > This is not OK of course. But it shouldn't apply to the Python API,
> > which I think is the most important one here.
> Right, this part is specifically about ABI compatibility, not API
> compatibility -- segfaults would only occur for extension libraries
> that were compiled against one version of numpy and then used with a
> different version.
That's what I suspected, but not what your earlier email said. I understood
your email as talking only about segfaults for code using the new NA C API.
Breaking ABI compatibility is a no-go.
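
To make the struct-layout argument quoted above concrete, here is a minimal
sketch using Python's ctypes to model C struct layouts. The structs and field
names (ArrayV1, ArrayV2, mask_a, extra, ...) are hypothetical stand-ins chosen
for illustration; they are not the real PyArrayObject definition. It only
shows the general point: appending fields leaves the offsets of the existing
fields unchanged, but grows the struct, so anything that embeds it (a C-level
subclass) gets a different layout.

```python
import ctypes


# Hypothetical stand-ins for the array struct; these field names are
# illustrative only and do not mirror the real PyArrayObject definition.
class ArrayV1(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("nd", ctypes.c_int),
        ("flags", ctypes.c_int),
    ]


class ArrayV2(ctypes.Structure):
    # Same leading fields, with three pointers appended at the end.
    _fields_ = list(ArrayV1._fields_) + [
        ("mask_a", ctypes.c_void_p),
        ("mask_b", ctypes.c_void_p),
        ("mask_c", ctypes.c_void_p),
    ]


# A C-level "subclass" embeds the base struct as its first member and
# places its own fields after it.
class SubclassV1(ctypes.Structure):
    _fields_ = [("base", ArrayV1), ("extra", ctypes.c_int)]


class SubclassV2(ctypes.Structure):
    _fields_ = [("base", ArrayV2), ("extra", ctypes.c_int)]


def prefix_offsets_match(old, new):
    """Return True if every field of `old` sits at the same offset in `new`."""
    return all(
        getattr(old, name).offset == getattr(new, name).offset
        for name, _ctype in old._fields_
    )


if __name__ == "__main__":
    # Code that only touches the old fields through a pointer is unaffected.
    print("shared prefix unchanged:", prefix_offsets_match(ArrayV1, ArrayV2))
    # The struct itself grows, so embedding structs must be recompiled.
    print("base sizes:", ctypes.sizeof(ArrayV1), ctypes.sizeof(ArrayV2))
    print("subclass 'extra' offsets:",
          SubclassV1.extra.offset, SubclassV2.extra.offset)
```

The reverse case is the one the thread worries about: a binary built against
the larger layout that reads mask_a from an object allocated with the smaller
layout would be reading past the end of that object.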