[Numpy-discussion] in the NA discussion, what can we agree on?

Nathaniel Smith njs@pobox....
Thu Nov 3 01:25:17 CDT 2011


On Wed, Nov 2, 2011 at 8:20 PM, Benjamin Root <ben.root@ou.edu> wrote:
> On Wednesday, November 2, 2011, Nathaniel Smith <njs@pobox.com> wrote:
>> By R compatibility, I specifically had in mind in-memory
>> compatibility. rpy2 provides a more-or-less seamless within-process
>> interface between R and Python (and specifically lets you get numpy
>> views on arrays returned by R functions), so if we can make this work
>> for R arrays containing NA too then that'd be handy. (The rpy2 author
>> requested this in the last discussion here:
>> http://mail.scipy.org/pipermail/numpy-discussion/2011-June/057084.html)
>> When it comes to disk formats, then this doesn't matter so much, since
>> IO routines have to translate between different representations all
>> the time anyway.
>
> Interesting, but I still have to wonder if that should be on the wishlist
> for MISSING.  I guess it depends on whether people would be converting
> fully from R or transitioning from it gradually?  That is something that
> I can't answer.

Well, I'm one of the people who would use it, so yeah :-). I've been
trying to standardize my code on Python for a while now, but there are
a ton of statistical tools that are really only available through R, and
that will remain true for a while yet. So I use rpy2 when I have to.
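
To make the in-memory point concrete, here is a rough sketch of what
recognizing R's NA in a float64 buffer could look like on the numpy
side. It assumes the R convention, as I understand it, that NA_real_ is
a NaN whose low 32 bits hold 1954. The names R_NA_REAL_BITS and
is_r_na_real are made up for illustration; nothing here is rpy2 or
numpy API beyond plain array views.

```python
import numpy as np

# Assumption: R encodes NA_real_ as a NaN whose low 32-bit word is 1954.
R_NA_REAL_BITS = np.uint64(0x7FF00000000007A2)


def is_r_na_real(arr):
    """Return True where a float64 array holds R's NA rather than a plain NaN."""
    arr = np.ascontiguousarray(arr, dtype=np.float64)
    # Reinterpret the same memory as integers so the NaN payload is visible.
    bits = arr.view(np.uint64)
    low_word = bits & np.uint64(0xFFFFFFFF)
    return np.isnan(arr) & (low_word == np.uint64(1954))


# Build a buffer like one R might hand over: a value, an NA, and a plain NaN.
x = np.array([1.0, 0.0, np.nan])
x.view(np.uint64)[1] = R_NA_REAL_BITS  # write the NA bit pattern in place

print(is_r_na_real(x).tolist())  # [False, True, False]
```

Since the NA is just a particular NaN, existing NaN-aware code keeps
working on such a buffer; only code that wants to tell NA apart from
NaN needs to look at the payload.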

>> I take the replacement of my line about MISSING disallowing unmasking
>> and your line about MISSING assignment being destructive as basically
>> expressing the same idea. Is that fair, or did you mean something
>> else?
>
> I am someone who wants to get to the absolute core of ideas. Also, this
> expression cleanly delineates the differences as binary.
>
> By expressing it this way, we also shy away from implementation details. For
> example, unmasking can be programmatically prevented for MISSING while it
> could be implemented by other indirect means for IGNORED. Not that those are
> the preferred ways, only that the phrasing is more flexible and exacting.
>
>>
>> Finally, do you think that people who want IGNORED support care about
>> having a convenient API for masking/unmasking values? You removed that
>> line, but I don't know if that was because you disagreed with it, or
>> were just trying to simplify.
>
> See previous.

I like getting to the core of things too, but unless there's actual
disagreement, I think even the less central points are still worth
noting :-). I've tried editing things a bit to make the
compare/contrast clearer based on your comments, and put it up here:
   https://github.com/njsmith/numpy/wiki/NA-discussion-status

Maybe it would be better to split each list into core idea versus
extra niceties or something? I'm not sure.
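
For anyone skimming the thread, the core contrast (MISSING assignment
is destructive, IGNORED can be undone) can be sketched with tools numpy
already has. This is only an analogy: NaN stands in for MISSING and
numpy.ma stands in for IGNORED, and neither is the API under
discussion.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0])

# MISSING-like: assignment overwrites the stored value, so it cannot be recovered.
missing = data.copy()
missing[1] = np.nan

# IGNORED-like: a mask hides the value, but the stored value stays underneath.
ignored = np.ma.masked_array(data.copy(), mask=[False, True, False])
hidden_sum = ignored.sum()    # the masked element is skipped

ignored.mask = np.ma.nomask   # unmask everything
restored = ignored[1]         # the original value is visible again

print(missing[1], hidden_sum, restored)
```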

-- Nathaniel

