[Numpy-discussion] alterNEP - was: missing data discussion round 2
Charles R Harris
Thu Jun 30 20:10:04 CDT 2011
On Thu, Jun 30, 2011 at 6:02 PM, Matthew Brett <email@example.com> wrote:
> On Thu, Jun 30, 2011 at 9:01 PM, Lluís <firstname.lastname@example.org> wrote:
> > Matthew Brett writes:
> >> Hi,
> >> On Thu, Jun 30, 2011 at 7:27 PM, Lluís <email@example.com> wrote:
> >>> Matthew Brett writes:
> >>> [...]
> >>>> I'm afraid, like you, I'm a little lost in the world of masking,
> >>>> because I only need the NAs. I was trying to see if I could come up
> >>>> with an API that picked up some of the syntactic convenience of NAs,
> >>>> without conflating NAs with IGNOREs. I guess we need some feedback
> >>>> from the 'NA & IGNORE Share the API' (NISA?) proponents to get an idea
> >>>> of what we've missed. @Mark, @Chuck, guys - what have we lost here by
> >>>> separating the APIs?
> >>> As I tried to convey in my other mail, separating both will force you
> >>> to either:
> >>> * Make a copy of the array before passing it to another routine
> >>> (the routine will assign np.NA but you still want the original data)
> >> You have an array 'arr'. The array does support NAs, but it doesn't
> >> have a mask. You want to pass ``arr`` to another routine ``func``.
> >> You expect ``func`` to set NAs into the data but you don't want
> >> ``func`` to modify ``arr`` and you don't want to copy ``arr`` either.
> >> You are saying the following:
> >> "with the fused API, I can make ``arr`` be a masked array, and pass it
> >> into ``func``, and know that, when func sets elements of arr to NA, it
> >> will only modify the mask and not the underlying data in ``arr``."
> > Yes.
> >> It does seem to me this is a very obscure case. First, ``func`` is
> >> modifying the array but you want an unmodified array back. Second,
> >> you'll have to do some view trick to recover the not-NA case to arr,
> >> when it comes back.
> > I know, the example is just silly and convoluted.
> >> It seems to me that what ``func`` should do, if it wants you to be
> >> able to unmask the NAs, is to make a masked array view of ``arr``, and
> >> return that. And indeed the simplicity of the separated API
> >> immediately makes that clear - in my view at least.
> > I agree on this example. My only concern is the API's ability to
> > foresee as many future use cases as possible, without impacting
> > performance.
> But, of course, there's a great danger in trying to cover every
> possible use-case.
> My argument is that the kind of cases that you are describing are - I
> believe - very rare and are even a little difficult to make up. Is
> that fair?
> To my mind, the separate NA and IGNORE API is easier to understand and
> explain. If that isn't true, please do say, and say why - because
> that point is key.
I think the main problem is that they aren't separate: one takes place in a
view of an unmasked array, the other starts with a masked array. These
aren't 'different' in mechanism, they are just different in workflow. And I
think they fit in well with the view idea.
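
As a rough sketch of the two workflows, here is how they look with the
existing numpy.ma module. This is only a stand-in, on the assumption that
numpy.ma masking behaves closely enough to the masked arrays under
discussion; it does not use the proposed np.NA API, and ``mark_negative``
is a hypothetical routine made up for illustration.

```python
import numpy as np
import numpy.ma as ma


def mark_negative(values):
    """Hypothetical routine: hide negative entries without overwriting them."""
    # Take a masked view of the caller's buffer; no data is copied.
    masked = values.view(ma.MaskedArray)
    # Assigning ma.masked only sets mask bits, the data stays as it was.
    masked[values < 0] = ma.masked
    return masked


# Workflow 1: start with an unmasked array and mask a view of it.
arr = np.array([1.0, -2.0, 3.0, -4.0])
result = mark_negative(arr)

# Workflow 2: start with an array that is masked from the beginning.
started_masked = ma.masked_array(
    [1.0, -2.0, 3.0, -4.0], mask=[False, True, False, True]
)

# Both routes end in the same kind of object with the same behaviour.
same_type = type(result) is type(started_masked)
same_sum = result.sum() == started_masked.sum()

# In workflow 1 the original data is still there under the mask.
shares_buffer = np.shares_memory(result.data, arr)
```

In both cases the masked values are skipped in the reduction, and in the
first case ``arr`` itself is left unmodified, so the caller can get back to
the unmasked data without any copy.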
> If it is true that the separate API is clearer, then the benefit in
> terms of power and extensibility has to be large, in order to go for
> the fused API.