[Numpy-discussion] NA masks in the next numpy release?

Charles R Harris charlesr.harris@gmail....
Mon Oct 24 09:54:57 CDT 2011


On Mon, Oct 24, 2011 at 8:40 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:

>
>
> On Sun, Oct 23, 2011 at 11:23 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
>
>> On Sun, Oct 23, 2011 at 8:07 PM, Eric Firing <efiring@hawaii.edu> wrote:
>> > On 10/23/2011 12:34 PM, Nathaniel Smith wrote:
>> >
>> >> like. And in this case I do think we can come up with an API that will
>> >> make everyone happy, but that Mark's current API probably can't be
>> >> incrementally evolved to become that API.)
>> >>
>> >
>> > No one could object to coming up with an API that makes everyone happy,
>> > provided that it actually gets coded up, tested, and is found to be fast
>> > and maintainable.  When you say the API probably can't be evolved, do
>> > you mean that the underlying implementation also has to be redone?  And
>> > if so, who will do it, and when?
>> >
>> > Eric
>> >
>>
>> I personally am a bit apprehensive as I am worried about the masked
>> array abstraction "leaking" through to users of pandas, something
>> which I simply will not accept (why I decided against using numpy.ma
>> early on, that + performance problems). Basically if having an
>> understanding of masked arrays is a prerequisite for using pandas, the
>> whole thing is DOA to me as it undermines the usability arguments I've
>> been making about switching to Python (from R) for data analysis and
>> statistical computing.
>>
>
> The missing data functionality looks far more like R than numpy.ma.
>
>
For instance, NA propagates through mean() unless skipna is set:

In [8]: a = np.arange(5, maskna=1)

In [9]: a[2] = np.NA

In [10]: a.mean()
Out[10]: NA(dtype='float64')

In [11]: a.mean(skipna=1)
Out[11]: 2.0

In [12]: a = np.arange(5)

In [13]: b = a.view(maskna=1)

In [14]: a.mean()
Out[14]: 2.0

In [15]: b[2] = np.NA

In [16]: b.mean()
Out[16]: NA(dtype='float64')

In [17]: b.mean(skipna=1)
Out[17]: 2.0
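
For comparison, here is a rough sketch of the same two computations
without maskna, using a NaN sentinel and numpy.ma. This is an
illustration only, not part of the NA-mask work; the variable names are
made up, and the NaN route assumes a float array.

```python
import numpy as np

# NaN as a stand-in for NA (assumption: the data can be float)
a = np.arange(5, dtype=float)
a[2] = np.nan
propagated = a.mean()      # NaN propagates through the mean
skipped = np.nanmean(a)    # NaN entries are ignored

# numpy.ma: wrap an existing array and mask one element;
# the underlying data in `base` is left untouched
base = np.arange(5)
m = np.ma.masked_array(base, mask=[0, 0, 1, 0, 0])
masked_mean = m.mean()     # masked entries are skipped by default

print(propagated, skipped, masked_mean, base[2])
```

Note the difference in defaults: numpy.ma skips masked values unless told
otherwise, while the maskna session above propagates NA unless skipna is
given.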

Chuck