[Numpy-discussion] missing data discussion round 2

Jason Grout jason-sage@creativetrax....
Tue Jun 28 17:40:53 CDT 2011


On 6/28/11 5:20 PM, Matthew Brett wrote:
> Hi,
>
> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith <njs@pobox.com> wrote:
> ...
>> (You might think, what difference does it make if you *can* unmask an
>> item? Us missing data folks could just ignore this feature. But:
>> whatever we end up implementing is something that I will have to
>> explain over and over to different people, most of them not
>> particularly sophisticated programmers. And there's just no sensible
>> way to explain this idea that if you store some particular value, then
>> it replaces the old value, but if you store NA, then the old value is
>> still there.
>
> Ouch - yes.  No question, that is difficult to explain.   Well, I
> think the explanation might go like this:
>
> "Ah, yes, well, that's because in fact numpy records missing values by
> using a 'mask'.  So when you say 'a[3] = np.NA', what you mean is,
> 'a._mask = np.ones(a.shape, np.dtype(bool)); a._mask[3] = False'"
>
> Is that fair?

Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys 
the idea that the entry is still there, but we're just ignoring it.  Of 
course, that goes against common convention, but it might be easier to 
explain.
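
To make the distinction concrete, here is a toy sketch in plain Python of
what "ignore" semantics might look like. It is only an illustration: the
names IGNORE, MaskedList, unignore and total are hypothetical and are not
part of NumPy or of any proposal in this thread.

```python
# Toy model only: IGNORE and MaskedList are hypothetical names,
# not part of NumPy.
IGNORE = object()  # sentinel meaning "skip this entry, but keep its value"


class MaskedList:
    def __init__(self, values):
        self.data = list(values)
        # One flag per entry; True means the entry is currently ignored.
        self.ignored = [False] * len(self.data)

    def __setitem__(self, i, value):
        if value is IGNORE:
            # Only the flag changes; the old value stays in self.data.
            self.ignored[i] = True
        else:
            # An ordinary value replaces the old one and clears the flag.
            self.data[i] = value
            self.ignored[i] = False

    def unignore(self, i):
        # Clearing the flag brings the stored value back into play.
        self.ignored[i] = False

    def total(self):
        # Sum only the entries that are not ignored.
        return sum(v for v, skip in zip(self.data, self.ignored) if not skip)


a = MaskedList([1.0, 2.0, 3.0, 4.0])
a[3] = IGNORE
total_while_ignored = a.total()   # 4.0 is skipped but still stored
a.unignore(3)
total_after_unignore = a.total()  # 4.0 counts again
```

Storing IGNORE leaves a.data[3] untouched, which is exactly the behaviour
that is awkward to describe if the marker is called NA.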

Thanks,

Jason


