[Numpy-discussion] missing data discussion round 2

Matthew Brett matthew.brett@gmail....
Tue Jun 28 16:52:01 CDT 2011


On Tue, Jun 28, 2011 at 5:38 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
> Nathaniel, an implementation using masks will look *exactly* like an
> implementation using na-dtypes from the user's point of view. Except that
> taking a masked view of an unmasked array allows ignoring values without
> destroying or copying the original data. The only downside I can see to an
> implementation using masks is memory and disk storage, and perhaps memory
> mapped arrays. And I rather expect the former to solve itself in a few
> years, eight gigs is becoming a baseline for workstations and in a couple of
> years I expect that to be up around 16-32, and a few years after that.... In
> any case we are talking 12% - 25% overhead, and in practice I expect it
> won't be quite as big a problem as folks project.

Or, in the case of 16-bit integers, 50% memory overhead.

I honestly find it hard to believe that I will not care about memory
use in the near future, and I don't think it's wise to make decisions
on that assumption.
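
For reference, the percentages above can be reproduced with a rough sketch. It assumes the mask costs one byte per array element (as a boolean mask would); that storage layout is an assumption for illustration, not something settled in this thread. The helper name mask_overhead is hypothetical.

```python
import struct

# Assumption: the mask stores one byte per array element.
MASK_BYTES_PER_ELEMENT = 1

# Element sizes in bytes, taken from struct's standard-size format codes.
ELEMENT_SIZES = {
    "float64": struct.calcsize("=d"),  # 8 bytes
    "float32": struct.calcsize("=f"),  # 4 bytes
    "int16": struct.calcsize("=h"),    # 2 bytes
}


def mask_overhead(element_bytes, mask_bytes=MASK_BYTES_PER_ELEMENT):
    """Return mask storage as a fraction of the data storage."""
    return mask_bytes / element_bytes


overheads = {name: mask_overhead(size) for name, size in ELEMENT_SIZES.items()}

for name, fraction in overheads.items():
    print(f"{name:8s} {fraction:.1%}")
```

Under that assumption, 8-byte floats give 12.5%, 4-byte floats give 25%, and 16-bit integers give 50%. The smaller the element type, the larger the relative cost of the mask.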


