[SciPy-user] A first proposal for dataset organization
Mon Sep 24 00:23:25 CDT 2007
Matt Knox wrote:
>> David (Huard) already highlighted one problem with my proposal (time
>> series representation). I would really be interested in comments about
>> using MaskedArrays to handle missing data (I've never used it myself),
>> and the use of record arrays for the data; for example, I can see cases
>> where record arrays may be a problem (if all your data are homogeneous,
>> you cannot treat the data as a big numpy array), but I don't know if
>> this is significant.
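
To make the homogeneous case concrete, here is a minimal sketch, assuming a recent NumPy that provides numpy.lib.recfunctions.structured_to_unstructured. The field names and values are made up for illustration and are not part of the proposal.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical dataset: two float fields with made-up names and values.
data = np.array(
    [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)],
    dtype=[("height", "f8"), ("weight", "f8")],
)

# Fields are accessed by name on the structured (record-style) array.
heights = data["height"]

# Because every field shares the same dtype, the records can be converted
# to an ordinary 2-D array of shape (n_rows, n_fields).
plain = rfn.structured_to_unstructured(data)
column_means = plain.mean(axis=0)
```

So the conversion is possible, but it is an extra step compared with storing the data as a plain array in the first place.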
> Well, there are tools in the sandbox that handle all this kind of stuff. The
> new maskedarray implementation in the sandbox has a "MaskedRecords" class
> which allows for missing values in record arrays. The timeseries package
> handles time series of various frequencies, and is a subclass of MaskedArray
> so it handles missing values too. There is also a "TimeSeriesRecords"
> class which is a subclass of the "MaskedRecords" class. This would probably be
> a really nice way to represent a lot of this data, but it is hard to say
> when/if this stuff will move out of the sandbox and into the core numpy/scipy
This sounds great. I am a bit worried about depending on sandboxed packages,
though. My understanding, although I did not follow the discussion in
detail, was that the sandbox MaskedArrays would replace the current
implementation in numpy, right?
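
For reference, a minimal sketch of handling missing data with the numpy.ma module that ships with numpy, rather than the sandbox package. The values and the -999.0 sentinel are assumptions chosen only for illustration.

```python
import numpy as np

# Made-up measurements; -999.0 is a hypothetical missing-value sentinel.
raw = np.array([1.5, -999.0, 2.5, 4.0])

# Mask every entry equal to the sentinel.
temps = np.ma.masked_values(raw, -999.0)

# Reductions skip the masked entries.
n_valid = temps.count()
mean_valid = temps.mean()

# Masked entries can be filled in again when a plain array is needed.
filled = temps.filled(np.nan)
```

The mask travels with the data, so the dataset does not need to reserve a sentinel value once it has been loaded.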
> If you have specific questions about the maskedarray or timeseries module, or
> the current numpy.ma module, start up a new thread and I'll answer what I can,
> and I'm sure others can fill in any gaps.
OK, I will take a look at those, because I am totally unfamiliar with them.