[Numpy-discussion] missing data discussion round 2
Mon Jun 27 17:24:03 CDT 2011
On Jun 27, 2011, at 9:59 PM, firstname.lastname@example.org wrote:
> Just a question how things would work with the new model.
> How can you implement the "use" keyword from R's cov (or cor) with
> minimal data copying?
> I think the basic masked array version would (or does) just assign 0
> to the missing values, calculate the covariance or correlation, and then
> correct with the correct count.
Basically, yes. The basic operations fill masked entries internally with a neutral value (0 for addition/subtraction, 1 for multiplication/division), so afterwards you just have to correct by the count of valid entries.
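To illustrate the fill-then-correct idea, here is a minimal sketch on a 1-D masked array. The sample values are made up, and this shows the principle only, not numpy.ma's actual internals:

```python
import numpy as np

# Hypothetical sample data: the second entry is missing (masked).
x = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])

# Sum: masked entries are replaced by 0, so they do not contribute.
total = x.filled(0.0).sum()

# Product: masked entries are replaced by 1, the neutral element.
product = x.filled(1.0).prod()

# Correct by the number of valid (unmasked) entries to get the mean.
n_valid = x.count()
mean = total / n_valid
```

The same pattern extends to cov/cor: fill with 0 after centering, form the cross products, and divide by the count of valid pairs rather than by the full length.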
> especially I'm interested in the complete.obs (drop any rows that
> contain an NA) case
In numpy.ma, there are functions to drop rows/columns that contain a masked value (they are in numpy.ma.extras, if I recall correctly): just filter your data with these functions before passing it to np.cov. That's the kind of trivial example that is probably not worth overloading a function with optional parameters for.
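A rough sketch of the complete.obs case, assuming observations are in rows and variables in columns (hence rowvar=False) and that missing values arrive as NaN. The data are invented for illustration:

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns),
# with one missing value encoded as NaN.
raw = np.array([[1.0, 2.0],
                [2.0, np.nan],
                [3.0, 6.0],
                [4.0, 9.0]])

# Mask the NaNs, then drop every row that contains a masked value.
data = np.ma.masked_invalid(raw)
complete = np.ma.compress_rows(data)  # plain ndarray, complete rows only

# Covariance over the remaining complete observations.
cov = np.cov(complete, rowvar=False)
```

np.ma.compress_cols does the same for columns. Note that the filtering step builds a new array, so this is simple rather than copy-free.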