[SciPy-User] masking an array ends up flattening it
Zachary Pincus
zachary.pincus@yale....
Wed Feb 29 10:50:26 CST 2012
> Hi Zach, thanks a lot. I should know by now that naive expectations that are not met in numpy are generally so for lack of generalization! Your example makes perfect sense.
> My use case is a covariance matrix that has the dimension of all the parameters available, but some of them are fixed in a fit, and I have a bool array that tells me which parameters are fixed. I then would like to "extract" the covariance matrix of the free parameters.
>
> I would rather go for masking and then reshaping than fancy indexing, which, if too fancy, starts scaring me :)
> Of course if there is a clean solution, I am all ears.
OK, so you have a list of parameter indices that are "good" and you want to get the sub-matrix out corresponding to just the rows and columns at those indices? E.g.:
import numpy
a = numpy.arange(25).reshape((5,5))
print(repr(a))
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Then, say you want to get the sub-matrix of 'a' corresponding to rows/columns 1 and 3? Is this equivalent to what you need to do?
That is, you want the following:
array([[ 6,  8],
       [16, 18]])
For this you might think to do the following:
a[[1,3], [1,3]]
but this returns 'array([ 6, 18])' -- you have pulled out a flat list of two elements, at indices [1,1] and [3,3]... This sort of fancy indexing is VERY useful in many cases, but not the case you want, which is more like a "cross product" sort of indexing problem.
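To make the pairing explicit, here is a small sketch (the names 'rows', 'cols', 'pairwise' and 'explicit' are only for illustration) showing that two index lists are matched up element by element, not combined as a cross product:

```python
import numpy as np

a = np.arange(25).reshape((5, 5))
rows = [1, 3]
cols = [1, 3]

# Two index lists are paired up: (rows[0], cols[0]), (rows[1], cols[1]), ...
pairwise = a[rows, cols]

# The same thing written out one element at a time.
explicit = np.array([a[r, c] for r, c in zip(rows, cols)])

print(pairwise)   # [ 6 18]
print(explicit)   # [ 6 18]
```

So the result has one element per index pair, which is why it comes back flat.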
It turns out that what you really want is:
a[ [[1,1],[3,3]], [[1,3],[1,3]] ]
which yields:
array([[ 6,  8],
       [16, 18]])
This makes sense -- you pass in two 2D arrays, one containing the row indices and one the column indices, and you get out a 2D array of the same shape.
Perhaps-insanely, the above can be simplified to:
a[ [[1],[3]], [[1,3]] ]
If you understand numpy broadcasting rules, you may see how:
[[1],[3]], [[1,3]]
broadcasts to be the same as:
[[1,1],[3,3]], [[1,3],[1,3]]
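If you want to check that broadcasting claim yourself, something like the following should do it. This is just a sketch: 'rows', 'cols', 'full_rows' and 'full_cols' are names I made up, and numpy.broadcast_arrays is used here only to make the expansion visible (indexing does the same thing implicitly):

```python
import numpy as np

a = np.arange(25).reshape((5, 5))

rows = np.array([[1], [3]])   # shape (2, 1): one row index per output row
cols = np.array([[1, 3]])     # shape (1, 2): one column index per output column

# Expand both index arrays to their common shape (2, 2).
full_rows, full_cols = np.broadcast_arrays(rows, cols)

print(full_rows)   # [[1 1]
                   #  [3 3]]
print(full_cols)   # [[1 3]
                   #  [1 3]]

# Indexing with the compact or the expanded arrays gives the same sub-matrix.
compact = a[rows, cols]
expanded = a[full_rows, full_cols]
print(compact)     # [[ 6  8]
                   #  [16 18]]
```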
Fortunately, all of this mind-bending stuff can be done behind the scenes with a cross-product indexing helper function:
a[ numpy.ix_([1,3], [1,3]) ]
takes care of it for you, and gives the desired
array([[ 6,  8],
       [16, 18]])
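For the covariance use case, numpy.ix_ also accepts 1D boolean arrays, so the bool array of fixed parameters can be used more or less directly. A minimal sketch, assuming a 5-parameter fit; 'cov', 'fixed' and 'free' are hypothetical names and the matrix is just a stand-in, not real fit output:

```python
import numpy as np

# Stand-in for a 5x5 covariance matrix (not real fit output).
cov = np.arange(25, dtype=float).reshape((5, 5))

# Hypothetical bool array: True where a parameter was held fixed in the fit.
fixed = np.array([True, False, True, False, True])
free = ~fixed

# ix_ turns each boolean mask into the indices where it is True and
# builds the cross-product index, so rows and columns are both selected.
cov_free = cov[np.ix_(free, free)]

print(cov_free.shape)   # (2, 2)
print(cov_free)         # [[ 6.  8.]
                        #  [16. 18.]]
```

This keeps the 2D shape, so no reshaping by hand is needed afterwards.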
This is all pretty advanced-sounding stuff... but most of it's laid out in sections 5 and 6 of the tentative tutorial:
http://www.scipy.org/Tentative_NumPy_Tutorial
You might also want to peruse Stéfan's advanced numpy tutorial -- the broadcasting and indexing sections are really useful.
http://mentat.za.net/numpy/numpy_advanced_slides/
Zach
> thanks again,
> johann
>
> On 02/28/2012 11:35 PM, Zachary Pincus wrote:
>> Hi Johann,
>>
>>> In [146]: mask
>>> Out[146]:
>>> array([[ True,  True,  True, False],
>>>        [ True,  True,  True, False],
>>>        [ True,  True,  True, False],
>>>        [False, False, False, False]], dtype=bool)
>>>
>>> Naively, I thought I would end up with a (3,3) shaped array when
>>> applying the mask to m
>>
>> So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following?
>>
>> array([[ True,  True,  True, False],
>>        [ True,  True,  True, False],
>>        [ True,  True,  True, False],
>>        [False, False, False,  True]], dtype=bool)
>>
>> Flattening turns out to be the most-sensible general-case thing to do. Fortunately, this is generally not a problem, because often one winds up doing things like:
>> a[mask] = b[mask]
>> where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem.
>>
>> If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programmatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution?
>>
>> Zach
>>
>>
>>
>>> , but instead I get :
>>>
>>> In [147]: m[mask]
>>> Out[147]:
>>> array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13,
>>> -5.52425949e-14, 6.26697129e-02, -5.12076585e-02,
>>> 4.31598429e-13, -5.12102340e-02, 6.27539118e-02])
>>>
>>> In [148]: m[mask].shape
>>> Out[148]: (9,)
>>>
>>> Is there another way to proceed and get directly the (3,3) shaped masked
>>> array, or do I need to reshape it by hand?
>>>
>>> thanks a lot in advance,
>>> Johann
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user