[SciPy-user] unique, sort, sortrows
David M. Kaplan
David.Kaplan@ird...
Sun Jul 27 07:43:09 CDT 2008
Hi,
Thanks for the very helpful comments.
Regarding Gael's comment, the problem with mgrid and ogrid (at least in
the version of numpy I use: 1.1.0-3) is that they currently accept only
slice-style indexing, so you can't use them with non-uniform values. I
need non-uniform grids a lot when I have a model I want to run over a
series of parameter values, not all of which are uniformly spaced. For
example, the following fails:
mgrid[:5,[1,7,8]]
I would like this to give something equivalent to
meshgrid(arange(5),[1,7,8]) (up to a transpose operation). Similarly
for more input arguments. I haven't looked at the numpy source code,
but it would seem that it shouldn't be too hard to add this
functionality as it already exists for r_ and c_:
r_[:5,[1,7,8]] # no problem here
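In the meantime, the non-uniform grid can be built with meshgrid directly. The following is only a minimal sketch, assuming a NumPy release newer than the 1.1.0 mentioned above, in which meshgrid accepts the indexing keyword. The variable names are illustrative.

```python
import numpy as np

# One uniform axis and one non-uniform axis of parameter values.
x = np.arange(5)
y = np.array([1, 7, 8])

# indexing="ij" gives the mgrid-style layout: x varies along axis 0,
# y along axis 1, so no transpose is needed afterwards.
X, Y = np.meshgrid(x, y, indexing="ij")

# For comparison: mgrid over two uniform axes of the same lengths.
X_ref = np.mgrid[:5, :3][0]
```

Here X and Y both have shape (5, 3), and each row of Y holds the non-uniform values 1, 7, 8.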
In response to Robert's comments, I looked a bit at lexsort and didn't
immediately see how it could fix my problem of sorting rows of a matrix
because I didn't really understand it. Finally I figured out that the
following appears to do the trick:
I = lexsort(a[:,-1::-1].T)
b = a[I,:]
As this is a bit tricky for someone to figure out, perhaps a helper
function called sortrows would be useful in numpy? Also, an equivalent
of unique(a,'rows') would still be very useful.
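As a rough illustration of what such a helper could look like, here is a sketch built on the lexsort trick above. The name sortrows and its return values are hypothetical, not an existing numpy function. The unique-rows line assumes a NumPy release in which unique accepts an axis argument (1.13 or later).

```python
import numpy as np

def sortrows(a):
    """Hypothetical helper: sort the rows of a 2-D array lexicographically,
    using the first column as the primary key."""
    a = np.asarray(a)
    # lexsort treats the LAST key as the primary one, so pass the
    # columns in reverse order.
    order = np.lexsort(a[:, ::-1].T)
    return a[order], order

a = np.array([[2, 1],
              [1, 3],
              [1, 2],
              [2, 1]])

b, I = sortrows(a)          # b is a with its rows sorted, b == a[I]
u = np.unique(a, axis=0)    # distinct rows, also in sorted order
```

Returning the index array alongside the sorted rows keeps the helper usable for reordering other arrays that share the same row order.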
As for [Y,I,J] = unique(X), yes Y=X[I] and X=Y[J]. Your fix would help
me out, though it would be nice to specify the call signature in the
help of the new version of unique1d (it took me a while to figure out
that it was I,Y and not Y,I). I think it would also be useful to
propagate these changes to the other arraysetops functions. In
particular, I often use the indices returned by matlab's intersect command:
[C,IA,IB] = intersect(A,B)
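For reference, a short sketch of the index relationships discussed here. It assumes a later NumPy than the one in this thread: unique with the return_index and return_inverse keywords, and intersect1d with return_indices (1.15 or later). The sample arrays are made up.

```python
import numpy as np

X = np.array([3, 1, 3, 2, 1])

# Y holds the sorted unique values; I indexes into X, J indexes into Y.
Y, I, J = np.unique(X, return_index=True, return_inverse=True)
# Y == X[I] and X == Y[J]

A = np.array([5, 1, 4, 2])
B = np.array([2, 7, 5])

# Analogue of matlab's [C,IA,IB] = intersect(A,B) for 1-D arrays.
C, IA, IB = np.intersect1d(A, B, return_indices=True)
# C == A[IA] == B[IB]
```

Note that both calls work on 1-D data only in this form; a row-wise intersect would still need something extra.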
One use case: suppose you have a sparse dataset at x,y points that lie
"on a grid" but with some of the points missing (e.g., the instrument
only returns points that had valid data), and you want the data on the
full grid. A simple way to do this would be (using a few suggested
changes to numpy/scipy):
[X,Y] = mgrid[ unique(pts[:,0]), unique(pts[:,1]) ]
s = X.shape
newData = tile( NaN, s ).flatten()
p,IA,IB = intersect( c_[X.flatten(),Y.flatten()], pts, rows=True )
newData[IA] = Data[IB]
newData = newData.reshape(s)
There may be other concise ways to solve this, but this one seems fairly
efficient.
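One alternative that needs no row-wise intersect is to take the inverse indices of unique along each coordinate. This is a sketch only, not the approach proposed above. The sample points and values are invented, and it assumes a NumPy release that provides np.full.

```python
import numpy as np

# Hypothetical sample: four valid (x, y) points on an incomplete 3 x 2 grid.
pts = np.array([[0.0, 10.0],
                [2.0, 10.0],
                [2.0, 20.0],
                [5.0, 20.0]])
data = np.array([1.0, 2.0, 3.0, 4.0])

# Unique coordinates along each axis, plus each point's grid position.
ux, ix = np.unique(pts[:, 0], return_inverse=True)
uy, iy = np.unique(pts[:, 1], return_inverse=True)

# Start from an all-NaN grid and fill in the cells that have data.
new_data = np.full((ux.size, uy.size), np.nan)
new_data[ix, iy] = data
```

The grid cells without a matching point stay NaN, and ux, uy give the coordinate values along each axis.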
Thanks again.
Cheers,
David
--
**********************************
David M. Kaplan
Charge de Recherche 1
Institut de Recherche pour le Developpement
Centre de Recherche Halieutique Mediterraneenne et Tropicale
av. Jean Monnet
B.P. 171
34203 Sete cedex
France
Phone: +33 (0)4 99 57 32 27
Fax: +33 (0)4 99 57 32 95
http://www.ur097.ird.fr/team/dkaplan/index.html
**********************************