[SciPy-user] Ordering and Counting the Repetitions of the Rows of a Matrix

Lorenzo Isella lorenzo.isella@gmail....
Thu Jun 25 06:01:32 CDT 2009


Dear All,
I dug up an old post of mine to this list (the problem was mainly how to 
get rid of multiple rows in a matrix while counting the multiple 
occurrences of each row).
Now the problem is slightly more complex.

The matrix is of the kind:

A= 1 2
   2 3
   9 9
   4 4
   1 2
   3 2

but this time, you consider the row with entries (2 3) equal to the one 
with entries (3 2), i.e. this time the ordering of elements in a row 
does not matter.
How can I still calculate the repetitions of each row in the sense 
explained above and obtain the 'repetition-free' matrix?
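
A minimal sketch of one possible approach, adapting the dictionary idea from the earlier reply quoted further down: sort the entries of each row before using it as a key, so that (2 3) and (3 2) collapse onto the same key. The function name `count_unordered_rows` is made up for illustration, and the sketch assumes Python 3.7+ (dicts keep insertion order); the rows come back in sorted form, in order of first appearance.

```python
import numpy as np

def count_unordered_rows(a):
    """Count rows of `a`, treating rows as equal up to a reordering of entries."""
    counts = {}
    for row in np.asarray(a):
        # Canonical form of the row: its entries in ascending order
        key = tuple(sorted(row.tolist()))
        counts[key] = counts.get(key, 0) + 1
    # list(...) is needed on Python 3, where keys()/values() are views
    unique_rows = np.array(list(counts.keys()))
    repetitions = np.array(list(counts.values()))
    return unique_rows, repetitions

A = np.array([[1, 2], [2, 3], [9, 9], [4, 4], [1, 2], [3, 2]])
unique_rows, repetitions = count_unordered_rows(A)
print(unique_rows)   # rows (1 2), (2 3), (9 9), (4 4)
print(repetitions)   # counts 2, 2, 1, 1
```

The Python-level loop is simple but not vectorised, so it may be slow for very large matrices.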

Furthermore, suppose that you have the matrix

B= 2 1 2 4
   4 2 3 9
   8 9 9 7
   5 4 4 1
   6 1 2 2
   4 3 2 9

Now, you have extra elements with respect to matrix A, but you consider 
two rows equal if the first and fourth entries coincide and the 
second and third entry are the same numbers or are swapped (like in the 
case of matrix A). E.g. the second and last row of matrix B would be 
considered equal in this case. You still want the number of occurrences 
of each row (with the new concept of equal rows) and the repetition-free 
matrix.
Any ideas about how this could be efficiently implemented?
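
For what it is worth, here is a rough sketch of how the second case might be handled, under the assumption that a NumPy version is available in which `numpy.unique` accepts the `axis` and `return_counts` arguments (1.13 or later). Only the two middle columns are sorted, so that rows differing by a swap of the second and third entries become identical; the function name `count_rows_swappable_middle` is hypothetical.

```python
import numpy as np

def count_rows_swappable_middle(b):
    """Count rows of a 4-column array, ignoring the order of columns 1 and 2."""
    b = np.asarray(b)
    canonical = b.copy()
    # Sort only the second and third entries of every row
    canonical[:, 1:3] = np.sort(b[:, 1:3], axis=1)
    # Assumption: NumPy >= 1.13 (np.unique with axis=0 and return_counts)
    unique_rows, repetitions = np.unique(canonical, axis=0, return_counts=True)
    return unique_rows, repetitions

B = np.array([[2, 1, 2, 4],
              [4, 2, 3, 9],
              [8, 9, 9, 7],
              [5, 4, 4, 1],
              [6, 1, 2, 2],
              [4, 3, 2, 9]])
unique_rows, repetitions = count_rows_swappable_middle(B)
print(unique_rows)
print(repetitions)   # the row (4 2 3 9) is counted twice
```

Note that `numpy.unique` returns the rows in lexicographic order rather than in order of first appearance.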
Many thanks

Lorenzo
> Date: Sun, 27 Jul 2008 15:46:29 -0400
> From: "Warren Weckesser" <warren.weckesser@gmail.com>
> Subject: Re: [SciPy-user] Ordering and Counting the Repetitions of the Rows of a Matrix
> To: "SciPy Users List" <scipy-user@scipy.org>
> Message-ID: <114880320807271246x1c922e7cg9539684fbad7bed9@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Lorenzo,
>
> Given a matrix A like you showed, here is one way to find (and count)
> the unique rows:
>
> ----------
> d = {}
> for r in A:
>     t = tuple(r)
>     d[t] = d.get(t,0) + 1
>
> # The dict d now has the counts of the unique rows of A.
> B = numpy.array(d.keys())    # The unique rows of A
> C = numpy.array(d.values())  # The counts of the unique rows
> ----------
>
> For a large number of rows (e.g. 10000), this appears to be
> significantly faster than the code that David Kaplan suggested in his
> email earlier today.
>
> Regards,
>
> Warren
>
> On Sun, Jul 27, 2008 at 12:17 PM, Lorenzo Isella
> <lorenzo.isella@gmail.com> wrote:
>> > Dear All,
>> > Consider an Nx2 matrix of the kind:
>> >
>> > A=   1  2
>> >      3 13
>> >      1  2
>> >      6  8
>> >      3 13
>> >      2  9
>> >      1  1
>> >
>> >
>> > The first entry in each row is always smaller than or equal to the second
>> > entry in the same row.
>> > Now there are two things I would like to do with this A matrix:
>> > (1) With a sort of n.unique1d (but have not been very successful yet),
>> > let each row of A appear only once (i.e. get rid of the repetitions).
>> > Therefore one should obtain the matrix:
>> > B=   1  2
>> >      3 13
>> >      6  8
>> >      2  9
>> >      1  1
>> >
>> > (2) At the same time, efficiently count how many times each row of B
>> > appeared in A. I would like to get a C vector counting them as:
>> >
>> > C=   2
>> >      2
>> >      1
>> >      1
>> >      1
>> >
>> >
>> > Any suggestions about an efficient way of achieving this?
>> > Many thanks
>> >
>> > Lorenzo


