[SciPy-user] Ordering and Counting the Repetitions of the Rows of a Matrix
Lorenzo Isella
lorenzo.isella@gmail....
Thu Jun 25 06:01:32 CDT 2009
Dear All,
I dug up an old post of mine to this list (the problem was mainly how to
get rid of multiple rows in a matrix while counting the multiple
occurrences of each row).
Now, the problem is slightly more complex.
The matrix is of the kind
A= 1 2
   2 3
   9 9
   4 4
   1 2
   3 2
but this time, you consider the row with entries (2 3) equal to the one
with entries (3 2), i.e. this time the ordering of elements in a row
does not matter.
How can I still calculate the repetitions of each row in the sense
explained above and obtain the 'repetition-free' matrix?
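
A minimal sketch of one possible approach, assuming A is held as a NumPy
integer array and that the installed NumPy is recent enough for np.unique to
accept the axis argument (1.13 or later). Sorting the entries within each row
makes (2 3) and (3 2) identical, after which the usual unique-row count
applies. Note that the unique rows come back in lexicographic order, not in
order of first appearance.

```python
import numpy as np

A = np.array([[1, 2],
              [2, 3],
              [9, 9],
              [4, 4],
              [1, 2],
              [3, 2]])

# Sort within each row so that (2 3) and (3 2) become the same row.
A_sorted = np.sort(A, axis=1)

# Unique rows (lexicographically ordered) and the number of times each occurs.
unique_rows, counts = np.unique(A_sorted, axis=0, return_counts=True)

print(unique_rows)  # rows (1 2), (2 3), (4 4), (9 9)
print(counts)       # [2 2 1 1]
```
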
Furthermore, suppose that you have the matrix
B= 2 1 2 4
   4 2 3 9
   8 9 9 7
   5 4 4 1
   6 1 2 2
   4 3 2 9
Now, you have extra elements with respect to matrix A, but you consider
two rows equal if the first and fourth entries coincide and the
second and third entries are the same numbers, possibly swapped (as in the
case of matrix A). E.g. the second and last rows of matrix B would be
considered equal in this case. You still want the number of occurrences
of each row (with the new concept of equal rows) and the repetition-free
matrix.
Any ideas about how this could be efficiently implemented?
Many thanks
Lorenzo
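
For the four-column case, one option is to adapt the dictionary approach
quoted below: build a canonical key for each row in which the second and third
entries are put in ascending order, and count the keys. This is only a sketch
under stated assumptions: B is a NumPy integer array with exactly four
columns, the names row_counts and unique_rows are made up for illustration,
and the code relies on Python 3.7 or later, where dicts keep insertion order,
so the unique rows appear in order of first occurrence.

```python
import numpy as np

B = np.array([[2, 1, 2, 4],
              [4, 2, 3, 9],
              [8, 9, 9, 7],
              [5, 4, 4, 1],
              [6, 1, 2, 2],
              [4, 3, 2, 9]])

tally = {}
for a, b, c, d in B.tolist():
    # Canonical form: first and fourth entries fixed, middle pair ordered.
    key = (a, min(b, c), max(b, c), d)
    tally[key] = tally.get(key, 0) + 1

unique_rows = np.array(list(tally.keys()))   # repetition-free matrix
row_counts = np.array(list(tally.values()))  # occurrences of each row

print(unique_rows)
print(row_counts)  # [1 2 1 1 1]
```

Each repeated row is represented by its canonical form, so (4 3 2 9) is
reported as (4 2 3 9).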
> Date: Sun, 27 Jul 2008 15:46:29 -0400
> From: "Warren Weckesser" <warren.weckesser@gmail.com>
> Subject: Re: [SciPy-user] Ordering and Counting the Repetitions of the
>     Rows of a Matrix
> To: "SciPy Users List" <scipy-user@scipy.org>
> Message-ID: <114880320807271246x1c922e7cg9539684fbad7bed9@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Lorenzo,
>
> Given a matrix A like you showed, here is one way to find (and count)
> the unique rows:
>
> ----------
> import numpy
>
> d = {}
> for r in A:
>     t = tuple(r)
>     d[t] = d.get(t, 0) + 1
>
> # The dict d now has the counts of the unique rows of A.
> # list() is needed on Python 3, where keys() and values() return views.
> B = numpy.array(list(d.keys()))    # The unique rows of A
> C = numpy.array(list(d.values()))  # The counts of the unique rows
> ----------
>
> For a large number of rows (e.g. 10000), this appears to be
> significantly faster than the code that David Kaplan suggested in his
> email earlier today.
>
> Regards,
>
> Warren
>
> On Sun, Jul 27, 2008 at 12:17 PM, Lorenzo Isella
> <lorenzo.isella@gmail.com> wrote:
>> > Dear All,
>> > Consider an Nx2 matrix of the kind:
>> >
>> > A= 1  2
>> >    3 13
>> >    1  2
>> >    6  8
>> >    3 13
>> >    2  9
>> >    1  1
>> >
>> >
>> > The first entry in each row is always smaller than or equal to the second
>> > entry in the same row.
>> > Now there are two things I would like to do with this A matrix:
>> > (1) With a sort of n.unique1d (but have not been very successful yet),
>> > let each row of A appear only once (i.e. get rid of the repetitions).
>> > Therefore one should obtain the matrix:
>> > B= 1  2
>> >    3 13
>> >    6  8
>> >    2  9
>> >    1  1
>> >
>> > (2) At the same time, efficiently count how many times each row of B
>> > appeared in A. I would like to get a C vector counting them as:
>> >
>> > C= 2
>> >    2
>> >    1
>> >    1
>> >    1
>> >
>> >
>> > Any suggestions about an efficient way of achieving this?
>> > Many thanks
>> >
>> > Lorenzo
>> > ______________________