[Numpy-discussion] stumped numpy user seeks help
svetosch at gmx.net
Wed Aug 30 07:31:50 CDT 2006
Mathew Yeates wrote:
> My head is about to explode.
> I have an M by N array of floats. Associated with the columns are
> character labels
> ['a','b','b','c','d','e','e','e'] note: already sorted so duplicates
> are contiguous
> I want to replace the 2 'b' columns with the sum of the 2 columns.
> Similarly, replace the 3 'e' columns with the sum of the 3 'e' columns.
> The resulting array still has M rows but fewer than N columns. Anyone?
> Could be any harder than Sudoku.
I don't have time for this ;-), but I learnt something useful along the way:
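import numpy as n

m = n.ones([2, 6])
a = ['b', 'c', 'c', 'd', 'd', 'd']
# first index of each distinct label; sorted, because a set has no guaranteed order
startindices = sorted(set([a.index(x) for x in a]))
# start with M rows and zero columns
out = n.empty([m.shape[0], 0])
for i in startindices:
    # sum the block of columns sharing the label a[i] (n.asmatrix is n.mat in older NumPy)
    temp = n.asmatrix(m[:, i : i + a.count(a[i])]).sum(axis = 1)
    out = n.hstack([out, temp])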
I'm not sure whether axis = 1 is needed, but until the defaults have settled
a bit it can't hurt. You need Python 2.4 for the built-in <set>, and <out>
will be a numpy matrix; use <asarray> if you don't like that. But here
it's really nice to work with matrices, because otherwise .sum() will
give you a 1-d array, and that will suddenly look like a row to <hstack>
(instead of a nice column vector), so the stacking wouldn't work --
that's why matrices are so great and everybody should be using them ;-)
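
For anyone who would rather stay with plain arrays, here is a minimal sketch of the same idea. It assumes a NumPy version in which .sum() accepts keepdims=True (a later addition, not available when this was written), which keeps each partial sum as an (M, 1) column so that <hstack> accepts it. The name sum_duplicate_columns is made up for illustration, and the labels are the ones from the original question.

```python
import numpy as np

def sum_duplicate_columns(m, labels):
    """Sum contiguous columns of m that share a label (hypothetical helper)."""
    # First index of each distinct label, in column order.
    starts = sorted(set(labels.index(x) for x in labels))
    blocks = []
    for i in starts:
        width = labels.count(labels[i])
        # keepdims=True keeps the result 2-d with shape (M, 1),
        # so hstack sees a column rather than a 1-d row.
        blocks.append(m[:, i:i + width].sum(axis=1, keepdims=True))
    return np.hstack(blocks)

m = np.arange(16, dtype=float).reshape(2, 8)
labels = ['a', 'b', 'b', 'c', 'd', 'e', 'e', 'e']
out = sum_duplicate_columns(m, labels)
print(out.shape)
```

Like the loop above, this relies on the labels being sorted so that duplicates are contiguous; the result is an ordinary 2-d array, so no <asarray> is needed afterwards.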