[Numpy-discussion] stumped numpy user seeks help

Sven Schreiber svetosch at gmx.net
Wed Aug 30 07:31:50 CDT 2006

Mathew Yeates wrote:
> My head is about to explode.
> I have an M by N array of floats. Associated with the columns are 
> character labels
> ['a','b','b','c','d','e','e','e']  note: already sorted so duplicates 
> are contiguous
> I want to replace the 2 'b' columns with the sum of the 2 columns. 
> Similarly, replace the 3 'e' columns with the sum of the 3 'e' columns.
> The resulting array still has M rows but fewer than N columns. Anyone? 
> Couldn't be any harder than Sudoku.

I don't have time for this ;-) , but I learnt something useful along the
way:

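import numpy as n
m = n.ones([2,6])
a = ['b', 'c', 'c', 'd', 'd', 'd']  # sorted, duplicates contiguous

# first column index of each run of equal labels; a set has no
# guaranteed order, so sort to keep the output columns in label order
startindices = sorted(set([a.index(x) for x in a]))
out = n.empty([m.shape[0], 0])
for i in startindices:
    # sum the columns that share label a[i] into one column
    temp = n.asmatrix(m[:, i : i + a.count(a[i])]).sum(axis = 1)
    out = n.hstack([out, temp])

print(out)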

Not sure if axis = 1 is needed, but until the defaults have settled a
bit it can't hurt. You need Python 2.4 for the built-in <set>, and <out>
will be a numpy matrix; use <asarray> if you don't like that. But here
it's really nice to work with matrices, because otherwise .sum() will
sometimes give you a 1-d array, which will suddenly look like a row
to <hstack> (instead of a nice column vector) and won't work --
that's why matrices are so great and everybody should be using them ;-)
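
For anyone who would rather stay with plain arrays, here is a minimal
sketch of the same column-summing. It assumes a NumPy release where
.sum() accepts keepdims, which is not something the code above relies
on. The names starts, cols and out2 are made up for illustration.
keepdims=True keeps each partial sum as an (M, 1) column, so hstack
lines the pieces up side by side. np.add.reduceat should give the same
result in a single call.

```python
import numpy as np

m = np.ones([2, 6])
a = ['b', 'c', 'c', 'd', 'd', 'd']  # sorted, so duplicates are contiguous

# First column index of each run of equal labels, in ascending order.
starts = sorted(set(a.index(x) for x in a))

# keepdims=True keeps every partial sum 2-d with shape (M, 1),
# so hstack places the sums next to each other as columns.
cols = [m[:, i:i + a.count(a[i])].sum(axis=1, keepdims=True) for i in starts]
out = np.hstack(cols)

# Alternative: reduceat sums m[:, starts[k]:starts[k+1]] for each k,
# with the last segment running to the final column.
out2 = np.add.reduceat(m, starts, axis=1)

print(out)
```

Both out and out2 are ordinary 2-d arrays with M rows and one column per
distinct label, so no <asarray> is needed afterwards.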

