[Numpy-discussion] Need faster equivalent to digitize

Nadav Horesh nadavh@visionsense....
Thu Apr 15 01:34:36 CDT 2010


import numpy as N
N.repeat(N.arange(len(a)), a)

  Nadav
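
A minimal, self-contained sketch of the np.repeat approach above, applied to the example array from the quoted question. The helper name repeat_indices is hypothetical and is only used here for illustration.

```python
import numpy as np

def repeat_indices(counts):
    """Return each position i of `counts` repeated counts[i] times."""
    counts = np.asarray(counts)
    # np.repeat accepts a per-element repeat count as its second argument.
    return np.repeat(np.arange(len(counts)), counts)

a = np.array((4, 3, 3))
print(repeat_indices(a))  # [0 0 0 0 1 1 1 2 2 2]
```

A count of zero simply drops that index from the result, so no special handling is needed for empty bins.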

-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Peter Shinners
Sent: Thu 15-Apr-10 08:30
To: Discussion of Numerical Python
Subject: [Numpy-discussion] Need faster equivalent to digitize
 
I am using digitize to create a list of indices. This is giving me 
exactly what I want, but it's terribly slow. Digitize is obviously not 
the tool I want for this case, but what numpy alternative do I have?

I have an array like np.array((4, 3, 3)). I need to create an index 
array with each index repeated as many times as its value: 
np.array((0, 0, 0, 0, 1, 1, 1, 2, 2, 2)).

 >>> import numpy as np
 >>> a = np.array((4, 3, 3))
 >>> b = np.arange(np.sum(a))
 >>> c = np.digitize(b, np.cumsum(a))  # bin edges are the running totals of a
 >>> print(c)
[0 0 0 0 1 1 1 2 2 2]
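
For reference, a sketch of why the digitize approach yields these indices, assuming the bin edges are the cumulative sums of a (the printed output is only consistent with that reading). np.searchsorted with side="right" is a swapped-in alternative, not something proposed in the thread; it performs the same lookup using a binary search.

```python
import numpy as np

a = np.array((4, 3, 3))
edges = np.cumsum(a)        # running totals: [4, 7, 10]
b = np.arange(edges[-1])    # positions 0..9

# digitize returns i such that edges[i-1] <= x < edges[i]
via_digitize = np.digitize(b, edges)

# searchsorted(side="right") finds the same insertion point
via_searchsorted = np.searchsorted(edges, b, side="right")

print(via_digitize)         # [0 0 0 0 1 1 1 2 2 2]
print(via_searchsorted)     # [0 0 0 0 1 1 1 2 2 2]
```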

On an array where a.size==65536 and sum(a)==65536 this takes over 6 
seconds to compute. As a comparison, a Python list solution runs 
in 0.08 seconds. That is plenty fast, but I would guess there is a 
faster NumPy solution that does not require a dynamically growing 
container of PyObjects?

 >>> a = np.array((4, 3, 3))
 >>> c = []
 >>> for i, v in enumerate(a):
...     c.extend([i] * v)
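
A rough sketch for reproducing the comparison at the size mentioned above. The test data is an assumption: the post only states a.size==65536 and sum(a)==65536, so the counts here are drawn from a multinomial distribution to satisfy both. The function names are hypothetical, and timings will vary by machine and NumPy version, so none are quoted.

```python
import timeit
import numpy as np

def indices_via_list(counts):
    """List-based approach from the post, wrapped in a function."""
    out = []
    for i, v in enumerate(counts):
        out.extend([i] * int(v))  # int() keeps the repeat count a plain Python int
    return out

def indices_via_repeat(counts):
    """Vectorized alternative using np.repeat."""
    return np.repeat(np.arange(len(counts)), counts)

# Assumed test data: 65536 non-negative counts that sum to 65536.
n = 65536
rng = np.random.default_rng(0)
a = rng.multinomial(n, np.full(n, 1.0 / n))

# Both approaches should agree before timing them.
assert indices_via_list(a) == indices_via_repeat(a).tolist()

for fn in (indices_via_list, indices_via_repeat):
    seconds = timeit.timeit(lambda: fn(a), number=10) / 10
    print(f"{fn.__name__}: {seconds:.6f} s per call")
```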

