[Numpy-discussion] Optimization of loops
Charles R Harris
charlesr.harris@gmail....
Thu Oct 23 12:52:16 CDT 2008
On Wed, Oct 22, 2008 at 10:21 AM, Pierre Yger <yger@unic.cnrs-gif.fr> wrote:
> Hi all,
>
> This is my first mail to the mailing list, and I would like to know if
> anybody
> has a great idea about the use or not of Numpy and loops in Python.
>
> So here is my problem : I've a large list of tuple (id, time),
> id being integer between [0, ..., N] and time float values.
> I want to have a mysort() function that will be able to explode this list
> into
> N lists of differents sizes, that will contained the times associated to
> each
> id.
>
> Example:
>
> >> spikes = [(0, 2.3),(1, 5.6),(3, 2.5),(0, 5.2),(3, 10.2),(2, 16.2)]
>
> mysort(spikes)
>
> should return:
>
> [[2.3, 5.2], [5.6], [16.2], [2.5, 10.2]]
>
> Intuitively, the simplest way to do that is to append elements while going
> through all the tuples of the main list (called spikes) to empty lists:
>
> res = [[] for i in xrange(N)]
>
> for id, time in my_list:
> res[id].append(time)
>
> But this loop seems to be incredibly slow for large lists !
>
> A faster way (after having performed some profiling) seems to do:
> spikes = numpy.array(spikes) # Convert the list into a numpy array
> res = []
> for id in xrange(N):
> res.append(spikes[spikes[:,0] == id, 1]) # Use Numpy indexes
>
> Nevertheless, this is still rather slow. Does anybody have any idea about a
> faster way to do this ? Is there a Numpy function that could be used ?
>
If you want to stick to lists you can try something like
In [1]: import bisect
In [2]: spikes = [(0, 2.3),(1, 5.6),(3, 2.5),(0, 5.2),(3, 10.2),(2, 16.2)]
In [3]: spikes.sort()
In [4]: ind = [i for i,j in spikes]
In [5]: tim = [j for i,j in spikes]
In [6]: mylist = []
In [7]: for i in xrange(4) :
...: beg = bisect.bisect_left(ind,i)
...: end = bisect.bisect_right(ind,i)
...: mylist.append(tim[beg:end])
...:
In [8]: mylist
Out[8]:
[[2.2999999999999998, 5.2000000000000002],
[5.5999999999999996],
[16.199999999999999],
[2.5, 10.199999999999999]]
There is a numpy version of this also which might be faster if your lists
are really huge.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20081023/233bb527/attachment.html
More information about the Numpy-discussion
mailing list