[SciPy-user] Conditionally adding items to a list

Roger Herikstad roger.herikstad@gmail....
Tue Aug 26 03:59:44 CDT 2008


Hi,
 Thanks! That gave med a speed-up of about 5 times over my current code.

~ Roger

On Tue, Aug 26, 2008 at 3:29 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
>
> On 26-Aug-08, at 2:03 AM, Roger Herikstad wrote:
>
>> Hi all,
>> I have a prolem that I was wondering if anyone could come up with a
>> good solution for. I basically have to lists of number and I want to
>> add elements from one list to the other as long as the difference
>> between the added element and all elements already in that list
>> exceeds a certain threshold. The code I came up with is
>>
>> map(times1.append,ifilter(lambda(x):
>> numpy.abs(x-numpy.array(times1)).min()>1000, times2))
>
>
> It seems like since this is such a sequentially dependent problem
> (whether you add element N of times2 depends on element N-1, N-2...
> etc.) it'll be somewhat difficult to gain a significant speedup. What
> you can do is avoid unnecessary copies, creating the array over and
> over, and perform certain comparisons only once. See below.
>
> ------------------- CUT HERE -------------------
> import numpy as N
>
> times1a = N.array(times1)
> times2a = N.array(times2)
>
> # We can eliminate anyone that isn't at least 1000 away from all the
> # initial elements of times1a, right off the bat. Because of this
> initial
> # pass we can avoid repeatedly comparing against all the elements
> # in this list and focus on the ones we've just added.
> #
> # What this does is essentially do all the subtractions in parallel
> # by broadcasting to a 2D array and then taking the column min's;
> # this should be faster than a Python loop.
>
> candidates_idx = N.abs(times1a[:,None] - times2a).min(axis=0) > 1000
> times2a_candidates = times2a[candidates_idx]
>
> # Initialize a boolean array to keep track of the things we've added
> added = N.empty(times2a_candidates.shape, dtype=bool)
> added[:] = False
>
> # We'll always be adding the first one in the candidate list, since
> # we haven't added any others. The 'if' is just to make sure our code
> # doesn't error in the event of an empty array.
> if added.shape[0] > 0:
>     added[0] = True
>
> for i in xrange(times2a_candidates.shape[0]):
>     x = times2a_candidates[i]
>
>     # if x is 1000 away from every element from times2a we've already
>     # added, add it to the list by flagging it True
>
>     if N.all(N.abs(x - times2a_candidates[added]) > 1000):
>         added[i] = True
>
> # Finally, merge the two lists
> result = N.concatenate((times1a, times2a_candidates[added]))
> \
> ------------------- CUT HERE -------------------
>
> If you like, you can then turn it back into a Python list with
> result.tolist().
>
> Regards,
>
> David
> _______________________________________________
> SciPy-user mailing list
> SciPy-user@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>


More information about the SciPy-user mailing list