[SciPy-User] faster nonzero indices
Wed Oct 21 02:55:22 CDT 2009
On Wednesday 21 October 2009 06:02:53 Felix Schlesinger wrote:
> Is there a faster way to do:
> foo = scipy.nonzero(bar > 1)
> where bar is a 1d ndarray of type 'int32'
> i.e. to get all indices of an array for which a condition is true.
> Since in this case the arrays are quite large and the condition is only
> true for a few items, creating a long boolean array and then passing over it
> again to find nonzero entries seems inefficient.
If the number of elements for which the condition is true is indeed small,
and you can afford to keep a precomputed array of indices in memory
(typically an `arange()`), you can try numexpr:
In : import numpy as np
In : import numexpr as ne
In : bar = np.random.randint(0, 10**6, 10**6).astype('int32')
In : timeit np.where(bar > 999000)
100 loops, best of 3: 12.1 ms per loop
In : idx = np.arange(len(bar))
In : timeit idx[ne.evaluate('where(bar > 999000, 1, 0)').astype('bool')]
100 loops, best of 3: 7.68 ms per loop
which is more than 1.5x faster than the NumPy counterpart.
Even if you have to compute idx each time, the above approach is still faster than np.where:
In : timeit np.arange(len(bar))[ne.evaluate('where(bar > 999000, 1,
100 loops, best of 3: 11 ms per loop
although in that case, just by a meager 10%.
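For reference, the approach above can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name `indices_where_greater` and its `idx` parameter are made up for illustration, the comparison is passed to numexpr directly (it returns a boolean array, so the `where(..., 1, 0)` plus `astype('bool')` step is skipped here), and it falls back to plain NumPy when numexpr is not installed. No timings are claimed for it; measure on your own data.

```python
import numpy as np

try:
    import numexpr as ne  # optional; the helper falls back to NumPy without it
except ImportError:
    ne = None


def indices_where_greater(bar, threshold, idx=None):
    """Return the indices i for which bar[i] > threshold.

    `idx` is an optional precomputed np.arange(len(bar)); passing it in
    avoids rebuilding the index array on every call.
    (Hypothetical helper, not part of NumPy, SciPy or numexpr.)
    """
    if idx is None:
        idx = np.arange(len(bar))
    if ne is not None:
        # numexpr evaluates the comparison and yields a boolean mask
        mask = ne.evaluate('bar > threshold',
                           local_dict={'bar': bar, 'threshold': threshold})
    else:
        # plain NumPy fallback: same mask, built by NumPy itself
        mask = bar > threshold
    # boolean indexing into the index array keeps only the matching positions
    return idx[mask]


bar = np.array([0, 5, 2, 9, 1], dtype='int32')
idx = np.arange(len(bar))          # precomputed once, reused across calls
foo = indices_where_greater(bar, 1, idx)
```

The result should match `np.nonzero(bar > 1)[0]`, which is a convenient sanity check before swapping the helper into real code.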