[Numpy-discussion] strange divergence in performance
Ernest Adrogué
eadrogue@gmx....
Wed Jan 20 15:56:40 CST 2010
Hi,
I have a function where an array of integers (1-d) is compared
element-wise to an integer using the greater-than operator.
I noticed that when the integer is 0 it takes about 75% more time
than when it's 1 or 2. Is there an explanation?
Here is a stripped-down version that (sort of) shows what I mean:
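import numpy as np

def filter_array(array, f1, f2, flag=False):
    # k is the threshold for the final comparison: 1 if flag is set, else 0
    if flag:
        k = 1
    else:
        k = 0
    # m1/m2: True where the field equals any of the values in f1/f2
    m1 = reduce(np.add, [(array['f1'] == i).astype(int) for i in f1]) > 0
    m2 = reduce(np.add, [(array['f2'] == i).astype(int) for i in f2]) > 0
    # add the two masks as integers and compare the sum with k
    mask = reduce(np.add, (i.astype(int) for i in (m1, m2))) > k
    return array[mask]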
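Read this way, the final `> k` test appears to act as a logical OR of the two masks when k = 0 and as a logical AND when k = 1. The sketch below checks that reading on made-up data. It is written for Python 3 with a recent NumPy rather than the setup used in this post, and the sample data and the names `mask_k0`, `mask_k1`, `same_as_or` and `same_as_and` are illustrative assumptions, not part of the original function.

```python
import numpy as np

# Made-up sample data: two integer columns with values in 0..10
rng = np.random.default_rng(0)
f1 = rng.integers(0, 11, size=1000)
f2 = rng.integers(0, 11, size=1000)

# Membership masks, playing the role of m1 and m2 in filter_array
m1 = np.isin(f1, (6,))
m2 = np.isin(f2, (0,))

# Sum of the masks as integers: each element is 0, 1 or 2
total = m1.astype(int) + m2.astype(int)
mask_k0 = total > 0   # flag=False
mask_k1 = total > 1   # flag=True

# Compare against the plain boolean operators
same_as_or = np.array_equal(mask_k0, m1 | m2)
same_as_and = np.array_equal(mask_k1, m1 & m2)
print(same_as_or, same_as_and)
```

If that reading holds, the two flag values select different numbers of rows, so the two calls do not do the same amount of work after the comparison.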
Now let's create an array with two fields:
a = np.array(zip(np.random.random_integers(0, 10, size=5000),
                 np.random.random_integers(0, 10, size=5000)),
             dtype=[('f1', int), ('f2', int)])
Now call the function with flag=True and flag=False, and see what happens:
In [29]: %timeit filter_array(a, (6,), (0,), flag=False)
1000 loops, best of 3: 536 us per loop
In [30]: %timeit filter_array(a, (6,), (0,), flag=True)
1000 loops, best of 3: 245 us per loop
In this example the ratio seems to be about 1:2. In my program
it is 1:4. I am at a loss as to what causes this.
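One way to narrow this down might be to time the comparison and the final `array[mask]` step separately, and to count how many rows each mask selects. The sketch below does that. It assumes Python 3 and a recent NumPy, builds its own random array instead of reusing `a`, and uses hypothetical names (`mask_false`, `mask_true`, `n_false`, `n_true`). It is only a diagnostic sketch and does not by itself establish the cause.

```python
import timeit
import numpy as np

# Structured array similar in shape to the one in the post (assumed data)
rng = np.random.default_rng(0)
n = 5000
a = np.zeros(n, dtype=[('f1', int), ('f2', int)])
a['f1'] = rng.integers(0, 11, size=n)
a['f2'] = rng.integers(0, 11, size=n)

# Same masks as filter_array(a, (6,), (0,), ...) would build
m1 = a['f1'] == 6
m2 = a['f2'] == 0
total = m1.astype(int) + m2.astype(int)

mask_false = total > 0   # flag=False, k = 0
mask_true = total > 1    # flag=True,  k = 1

# Number of rows each mask selects
n_false = int(mask_false.sum())
n_true = int(mask_true.sum())
print("rows selected (k=0, k=1):", n_false, n_true)

# Time the comparison alone, then the boolean indexing alone
reps = 200
print("total > 0    :", timeit.timeit(lambda: total > 0, number=reps))
print("total > 1    :", timeit.timeit(lambda: total > 1, number=reps))
print("a[mask_false]:", timeit.timeit(lambda: a[mask_false], number=reps))
print("a[mask_true] :", timeit.timeit(lambda: a[mask_true], number=reps))
```

If the two comparison timings come out close while the two indexing timings differ, that would point at the number of selected rows rather than at the greater-than operator.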
Bye.