[Numpy-discussion] Unnecessarily bad performance of elementwise operators with Fortran-arrays
Travis E. Oliphant
Thu Nov 8 22:44:11 CST 2007
David Cournapeau wrote:
> Travis E. Oliphant wrote:
>> Christopher Barker wrote:
>>> This discussion makes me wonder if the basic element-wise operations
>>> could (should?) be special cased for contiguous arrays, reducing them to
>>> simple pointer incrementing from the start to the finish of the data
>>> block. The same code would work for C and Fortran order arrays, and be
>>> pretty simple.
>>> This would address Hans' issue, no?
>>> It's a special case but a common one.
>> There is a special case for this already. It's just that the specific
>> operations he is addressing require creation of output arrays that by
>> default are in C-order. This would need to change in order to take
>> advantage of the special case.
> For copy and array creation, I understand this, but for element-wise
> operations (mean, min, and max), this is not enough to explain the
> difference, no? For example, I can understand a 50% or 100% time
> increase for simple operations (by simple, I mean the operation on one
> element taking only a few CPU cycles) because of copies, but a 5-fold
> time increase seems too big, no (maybe a cache problem, though)? Also,
> the fact that mean is slower than min/max in both cases (F vs. C) seems
> a bit counterintuitive (maybe cache effects are involved somehow?).
> Again, I see huge differences between my Xeon PIV @ 3.2 GHz and my
> Pentium M @ 1.2 GHz for those operations: the Pentium M gives more
> "intuitive" results (and is almost as fast, and sometimes even faster
> than my Xeon, for arrays which can stay in cache).
I wasn't talking about the min, mean, and max methods specifically.
These are all implemented with the reduce method of a ufunc.
But there are special cases for the reduce method as well, so there are
relatively smooth pathways for optimization.
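
To make the two points above concrete, here is a minimal sketch (not
the actual NumPy internals). It assumes a reasonably recent NumPy and
uses made-up variable names. It shows min, max, and mean expressed as
ufunc reductions on a Fortran-ordered array, and an element-wise add
written into an output array that was explicitly created in Fortran
order rather than the C order discussed in the quoted text.

```python
import numpy as np

# Small illustrative array; the values and shape are arbitrary.
c_arr = np.arange(12.0).reshape(3, 4)   # C-ordered (row-major) by default
f_arr = np.asfortranarray(c_arr)        # same values, Fortran (column-major) layout

# min, max and mean written as ufunc reductions along axis 0.
min_via_reduce = np.minimum.reduce(f_arr, axis=0)
max_via_reduce = np.maximum.reduce(f_arr, axis=0)
mean_via_reduce = np.add.reduce(f_arr, axis=0) / f_arr.shape[0]

# Element-wise operation with a preallocated Fortran-ordered output,
# so input and output share the same memory layout.
out = np.empty(f_arr.shape, order="F")
np.add(f_arr, f_arr, out=out)

print(f_arr.flags.f_contiguous, out.flags.f_contiguous)
```

Passing out= is only one way to control the layout of the result; it
is used here because it makes the choice of output order explicit.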