[Numpy-discussion] Problems with Numexpr and discontiguous arrays
oliphant at ee.byu.edu
Wed Oct 4 13:21:43 CDT 2006
Tim Hochberg wrote:
>David M. Cooke wrote:
>>On Wed, 04 Oct 2006 10:19:08 -0700
>>Tim Hochberg <tim.hochberg at ieee.org> wrote:
>>>Ivan Vilata i Balaguer wrote:
>>>>It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
>>>>so, but I have come across some alignment or striding problems which can
>>>>be seen with the following code::
>>>I looked at this just a little bit and clearly this bit from interp_body
>>>cannot work in the presence of recor arrays:
>>> intp sf1 = sb1 / sizeof(double); \
>>> #define f1 ((double *)x1)[j*sf1]
>>>There are clearly some assumptions that sb1 is evenly divisible by
>>>sizeof(double). Blech!. This is likely my fault, and I expect it won't
>>>be too horrible to fix, but I don't know that I'll have time immediately.
>>My thinking is that this should be handled by a copy, so that the opcodes
>>always work on contiguous data. The copy can be another opcode. One advantage
>>of operating on contiguous data is that it's easier to use the processor's
>>vector instructions, if applicable.
>That would be easy to do. Right now the opcodes should work correctly on
>data that is spaced in multiples of the itemsize on the last axis. Other
>arrays are copied (no opcode required, it's embedded at the top of
>interp_body lines 64-80). The record array case apparently slips through
>the cracks when we're checking whether an array is suitable to be used
>correctly (interpreter.c 1086-1103). It would certainly not be any
>harder to only allow contiguous arrays than to correctly deal with
>record arrays. Only question I have is whether the extra copy will
>overwhelm the savings of that operating on contiguous data gives.
With record arrays you have to worry about alignment issues. The most
complicated part of the ufunc code is to handle that.
The usual approach is to copy (and possibly byte-swap at least the axis
you are working on) over to a buffer (the copyswapn functions will do
that using a pretty fast approach for each data-type). This is
ultimately how the ufuncs work (though the buffer-size is fixed so the
data is copied and operated on in chunks).
More information about the Numpy-discussion