[Numpy-discussion] Unexpected reorganization of internal data
Chris Barker
chris.barker@noaa....
Tue Jan 31 11:23:58 CST 2012
On Tue, Jan 31, 2012 at 6:14 AM, Malcolm Reynolds
<malcolm.reynolds@gmail.com> wrote:
> Not exactly an answer to your question, but I can highly recommend
> using Boost.python, PyUblas and Ublas for your C++ vectors and
> matrices. It gives you a really good interface on the C++ side to
> numpy arrays and matrices, which can be passed in both directions over
> the language threshold with no copying.
or use Cython...
> If I had to guess I'd say sometimes when transposing numpy simply sets
> a flag internally to avoid copying the data, but in some cases (such
> as perhaps when multiplication needs to take place) the data has to be
> placed in a new object.
good guess:
> V = numpy.dot(R, U.transpose()).transpose()
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> a.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> b = a.transpose()
>>> b.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
so the transpose() simply re-arranges the strides to Fortran order,
rather than changing anything in memory.
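A minimal sketch of the same point, looking at the strides directly. The dtype is pinned to int64 here only so the stride values are predictable; that choice is an assumption of the example, not something from the original session.

```python
import numpy as np

# Same 3x2 array as in the session above, with an explicit dtype
# (assumption: int64, so each element is 8 bytes).
a = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)
b = a.transpose()

# The transpose is a view: the strides are swapped, the buffer is shared.
a_strides = a.strides                  # (16, 8)
b_strides = b.strides                  # (8, 16)
same_buffer = np.shares_memory(a, b)   # True

# Writing through the view is visible in the original array,
# because b[0, 1] and a[1, 0] are the same memory location.
b[0, 1] = 99
```

Since no data moves, a routine that walks the raw buffer sees exactly the same bytes for a and b; only the shape and strides tell them apart.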
np.dot() produces a new array, so it is C-contiguous; you then
transpose it, so you get a Fortran-ordered array.
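A small sketch of that sequence. R and U are the names from the quoted expression, but their shapes and contents here are made up for illustration.

```python
import numpy as np

# Assumed shapes: R is (2, 3) and U is (4, 3), so R . U^T is (2, 4).
R = np.arange(6, dtype=float).reshape(2, 3)
U = np.arange(12, dtype=float).reshape(4, 3)

product = np.dot(R, U.transpose())   # freshly allocated result
V = product.transpose()              # view on that result, shape (4, 2)

product_is_c = product.flags['C_CONTIGUOUS']   # True
V_is_f = V.flags['F_CONTIGUOUS']               # True
V_is_c = V.flags['C_CONTIGUOUS']               # False
V_owns_data = V.flags['OWNDATA']               # False: V is only a view
```

So the buffer behind V holds the elements of the untransposed product in row-major order, which is why code that reads the raw data assuming C order would see V "transposed".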
> Now when I call my C++ function from the Python side, all the data in V is printed, but it has been transposed.
as mentioned, if you are working with arrays in C++ (or Fortran, or C,
or...) and need to count on the ordering of the data, you need to
check it in your extension code. There are utilities for this.
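If checking in the extension itself is not convenient, one option is to normalize the layout on the Python side just before the call. A rough sketch, assuming the C++ code expects C-contiguous float64 data; prepare_for_extension is a hypothetical helper name, not part of numpy or of the original code.

```python
import numpy as np

def prepare_for_extension(arr, dtype=np.float64):
    """Hypothetical helper: return a C-contiguous array of the given dtype.

    np.ascontiguousarray only copies when the input does not already
    satisfy the layout and dtype requirements.
    """
    return np.ascontiguousarray(arr, dtype=dtype)

# A Fortran-ordered view, like the result of a transpose.
V = np.arange(6, dtype=np.float64).reshape(2, 3).transpose()
V_c = prepare_for_extension(V)            # copied into C order

# An array that is already in C order is not copied.
already_c = np.arange(6, dtype=np.float64).reshape(3, 2)
untouched = prepare_for_extension(already_c)
```

The values and shape are unchanged by the helper; only the order of the elements in memory differs between V and V_c.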
> However, if I do:
> V = numpy.array(U.transpose()).transpose()
right:
In [7]: a.flags
Out[7]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [8]: a.transpose().flags
Out[8]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [9]: np.array( a.transpose() ).flags
Out[9]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
so the np.array call doesn't re-arrange the order if it doesn't need
to. If you want to force it, you can specify the order:
In [10]: np.array( a.transpose(), order='C' ).flags
Out[10]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
(note: this does surprise me a bit, as it is making a copy, but there
you go -- if order matters, specify it)
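For comparison, a short sketch of the two calls side by side. In current numpy releases the default for the order argument of np.array is 'K', which keeps the layout of the input as closely as possible; that would explain why the plain copy stays Fortran-ordered.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
t = a.transpose()                  # Fortran-ordered view of a

kept = np.array(t)                 # copy, layout of the input preserved
forced = np.array(t, order='C')    # copy, explicitly laid out in C order

kept_is_f = kept.flags['F_CONTIGUOUS']       # True
forced_is_c = forced.flags['C_CONTIGUOUS']   # True
same_values = np.array_equal(kept, forced)   # True: only the layout differs
```

Both results own their data, so neither shares memory with a; the difference is purely in how the elements are ordered in the new buffer.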
In general, numpy does a lot of things for the sake of efficiency --
avoiding copies when it can, for instance -- this gives efficiency and
flexibility, but you do need to be careful, particularly when
interfacing with the binary data directly.
-Chris
