[Numpy-discussion] Memory order of array copies

Nathaniel Smith njs@pobox....
Sun Sep 30 13:17:42 CDT 2012


There are three basic Python APIs to copy an array in numpy:

  a.copy(): has always returned a C-contiguous array by default, and has
always taken an order= argument, which defaults to "C".
  np.array(a, copy=True): by default, produces an array with whatever
memory ordering 'a' had. You can also specify order="C" or order="F" to
get C or Fortran contiguity instead.
  np.copy(a): has always been a simple alias for np.array(a,
copy=True), which means that it also preserves memory ordering. BUT in
current master and the 1.7 betas, an extra argument order= has been
added, and this has been set to default to "C" ordering.
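
To make the difference concrete, here is a small sketch that checks the
contiguity flags of each kind of copy for a Fortran-ordered input. The
array shape and the variable names are made up for illustration, and
order= is passed explicitly to np.copy() because its default is exactly
the thing that differs between releases:

```python
import numpy as np

# A small Fortran-ordered (column-major) 2-D array to copy from.
a = np.asfortranarray(np.arange(6.0).reshape(2, 3))

# Method copy: normalizes to C order unless told otherwise.
via_method = a.copy()

# np.array(..., copy=True): keeps the layout of 'a' by default.
via_array = np.array(a, copy=True)

# np.copy with the order spelled out, so the result does not depend
# on which default a given release uses.
via_copy_c = np.copy(a, order="C")
via_copy_k = np.copy(a, order="K")

copies = [
    ("a", a),
    ("a.copy()", via_method),
    ("np.array(a, copy=True)", via_array),
    ('np.copy(a, order="C")', via_copy_c),
    ('np.copy(a, order="K")', via_copy_k),
]
for name, arr in copies:
    # Print (C-contiguous?, F-contiguous?) for each array.
    print(name, arr.flags["C_CONTIGUOUS"], arr.flags["F_CONTIGUOUS"])
```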

The extra argument and behavioural change occurred in 0e1a4e95. It's
not clear why; the change isn't mentioned in the commit message. The
change has to be reverted for 1.7, at least, because it caused
regressions in scikit-learn (and presumably other packages too).

So the question is, what *should* these interfaces look like? Then we
can figure out what kind of transition scheme is needed, if any.

My gut reaction is that if we want to change this at all from its
pre-1.7 status quo, it would be the opposite of the change that was
made in master... I'd expect np.copy(a) and a.copy() to return arrays
that are as nearly-identical to 'a' as possible, unless I explicitly
requested something different by passing an order= argument.
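
As a rough illustration of what "as nearly-identical as possible" could
mean, the sketch below copies a transposed 3-D array with order="K",
which asks numpy to match the source layout as closely as it can, and
compares the strides with those of a default a.copy(). The array and
the variable names are invented for the example:

```python
import numpy as np

# Transposing the first two axes gives an array that is neither
# C-contiguous nor F-contiguous.
base = np.arange(24).reshape(2, 3, 4)
a = base.transpose(1, 0, 2)

# order="K" requests a copy whose layout follows that of 'a'.
keep = a.copy(order="K")

# The default order="C" normalizes the layout instead.
norm = a.copy()

print("a   ", a.strides)
print("keep", keep.strides)
print("norm", norm.strides)
```

Both copies hold the same values as 'a' and own their data; only the
memory layout differs.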

But, I bet there is code out there that's doing something like:
  my_fragile_C_function_wrapper(a.copy())
when it really should be doing something more explicit like
  my_fragile_C_function_wrapper(np.array(a, copy=False, order="C", dtype=float))
i.e., they're depending on the current behaviour where a.copy()
normalizes order.
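
One way such a caller could state its requirement explicitly is
sketched below. It uses np.ascontiguousarray() rather than the
np.array(...) spelling above, as an alternative that copies only when
the input is not already suitable. The helper name as_c_double is
hypothetical, and float64 is assumed to be what the C function wants:

```python
import numpy as np

def as_c_double(a):
    """Return 'a' as a C-contiguous float64 array, copying only if needed.

    Hypothetical helper for illustration; not part of numpy.
    """
    return np.ascontiguousarray(a, dtype=np.float64)

f_ordered = np.asfortranarray(np.ones((2, 3)))
c_ordered = np.ones((2, 3))

fixed = as_c_double(f_ordered)  # layout differs, so a copy is made
same = as_c_double(c_ordered)   # already suitable, so no copy is needed

print(fixed.flags["C_CONTIGUOUS"], np.shares_memory(fixed, f_ordered))
print(same.flags["C_CONTIGUOUS"], np.shares_memory(same, c_ordered))
```

With this spelling the wrapper no longer depends on what any of the
copy functions happen to default to.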

I don't see any way to detect these cases and issue a proper warning,
so we may not be able to change this at all. Any ideas? Is there
anything better to do than simply revert np.copy() to its traditional
behaviour and accept that np.copy(a) and a.copy() will continue to
have different semantics indefinitely?

-n

