[Numpy-svn] r8148 - trunk/doc
Sat Feb 20 12:08:28 CST 2010
Date: 2010-02-20 12:08:27 -0600 (Sat, 20 Feb 2010)
New Revision: 8148
3K: doc: update Py3K port documentation
--- trunk/doc/Py3K.txt 2010-02-20 18:08:14 UTC (rev 8147)
+++ trunk/doc/Py3K.txt 2010-02-20 18:08:27 UTC (rev 8148)
@@ -59,6 +59,15 @@
* Only unicode dtype field titles are included in fields dict.
+* :pep:`3118` buffer objects behave differently from Py2 buffer objects
+ when passed as an argument to `array(...)` or `asarray(...)`.
+ In Py2, they were cast to an object array.
+ In Py3, they are cast in the same way as objects that have an
+ ``__array_interface__`` attribute, i.e., they behave as if they were
+ an ndarray view on the data.
Check for any other changes ... This we want in the end to include
@@ -317,8 +326,8 @@
Py_TPFLAGS_HAVE_CLASS in the type flag.
PyBuffer usage is widely spread in multiarray:
@@ -335,33 +344,13 @@
for generic array scalars. The generic array scalar exporter, however,
doesn't currently produce format strings, which needs to be fixed.
-Currently, the format string and some of the memory is cached in the
-PyArrayObject structure. This is partly needed because of Python bug #7433.
Also, some code stops working when ``bf_releasebuffer`` is
defined. Most importantly, ``PyArg_ParseTuple("s#", ...)`` refuses to
return a buffer if ``bf_releasebuffer`` is present. For this reason,
the buffer interface for arrays is implemented currently *without*
defining ``bf_releasebuffer`` at all. This forces us to go through
-some additional contortions. But basically, since the strides and shape
-of an array are locked when references to it are held, we can do with
-a single allocated ``Py_ssize_t`` shape+strides buffer.
+some additional work.
-The buffer format string is currently cached in the ``dtype`` object.
-Currently, there's a slight problem as dtypes are not immutable --
-the names of the fields can be changed. Right now, this issue is
-just ignored, and the field names in the buffer format string are
-From the consumer side, the new buffer protocol is mostly backward
-compatible with the old one, so little needs to be done here to retain
-basic functionality. However, we *do* want to make use of the new
-features, at least in `multiarray.frombuffer` and maybe in `multiarray.array`.
-Since there is a native buffer object in Py3, the `memoryview`, the
-`newbuffer` and `getbuffer` functions are removed from `multiarray` in
-Py3: their functionality is taken over by the new `memoryview` object.
There are a couple of places that need further attention:
@@ -401,7 +390,10 @@
+ Produce PEP 3118 format strings for array scalar objects.
Is there a cleaner way out of the ``bf_releasebuffer`` issue? It
@@ -411,50 +403,90 @@
It seems we should submit patches to Python on this. At least "s#"
implementation on Py3 won't work at all, since the old buffer
- interface is no more present.
+ interface is no longer present. But perhaps Py3 users should just give
+ up using "s#" in ParseTuple and use the 3118 interface instead.
- Find a way around the dtype mutability issue.
+ Make ndarray shape and strides natively Py_ssize_t?
- Note that we cannot just realloc the format string when the names
- are changed: this would invalidate any existing buffer
- interfaces. And since we can't define ``bf_releasebuffer``, we
- don't know if there are any buffer interfaces present.
- One solution would be to alloc a "big enough" buffer at the
- beginning, and not change it after that. We could also make the
- strides etc. in the ``buffer_info`` structure static size. There's
- MAXDIMS present after all.
+There are two places in which we may want to be able to consume buffer
+objects and cast them to ndarrays:
- Take a second look at places that used PyBuffer_FromMemory and
- PyBuffer_FromReadWriteMemory -- what can be done with these?
+1) `multiarray.frombuffer`, i.e., ``PyArray_FromBuffer``
+ `frombuffer` returns only arrays of a fixed dtype. It does not
+ make sense to support PEP 3118 at this location, since not much
+ would be gained from that -- the backward compatibility functions
+ using the old array interface still work.
- Implement support for consuming new buffer objects.
- Probably in multiarray.frombuffer? Perhaps also in multiarray.array?
+ So no changes needed here.
+2) `multiarray.array`, i.e., ``PyArray_FromAny``
- make ndarray shape and strides natively Py_ssize_t
+ In general, we would like to handle :pep:`3118` buffers in the same way
+ as ``__array_interface__`` objects, so we want to be able to cast
+ them to arrays already in ``PyArray_FromAny``.
+ Hence, ``PyArray_FromAny`` needs additions.
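The difference between the two entry points above can be sketched at the
Python level. This is only an illustration, assuming a NumPy build in which
the :pep:`3118` consumer support described here is in place; the variable
names are made up for the example. ``frombuffer`` reinterprets the raw bytes
with whatever dtype the caller passes, while ``asarray`` takes the element
type from the buffer's own format string.

```python
import array
import numpy as np

# A stdlib object that exports a PEP 3118 buffer with format 'd' (C double).
src = array.array('d', [1.0, 2.0, 3.0])

# frombuffer: the caller fixes the dtype; the buffer's format is ignored.
raw = np.frombuffer(src, dtype=np.uint8)

# asarray: the dtype is taken from the buffer's format string.
typed = np.asarray(src)

print(raw.size, raw.dtype)
print(typed.shape, typed.dtype)
```

Here ``raw`` has one element per byte of the underlying storage, whereas
``typed`` has one element per double.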
+There are a few caveats in allowing :pep:`3118` buffers in
+``PyArray_FromAny``:
+a) `bytes` (and `str` on Py2) objects offer a buffer interface that
+ specifies them as a 1-D array of bytes.
+ Previously, ``PyArray_FromAny`` has cast these to 'S#' dtypes. We
+ don't want to change this, since it would cause problems in many places.
+ We do, however, want to allow other objects that provide 1-D byte arrays
+ to be cast to 1-D ndarrays and not 'S#' arrays -- for instance, 'S#'
+ arrays tend to strip trailing NUL characters.
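A small Python-level sketch of this caveat, assuming a NumPy build with the
buffer handling described here (the names ``as_string`` and ``as_bytes`` are
illustrative only): the same four bytes give a string array when passed as a
`bytes` object, and a 1-D byte array when passed through another buffer
exporter such as `memoryview`.

```python
import numpy as np

data = b"ab\x00\x00"

# A bytes object becomes a 0-d string ('S') array, not a byte sequence.
as_string = np.array(data)

# The same bytes exposed through a memoryview are not a `bytes` object,
# so they are consumed through the buffer interface as 1-D uint8 data.
as_bytes = np.asarray(memoryview(data))

print(as_string.shape, as_string.dtype.kind, as_string.item())
print(as_bytes.shape, as_bytes.dtype, as_bytes.tolist())
```

Reading the element back from the string array drops the trailing NULs,
while the byte array keeps all four values.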
+So what is done in ``PyArray_FromAny`` currently is that:
+- The presence of the :pep:`3118` buffer interface is checked before
+ checking for the array interface. If it is present *and* the object is
+ not a `bytes` object, then it is used for creating a view on the buffer.
+- We also check in ``discover_depth`` and ``_array_find_type`` for the
+ 3118 buffers, so that::
+ will treat the object in the same way as it would handle an `ndarray`.
+ However, again, bytes (and unicode) have priority and will not be
+ handled as buffer objects.
+This amounts to possible semantic changes:
+- ``array(buffer)`` will no longer create an object array
+ ``array([buffer], dtype='O')``, but will instead expand to a view
+ on the buffer.
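The view semantics can be checked with a short Python sketch. It assumes a
NumPy build with the behaviour described here, and it uses ``asarray``
rather than ``array`` because ``array`` copies its input by default; the
names ``backing``, ``buf`` and ``view`` are made up for the example.

```python
import numpy as np

backing = bytearray(b"\x01\x02\x03")
buf = memoryview(backing)

# asarray does not copy: the result is a view on the exporter's memory,
# with a numeric dtype rather than dtype='O'.
view = np.asarray(buf)

backing[0] = 9          # change the underlying data ...
first = int(view[0])    # ... and the array sees the change

print(view.dtype, view.shape, first)
```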
- Revise the decision on where to cache the format string -- dtype
- would be a better place for this.
+ Take a second look at places that used PyBuffer_FromMemory and
+ PyBuffer_FromReadWriteMemory -- what can be done with these?
There's some buffer code in numarray/_capi.c that needs to be addressed.
- Does altering the PyArrayObject structure require bumping the ABI?
+Since there is a native buffer object in Py3, the `memoryview`, the
+`newbuffer` and `getbuffer` functions are removed from `multiarray` in
+Py3: their functionality is taken over by the new `memoryview` object.
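As a rough illustration of `memoryview` taking over this role, the following
Python sketch wraps an ndarray directly. It assumes an ndarray that exports
the :pep:`3118` buffer interface with a format string, as this document
describes; the names ``a`` and ``m`` are illustrative only.

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.float64)

# memoryview is the Py3 built-in that consumes the PEP 3118 buffer
# exported by the ndarray.
m = memoryview(a)

print(m.format, m.shape, m.strides, m.itemsize)
```

The shape, strides and item size reported by the `memoryview` come from the
array itself, so no NumPy-specific buffer object is needed.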