[Numpy-svn] r5730 - in trunk/doc: . neps

numpy-svn@scip... numpy-svn@scip...
Sat Aug 30 22:29:08 CDT 2008


Author: jarrod.millman
Date: 2008-08-30 22:28:56 -0500 (Sat, 30 Aug 2008)
New Revision: 5730

Added:
   trunk/doc/neps/datetime.rst
   trunk/doc/neps/npy-format.txt
   trunk/doc/neps/pep_buffer.txt
Removed:
   trunk/doc/npy-format.txt
   trunk/doc/pep_buffer.txt
Log:
moving and adding neps


Added: trunk/doc/neps/datetime.rst
===================================================================
--- trunk/doc/neps/datetime.rst	2008-08-31 03:21:01 UTC (rev 5729)
+++ trunk/doc/neps/datetime.rst	2008-08-31 03:28:56 UTC (rev 5730)
@@ -0,0 +1,354 @@
+====================================================================
+ A (second) proposal for implementing some date/time types in NumPy
+====================================================================
+
+:Author: Francesc Alted i Abad
+:Contact: faltet@pytables.com
+:Author: Ivan Vilata i Balaguer
+:Contact: ivan@selidor.net
+:Date: 2008-07-16
+
+
+Executive summary
+=================
+
+A date/time mark is very handy to have in many fields where one has
+to deal with data sets.  While Python has several modules that define
+a date/time type (such as the integrated ``datetime`` [1]_ or
+``mx.DateTime`` [2]_), NumPy lacks one.
+
+In this document, we propose the addition of a series of date/time
+types to fill this gap.  The requirements for the proposed types are
+twofold: 1) they have to be fast to operate with and 2) they have to
+be as compatible as possible with the existing ``datetime`` module
+that comes with Python.
+
+
+Types proposed
+==============
+
+To start with, it is virtually impossible to come up with a single
+date/time type that fills the needs of every use case.  So, after
+pondering different possibilities, we have settled on *two* different
+types, namely ``datetime64`` and ``timedelta64`` (these names are
+preliminary and can be changed), which can have different resolutions
+so as to cover different needs.
+
+**Important note:** the resolution is conceived here as metadata that
+*complements* a date/time dtype, *without changing the base type*.
+
+A detailed description of the proposed types follows.
+
+
+``datetime64``
+--------------
+
+It represents an absolute (i.e. not relative) time.  It is
+implemented internally as an ``int64`` type.  The internal epoch is
+the POSIX epoch (see [3]_).
+
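The epoch-based representation can be sketched with the standard library alone (an illustration of the idea, not the proposed implementation; the helper names are ours); here an int64 count at day resolution is mapped to and from calendar dates:

```python
# Illustration only: an absolute time stored as an int64 count of
# units since the POSIX epoch, shown here at day ("d") resolution.
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # the POSIX epoch is day 0

def to_date(ticks):
    """Interpret an int64 at [d] resolution as a calendar date."""
    return EPOCH + timedelta(days=ticks)

def from_date(d):
    """Encode a calendar date as an int64 day count since the epoch."""
    return (d - EPOCH).days

assert to_date(0) == date(1970, 1, 1)
assert to_date(from_date(date(2008, 7, 16))) == date(2008, 7, 16)
```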
+Resolution
+~~~~~~~~~~
+
+It accepts different resolutions, and for each of these resolutions
+it will support a different time span.  The table below describes the
+supported resolutions with their corresponding time spans.
+
++----------------------+----------------------------------+
+|     Resolution       |         Time span (years)        |
++----------------------+----------------------------------+
+|  Code |   Meaning    |                                  |
++======================+==================================+
+|   Y   |  year        |      [9.2e18 BC, 9.2e18 AD]      |
+|   Q   |  quarter     |      [2.3e18 BC, 2.3e18 AD]      |
+|   M   |  month       |      [7.6e17 BC, 7.6e17 AD]      |
+|   W   |  week        |      [1.7e17 BC, 1.7e17 AD]      |
+|   d   |  day         |      [2.5e16 BC, 2.5e16 AD]      |
+|   h   |  hour        |      [1.0e15 BC, 1.0e15 AD]      |
+|   m   |  minute      |      [1.7e13 BC, 1.7e13 AD]      |
+|   s   |  second      |      [2.9e11 BC, 2.9e11 AD]      |
+|   ms  |  millisecond |      [ 2.9e8 BC,  2.9e8 AD]      |
+|   us  |  microsecond |      [290301 BC, 294241 AD]      |
+|   ns  |  nanosecond  |      [  1678 AD,   2262 AD]      |
++----------------------+----------------------------------+
+
+Building a ``datetime64`` dtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The proposed ways to specify the resolution in the dtype constructor
+are:
+
+Using parameters in the constructor::
+
+  dtype('datetime64', res="us")  # the default res. is microseconds
+
+Using the long string notation::
+
+  dtype('datetime64[us]')   # equivalent to dtype('datetime64')
+
+Using the short string notation::
+
+  dtype('T8[us]')   # equivalent to dtype('T8')
+
+Compatibility issues
+~~~~~~~~~~~~~~~~~~~~
+
+This will be fully compatible with the ``datetime`` class of Python's
+``datetime`` module only when using a resolution of microseconds.
+For other resolutions, the conversion process will lose precision or
+overflow as needed.
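The precision loss can be illustrated with the standard library alone (an illustration of the effect, not the proposed conversion code):

```python
# Illustration: a datetime carrying microseconds, stored at [s]
# resolution, cannot round-trip; the sub-second part is truncated.
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
dt = datetime(2008, 7, 16, 13, 39, 25, 315000)

delta = dt - EPOCH
us = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
s = us // 10**6           # convert [us] -> [s]: integer truncation

assert s * 10**6 != us    # the 315000 microseconds are lost
```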
+
+
+``timedelta64``
+---------------
+
+It represents a relative (i.e. not absolute) time.  It is implemented
+internally as an ``int64`` type.
+
+Resolution
+~~~~~~~~~~
+
+It accepts different resolutions, and for each of these resolutions
+it will support a different time span.  The table below describes the
+supported resolutions with their corresponding time spans.
+
++----------------------+--------------------------+
+|     Resolution       |         Time span        |
++----------------------+--------------------------+
+|  Code |   Meaning    |                          |
++======================+==========================+
+|   W   |  week        |      +- 1.7e17 years     |
+|   D   |  day         |      +- 2.5e16 years     |
+|   h   |  hour        |      +- 1.0e15 years     |
+|   m   |  minute      |      +- 1.7e13 years     |
+|   s   |  second      |      +- 2.9e11 years     |
+|   ms  |  millisecond |      +- 2.9e8 years      |
+|   us  |  microsecond |      +- 2.9e5 years      |
+|   ns  |  nanosecond  |      +- 292 years        |
+|   ps  |  picosecond  |      +- 106 days         |
+|   fs  |  femtosecond |      +- 2.6 hours        |
+|   as  |  attosecond  |      +- 9.2 seconds      |
++----------------------+--------------------------+
+
+Building a ``timedelta64`` dtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The proposed ways to specify the resolution in the dtype constructor
+are:
+
+Using parameters in the constructor::
+
+  dtype('timedelta64', res="us")  # the default res. is microseconds
+
+Using the long string notation::
+
+  dtype('timedelta64[us]')   # equivalent to dtype('timedelta64')
+
+Using the short string notation::
+
+  dtype('t8[us]')   # equivalent to dtype('t8')
+
+Compatibility issues
+~~~~~~~~~~~~~~~~~~~~
+
+This will be fully compatible with the ``timedelta`` class of
+Python's ``datetime`` module only when using a resolution of
+microseconds.  For other resolutions, the conversion process will
+lose precision or overflow as needed.
+
+
+Example of use
+==============
+
+Here is an example of use of ``datetime64``::
+
+  In [10]: t = numpy.zeros(5, dtype="datetime64[ms]")
+
+  In [11]: t[0] = datetime.datetime.now()  # setter in action
+
+  In [12]: t[0]
+  Out[12]: '2008-07-16T13:39:25.315'   # representation in ISO 8601 format
+
+  In [13]: print t
+  [2008-07-16T13:39:25.315  1970-01-01T00:00:00.0
+  1970-01-01T00:00:00.0  1970-01-01T00:00:00.0  1970-01-01T00:00:00.0]
+
+  In [14]: t[0].item()     # getter in action
+  Out[14]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000)
+
+  In [15]: print t.dtype
+  datetime64[ms]
+
+And here is an example of use of ``timedelta64``::
+
+  In [8]: t1 = numpy.zeros(5, dtype="datetime64[s]")
+
+  In [9]: t2 = numpy.ones(5, dtype="datetime64[s]")
+
+  In [10]: t = t2 - t1
+
+  In [11]: t[0] = 24  # setter in action (setting to 24 seconds)
+
+  In [12]: t[0]
+  Out[12]: 24       # representation as an int64
+
+  In [13]: print t
+  [24  1  1  1  1]
+
+  In [14]: t[0].item()     # getter in action
+  Out[14]: datetime.timedelta(0, 24)
+
+  In [15]: print t.dtype
+  timedelta64[s]
+
+
+Operating with date/time arrays
+===============================
+
+``datetime64`` vs ``datetime64``
+--------------------------------
+
+The only operation allowed between absolute dates is subtraction::
+
+  In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8")
+  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])
+
+But not other operations::
+
+  In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8")
+  TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'
+
+``datetime64`` vs ``timedelta64``
+---------------------------------
+
+It will be possible to add and subtract relative times from absolute
+dates::
+
+  In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
+  Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])
+
+  In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
+  Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])
+
+But not other operations::
+
+  In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
+  TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray'
+
+``timedelta64`` vs anything
+---------------------------
+
+Finally, it will be possible to operate with relative times as if they
+were regular int64 dtypes *as long as* the result can be converted back
+into a ``timedelta64``::
+
+  In [10]: numpy.ones(5, 't8')
+  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])
+
+  In [11]: (numpy.ones(5, 't8[M]') + 2) ** 3
+  Out[11]: array([27, 27, 27, 27, 27], dtype=timedelta64[M])
+
+But::
+
+  In [12]: numpy.ones(5, 't8') + 1j
+  TypeError: The result cannot be converted into a ``timedelta64``
+
+
+dtype/resolution conversions
+============================
+
+For changing the date/time dtype of an existing array, we propose to use
+the ``.astype()`` method.  This will be mainly useful for changing
+resolutions.
+
+For example, for absolute dates::
+
+  In[10]: t1 = numpy.zeros(5, dtype="datetime64[s]")
+
+  In[11]: print t1
+  [1970-01-01T00:00:00  1970-01-01T00:00:00  1970-01-01T00:00:00
+   1970-01-01T00:00:00  1970-01-01T00:00:00]
+
+  In[12]: print t1.astype('datetime64[d]')
+  [1970-01-01  1970-01-01  1970-01-01  1970-01-01  1970-01-01]
+
+For relative times::
+
+  In[10]: t1 = numpy.ones(5, dtype="timedelta64[s]")
+
+  In[11]: print t1
+  [1 1 1 1 1]
+
+  In[12]: print t1.astype('timedelta64[ms]')
+  [1000 1000 1000 1000 1000]
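The resolution change shown above amounts to rescaling the underlying int64 counts; a rough sketch of the rule (our assumption about the mechanics, not the actual implementation):

```python
# Assumed scale factors (units per second) for a few resolutions.
PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}

def convert(ticks, src, dst):
    """Rescale int64 tick counts from resolution src to dst.

    Exact when dst is finer than src; truncates (loses precision)
    when dst is coarser, as .astype() would.
    """
    num, den = PER_SECOND[dst], PER_SECOND[src]
    return [t * num // den for t in ticks]

assert convert([1, 1, 1, 1, 1], "s", "ms") == [1000] * 5
assert convert([1500], "ms", "s") == [1]   # truncation
```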
+
+Converting directly between relative and absolute dtypes will not be
+supported::
+
+  In[13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
+  TypeError: data type cannot be converted to the desired type
+
+
+Final considerations
+====================
+
+Why the ``origin`` metadata disappeared
+---------------------------------------
+
+During the discussion of the date/time dtypes in the NumPy list, the
+idea of having an ``origin`` metadata that complemented the definition
+of the absolute ``datetime64`` was initially found to be useful.
+
+However, after thinking more about this, Ivan and I find that the
+combination of an absolute ``datetime64`` with a relative
+``timedelta64`` offers the same functionality while removing the
+need for the additional ``origin`` metadata.  This is why we have
+removed it from this proposal.
+
+
+Resolution and dtype issues
+---------------------------
+
+The date/time dtype's resolution metadata cannot be used in general as
+part of typical dtype usage.  For example, in::
+
+  numpy.zeros(5, dtype=numpy.datetime64)
+
+we have not yet found a sensible way to pass the resolution.  Perhaps
+the following would work::
+
+  numpy.zeros(5, dtype=numpy.datetime64(res='Y'))
+
+but we are not sure if this would collide with the spirit of the NumPy
+dtypes.
+
+At any rate, one can always do::
+
+  numpy.zeros(5, dtype=numpy.dtype('datetime64', res='Y'))
+
+BTW, prior to all of this, one should also elucidate whether::
+
+  numpy.dtype('datetime64', res='Y')
+
+or::
+
+   numpy.dtype('datetime64[Y]')
+   numpy.dtype('T8[Y]')
+
+would be a consistent way to instantiate a dtype in NumPy.  We really
+think this could be a good way, but we would need to hear the opinion
+of the experts.  Travis?
+
+
+
+.. [1] http://docs.python.org/lib/module-datetime.html
+.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
+.. [3] http://en.wikipedia.org/wiki/Unix_time
+
+
+.. Local Variables:
+.. mode: rst
+.. coding: utf-8
+.. fill-column: 72
+.. End:
+

Copied: trunk/doc/neps/npy-format.txt (from rev 5726, trunk/doc/npy-format.txt)

Copied: trunk/doc/neps/pep_buffer.txt (from rev 5726, trunk/doc/pep_buffer.txt)

Deleted: trunk/doc/npy-format.txt
===================================================================
--- trunk/doc/npy-format.txt	2008-08-31 03:21:01 UTC (rev 5729)
+++ trunk/doc/npy-format.txt	2008-08-31 03:28:56 UTC (rev 5730)
@@ -1,294 +0,0 @@
-Title: A Simple File Format for NumPy Arrays
-Discussions-To: numpy-discussion@mail.scipy.org
-Version: $Revision$
-Last-Modified: $Date$
-Author: Robert Kern <robert.kern@gmail.com>
-Status: Draft
-Type: Standards Track
-Content-Type: text/plain
-Created: 20-Dec-2007
-
-
-Abstract
-
-    We propose a standard binary file format (NPY) for persisting
-    a single arbitrary NumPy array on disk.  The format stores all of
-    the shape and dtype information necessary to reconstruct the array
-    correctly even on another machine with a different architecture.
-    The format is designed to be as simple as possible while achieving
-    its limited goals.  The implementation is intended to be pure
-    Python and distributed as part of the main numpy package.
-
-
-Rationale
-
-    A lightweight, omnipresent system for saving NumPy arrays to disk
-    is a frequent need.  Python in general has pickle [1] for saving
-    most Python objects to disk.  This often works well enough with
-    NumPy arrays for many purposes, but it has a few drawbacks:
-
-    - Dumping or loading a pickle file requires the duplication of the
-      data in memory.  For large arrays, this can be a showstopper.
-
-    - The array data is not directly accessible through
-      memory-mapping.  Now that numpy has that capability, it has
-      proved very useful for loading large amounts of data (or more to
-      the point: avoiding loading large amounts of data when you only
-      need a small part).
-
-    Both of these problems can be addressed by dumping the raw bytes
-    to disk using ndarray.tofile() and numpy.fromfile().  However,
-    these have their own problems:
-
-    - The data which is written has no information about the shape or
-      dtype of the array.
-
-    - It is incapable of handling object arrays.
-
-    The NPY file format is an evolutionary advance over these two
-    approaches.  Its design is mostly limited to solving the problems
-    with pickles and tofile()/fromfile().  It does not intend to solve
-    more complicated problems for which more complicated formats like
-    HDF5 [2] are a better solution.
-
-
-Use Cases
-
-    - Neville Newbie has just started to pick up Python and NumPy.  He
-      has not installed many packages, yet, nor learned the standard
-      library, but he has been playing with NumPy at the interactive
-      prompt to do small tasks.  He gets a result that he wants to
-      save.
-
-    - Annie Analyst has been using large nested record arrays to
-      represent her statistical data.  She wants to convince her
-      R-using colleague, David Doubter, that Python and NumPy are
-      awesome by sending him her analysis code and data.  She needs
-      the data to load at interactive speeds.  Since David does not
-      use Python usually, needing to install large packages would turn
-      him off.
-
-    - Simon Seismologist is developing new seismic processing tools.
-      One of his algorithms requires large amounts of intermediate
-      data to be written to disk.  The data does not really fit into
-      the industry-standard SEG-Y schema, but he already has a nice
-      record-array dtype for using it internally.
-
-    - Polly Parallel wants to split up a computation on her multicore
-      machine as simply as possible.  Parts of the computation can be
-      split up among different processes without any communication
-      between processes; they just need to fill in the appropriate
-      portion of a large array with their results.  Having several
-      child processes memory-mapping a common array is a good way to
-      achieve this.
-
-
-Requirements
-
-    The format MUST be able to:
-
-    - Represent all NumPy arrays including nested record
-      arrays and object arrays.
-
-    - Represent the data in its native binary form.
-
-    - Be contained in a single file.
-
-    - Support Fortran-contiguous arrays directly.
-
-    - Store all of the necessary information to reconstruct the array
-      including shape and dtype on a machine of a different
-      architecture.  Both little-endian and big-endian arrays must be
-      supported and a file with little-endian numbers will yield
-      a little-endian array on any machine reading the file.  The
-      types must be described in terms of their actual sizes.  For
-      example, if a machine with a 64-bit C "long int" writes out an
-      array with "long ints", a reading machine with 32-bit C "long
-      ints" will yield an array with 64-bit integers.
-
-    - Be reverse engineered.  Datasets often live longer than the
-      programs that created them.  A competent developer should be
-      able to create a solution in his preferred programming language
-      read most NPY files that he has been given without much
-      documentation.
-
-    - Allow memory-mapping of the data.
-
-    - Be read from a filelike stream object instead of an actual file.
-      This allows the implementation to be tested easily and makes the
-      system more flexible.  NPY files can be stored in ZIP files and
-      easily read from a ZipFile object.
-
-    - Store object arrays.  Since general Python objects are
-      complicated and can only be reliably serialized by pickle (if at
-      all), many of the other requirements are waived for files
-      containing object arrays.  Files with object arrays do not have
-      to be mmapable since that would be technically impossible.  We
-      cannot expect the pickle format to be reverse engineered without
-      knowledge of pickle.  However, one should at least be able to
-      read and write object arrays with the same generic interface as
-      other arrays.
-
-    - Be read and written using APIs provided in the numpy package
-      itself without any other libraries.  The implementation inside
-      numpy may be in C if necessary.
-
-    The format explicitly *does not* need to:
-
-    - Support multiple arrays in a file.  Since we require filelike
-      objects to be supported, one could use the API to build an ad
-      hoc format that supported multiple arrays.  However, solving the
-      general problem and use cases is beyond the scope of the format
-      and the API for numpy.
-
-    - Fully handle arbitrary subclasses of numpy.ndarray.  Subclasses
-      will be accepted for writing, but only the array data will be
-      written out.  A regular numpy.ndarray object will be created
-      upon reading the file.  The API can be used to build a format
-      for a particular subclass, but that is out of scope for the
-      general NPY format.
-
-
-Format Specification: Version 1.0
-
-    The first 6 bytes are a magic string: exactly "\x93NUMPY".
-
-    The next 1 byte is an unsigned byte: the major version number of
-    the file format, e.g. \x01.
-
-    The next 1 byte is an unsigned byte: the minor version number of
-    the file format, e.g. \x00.  Note: the version of the file format
-    is not tied to the version of the numpy package.
-
-    The next 2 bytes form a little-endian unsigned short int: the
-    length of the header data HEADER_LEN.
-
-    The next HEADER_LEN bytes form the header data describing the
-    array's format.  It is an ASCII string which contains a Python
-    literal expression of a dictionary.  It is terminated by a newline
-    ('\n') and padded with spaces ('\x20') to make the total length of
-    the magic string + 4 + HEADER_LEN be evenly divisible by 16 for
-    alignment purposes.
-
-    The dictionary contains three keys:
-
-        "descr" : dtype.descr
-            An object that can be passed as an argument to the
-            numpy.dtype() constructor to create the array's dtype.
-
-        "fortran_order" : bool
-            Whether the array data is Fortran-contiguous or not.
-            Since Fortran-contiguous arrays are a common form of
-            non-C-contiguity, we allow them to be written directly to
-            disk for efficiency.
-
-        "shape" : tuple of int
-            The shape of the array.
-
-    For repeatability and readability, this dictionary is formatted
-    using pprint.pformat() so the keys are in alphabetic order.
-
-    Following the header comes the array data.  If the dtype contains
-    Python objects (i.e. dtype.hasobject is True), then the data is
-    a Python pickle of the array.  Otherwise the data is the
-    contiguous (either C- or Fortran-, depending on fortran_order)
-    bytes of the array.  Consumers can figure out the number of bytes
-    by multiplying the number of elements given by the shape (noting
-    that shape=() means there is 1 element) by dtype.itemsize.
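The layout above is simple enough to parse by hand; here is a minimal sketch of a version 1.0 header reader (the helper names are ours, and the header is built by hand here rather than read from a real file):

```python
# Minimal sketch of reading the version 1.0 header described above.
import ast
import io
import struct

def read_npy_header(f):
    """Parse the NPY preamble and header dict from a binary stream."""
    magic = f.read(6)
    if magic != b"\x93NUMPY":
        raise ValueError("not an NPY file")
    major, minor = f.read(1)[0], f.read(1)[0]
    (header_len,) = struct.unpack("<H", f.read(2))  # LE unsigned short
    header = f.read(header_len).decode("ascii")
    return (major, minor), ast.literal_eval(header)  # the dict literal

# Build a tiny conforming stream by hand to exercise the reader.
d = "{'descr': '<i8', 'fortran_order': False, 'shape': (3,)}"
pad = 16 - ((6 + 4 + len(d) + 1) % 16)    # pad total to multiple of 16
header = d + " " * pad + "\n"
raw = (b"\x93NUMPY\x01\x00"
       + struct.pack("<H", len(header))
       + header.encode("ascii"))

version, meta = read_npy_header(io.BytesIO(raw))
assert version == (1, 0) and meta["shape"] == (3,)
```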
-
-
-Conventions
-
-    We recommend using the ".npy" extension for files following this
-    format.  This is by no means a requirement; applications may wish
-    to use this file format but use an extension specific to the
-    application.  In the absence of an obvious alternative, however,
-    we suggest using ".npy".
-
-    For a simple way to combine multiple arrays into a single file,
-    one can use ZipFile to contain multiple ".npy" files.  We
-    recommend using the file extension ".npz" for these archives.
-
-
-Alternatives
-
-    The author believes that this system (or one along these lines) is
-    about the simplest system that satisfies all of the requirements.
-    However, one must always be wary of introducing a new binary
-    format to the world.
-
-    HDF5 [2] is a very flexible format that should be able to
-    represent all of NumPy's arrays in some fashion.  It is probably
-    the only widely-used format that can faithfully represent all of
-    NumPy's array features.  It has seen substantial adoption by the
-    scientific community in general and the NumPy community in
-    particular.  It is an excellent solution for a wide variety of
-    array storage problems with or without NumPy.
-
-    HDF5 is a complicated format that more or less implements
-    a hierarchical filesystem-in-a-file.  This fact makes satisfying
-    some of the Requirements difficult.  To the author's knowledge, as
-    of this writing, there is no application or library that reads or
-    writes even a subset of HDF5 files that does not use the canonical
-    libhdf5 implementation.  This implementation is a large library
-    that is not always easy to build.  It would be infeasible to
-    include it in numpy.
-
-    It might be feasible to target an extremely limited subset of
-    HDF5.  Namely, there would be only one object in it: the array.
-    Using contiguous storage for the data, one should be able to
-    implement just enough of the format to provide the same metadata
-    that the proposed format does.  One could still meet all of the
-    technical requirements like mmapability.
-
-    We would accrue a substantial benefit by being able to generate
-    files that could be read by other HDF5 software.  Furthermore, by
-    providing the first non-libhdf5 implementation of HDF5, we would
-    be able to encourage more adoption of simple HDF5 in applications
-    where it was previously infeasible because of the size of the
-    library.  The basic work may encourage similar dead-simple
-    implementations in other languages and further expand the
-    community.
-
-    The remaining concern is about reverse engineerability of the
-    format.  Even the simple subset of HDF5 would be very difficult to
-    reverse engineer given just a file by itself.  However, given the
-    prominence of HDF5, this might not be a substantial concern.
-
-    In conclusion, we are going forward with the design laid out in
-    this document.  If someone writes code to handle the simple subset
-    of HDF5 that would be useful to us, we may consider a revision of
-    the file format.
-
-
-Implementation
-
-    The current implementation is in the trunk of the numpy SVN
-    repository and will be part of the 1.0.5 release.
-
-        http://svn.scipy.org/svn/numpy/trunk
-
-    Specifically, the file format.py in this directory implements the
-    format as described here.
-
-
-References
-
-    [1] http://docs.python.org/lib/module-pickle.html
-
-    [2] http://hdf.ncsa.uiuc.edu/products/hdf5/index.html
-
-
-Copyright
-
-    This document has been placed in the public domain.
-
-
-
-Local Variables:
-mode: indented-text
-indent-tabs-mode: nil
-sentence-end-double-space: t
-fill-column: 70
-coding: utf-8
-End:

Deleted: trunk/doc/pep_buffer.txt
===================================================================
--- trunk/doc/pep_buffer.txt	2008-08-31 03:21:01 UTC (rev 5729)
+++ trunk/doc/pep_buffer.txt	2008-08-31 03:28:56 UTC (rev 5730)
@@ -1,869 +0,0 @@
-:PEP: 3118
-:Title: Revising the buffer protocol
-:Version: $Revision$
-:Last-Modified: $Date$
-:Authors: Travis Oliphant <oliphant@ee.byu.edu>, Carl Banks <pythondev@aerojockey.com>
-:Status: Draft
-:Type: Standards Track
-:Content-Type: text/x-rst
-:Created: 28-Aug-2006
-:Python-Version: 3000
-
-Abstract
-========
-
-This PEP proposes re-designing the buffer interface (PyBufferProcs
-function pointers) to improve the way Python allows memory sharing
-in Python 3.0.
-
-In particular, it is proposed that the character buffer portion
-of the API be eliminated and the multiple-segment portion be
-re-designed in conjunction with allowing for strided memory
-to be shared.   In addition, the new buffer interface will
-allow the sharing of any multi-dimensional nature of the
-memory and what data-format the memory contains.
-
-This interface will allow any extension module to either
-create objects that share memory or create algorithms that
-use and manipulate raw memory from arbitrary objects that
-export the interface.
-
-
-Rationale
-=========
-
-The Python 2.X buffer protocol allows different Python types to
-exchange a pointer to a sequence of internal buffers.  This
-functionality is *extremely* useful for sharing large segments of
-memory between different high-level objects, but it is too limited and
-has issues:
-
-1. There is the little used "sequence-of-segments" option
-   (bf_getsegcount) that is not well motivated.
-
-2. There is the apparently redundant character-buffer option
-   (bf_getcharbuffer)
-
-3. There is no way for a consumer to tell the buffer-API-exporting
-   object it is "finished" with its view of the memory and
-   therefore no way for the exporting object to be sure that it is
-   safe to reallocate the pointer to the memory that it owns (for
-   example, the array object reallocating its memory after sharing
-   it with the buffer object which held the original pointer led
-   to the infamous buffer-object problem).
-
-4. Memory is just a pointer with a length. There is no way to
-   describe what is "in" the memory (float, int, C-structure, etc.)
-
-5. There is no shape information provided for the memory.  But,
-   several array-like Python types could make use of a standard
-   way to describe the shape-interpretation of the memory
-   (wxPython, GTK, pyQT, CVXOPT, PyVox, Audio and Video
-   Libraries, ctypes, NumPy, data-base interfaces, etc.)
-
-6. There is no way to share discontiguous memory (except through
-   the sequence of segments notion).
-
-   There are two widely used libraries that use the concept of
-   discontiguous memory: PIL and NumPy.  Their view of discontiguous
-   arrays is different, though.  The proposed buffer interface allows
-   sharing of either memory model.  Exporters will use only one
-   approach and consumers may choose to support discontiguous
-   arrays of each type however they choose.
-
-   NumPy uses the notion of constant striding in each dimension as its
-   basic concept of an array. With this concept, a simple sub-region
-   of a larger array can be described without copying the data.
-   Thus, stride information is the additional information that must be
-   shared.
-
-   The PIL uses a more opaque memory representation. Sometimes an
-   image is contained in a contiguous segment of memory, but sometimes
-   it is contained in an array of pointers to the contiguous segments
-   (usually lines) of the image.  The PIL is where the idea of multiple
-   buffer segments in the original buffer interface came from.
-
-   NumPy's strided memory model is used more often in computational
-   libraries and because it is so simple it makes sense to support
-   memory sharing using this model.  The PIL memory model is sometimes
-   used in C-code where a 2-d array can be then accessed using double
-   pointer indirection:  e.g. image[i][j].
-
-   The buffer interface should allow the object to export either of these
-   memory models.  Consumers are free to either require contiguous memory
-   or write code to handle one or both of these memory models.
-
-Proposal Overview
-=================
-
-* Eliminate the char-buffer and multiple-segment sections of the
-  buffer-protocol.
-
-* Unify the read/write versions of getting the buffer.
-
-* Add a new function to the interface that should be called when
-  the consumer object is "done" with the memory area.
-
-* Add a new variable to allow the interface to describe what is in
-  memory (unifying what is currently done now in struct and
-  array)
-
-* Add a new variable to allow the protocol to share shape information
-
-* Add a new variable for sharing stride information
-
-* Add a new mechanism for sharing arrays that must
-  be accessed using pointer indirection.
-
-* Fix all objects in the core and the standard library to conform
-  to the new interface
-
-* Extend the struct module to handle more format specifiers
-
-* Extend the buffer object into a new memory object which places
-  a Python veneer around the buffer interface.
-
-* Add a few functions to make it easy to copy contiguous data
-  in and out of object supporting the buffer interface.
-
-Specification
-=============
-
-While the new specification allows for complicated memory sharing,
-simple contiguous buffers of bytes can still be obtained from an
-object.  In fact, the new protocol allows a standard mechanism for
-doing this even if the original object is not represented as a
-contiguous chunk of memory.
-
-The easiest way to obtain a simple contiguous chunk of memory is
-to use the provided C-API to obtain a chunk of memory.
-
-
-Change the PyBufferProcs structure to
-
-::
-
-    typedef struct {
-         getbufferproc bf_getbuffer;
-         releasebufferproc bf_releasebuffer;
-    }
-
-
-::
-
-    typedef int (*getbufferproc)(PyObject *obj, PyBuffer *view, int flags)
-
-This function returns 0 on success and -1 on failure (and raises an
-error). The first variable is the "exporting" object.  The second
-argument is the address to a bufferinfo structure.  If view is NULL,
-then no information is returned but a lock on the memory is still
-obtained.  In this case, the corresponding releasebuffer should also
-be called with NULL.
-
-The third argument indicates what kind of buffer the exporter is
-allowed to return.  It essentially tells the exporter what kind of
-memory area the consumer can deal with.  It also indicates what
-members of the PyBuffer structure the consumer is going to care about.
-
-The exporter can use this information to simplify how much of the PyBuffer
-structure is filled in and/or raise an error if the object can't support
-a simpler view of its memory.
-
-Thus, the caller can request a simple "view" and either receive it or
-have an error raised if it is not possible.
-
-All of the following assume that at least buf, len, and readonly
-will always be utilized by the caller.
-
-Py_BUF_SIMPLE
-
-   The returned buffer will be assumed to be readable (the object may
-   or may not have writeable memory).  Only the buf, len, and readonly
-   variables may be accessed. The format will be assumed to be
-   unsigned bytes.  This is a "stand-alone" flag constant.  It never
-   needs to be \|'d to the others.  The exporter will raise an
-   error if it cannot provide such a contiguous buffer.
-
-Py_BUF_WRITEABLE
-
-   The returned buffer must be writeable.  If it is not writeable,
-   then raise an error.
-
-Py_BUF_READONLY
-
-   The returned buffer must be readonly.  If the object is already
-   read-only or it can make its memory read-only (and there are no
-   other views on the object) then it should do so and return the
-   buffer information.  If the object does not have read-only memory
-   (or cannot make it read-only), then an error should be raised.
-
-Py_BUF_FORMAT
-
-   The returned buffer must have true format information.  This would
-   be used when the consumer is going to be checking for what 'kind'
-   of data is actually stored.  An exporter should always be able
-   to provide this information if requested.
-
-Py_BUF_SHAPE
-
-   The returned buffer must have shape information.  The memory will
-   be assumed C-style contiguous (last dimension varies the fastest).
-   The exporter may raise an error if it cannot provide this kind
-   of contiguous buffer.
-
-Py_BUF_STRIDES (implies Py_BUF_SHAPE)
-
-   The returned buffer must have strides information. This would be
-   used when the consumer can handle strided, discontiguous arrays.
-   Handling strides automatically assumes you can handle shape.
-   The exporter may raise an error if it cannot provide a strided-only
-   representation of the data (i.e. without the suboffsets).
-
-Py_BUF_OFFSETS (implies Py_BUF_STRIDES)
-
-   The returned buffer must have suboffsets information.  This would
-   be used when the consumer can handle indirect array referencing
-   implied by these suboffsets.
-
-Py_BUF_FULL (Py_BUF_OFFSETS | Py_BUF_WRITEABLE | Py_BUF_FORMAT)
-
-Thus, the consumer simply wanting a contiguous chunk of bytes from
-the object would use Py_BUF_SIMPLE, while a consumer that understands
-how to make use of the most complicated cases could use Py_BUF_OFFSETS.
-
-If format information is going to be probed, then Py_BUF_FORMAT must
-be \|'d to the flags; otherwise, the consumer assumes the data is
-unsigned bytes.
-
-There is a C-API that simple exporting objects can use to fill-in the
-buffer info structure correctly according to the provided flags if a
-contiguous chunk of "unsigned bytes" is all that can be exported.
-
-
-The bufferinfo structure is::
-
-  typedef struct bufferinfo {
-       void *buf;
-       Py_ssize_t len;
-       int readonly;
-       const char *format;
-       int ndims;
-       Py_ssize_t *shape;
-       Py_ssize_t *strides;
-       Py_ssize_t *suboffsets;
-       int itemsize;
-       void *internal;
-  } PyBuffer;
-
-Before calling this function, the bufferinfo structure may contain
-arbitrary data.  Upon return from getbufferproc, the bufferinfo
-structure is filled in with relevant information about the buffer.
-This same bufferinfo structure must be passed to bf_releasebuffer (if
-available) when the consumer is done with the memory. The caller is
-responsible for keeping a reference to obj until releasebuffer is
-called (i.e. this call does not alter the reference count of obj).
-
-The members of the bufferinfo structure are:
-
-buf
-    a pointer to the start of the memory for the object
-
-len
-    the total number of bytes the object uses.  This should be the
-    same as the product of the entries of the shape array multiplied
-    by the number of bytes per item.
-
-readonly
-    an integer variable to hold whether or not the memory is
-    readonly.  1 means the memory is readonly, zero means the
-    memory is writeable.
-
-format
-    a NULL-terminated format-string (following the struct-style syntax
-    including extensions) indicating what is in each element of
-    memory.  The number of elements is len / itemsize, where itemsize
-    is the number of bytes implied by the format.  For standard
-    unsigned bytes use a format string of "B".
-
-ndims
-    a variable storing the number of dimensions the memory represents.
-    Must be >=0.
-
-shape
-    an array of ``Py_ssize_t`` of length ``ndims`` indicating the
-    shape of the memory as an N-D array.  Note that ``(shape[0] *
-    ... * shape[ndims-1])*itemsize == len``.  If ndims is 0 (indicating
-    a scalar), then this must be NULL.
-
-strides
-    a pointer to an array of ``Py_ssize_t`` of length ``ndims`` (or
-    NULL if ndims is 0) indicating the number of bytes to skip to get
-    to the next element in each dimension.  If this is not requested
-    by the caller (Py_BUF_STRIDES is not set), then this member of
-    the structure will not be used and the consumer assumes the array
-    is C-style contiguous.  If this is not the case, then an error
-    should be raised.  If this member is requested by the caller
-    (Py_BUF_STRIDES is set), then it must be filled in.
-
-
-suboffsets
-    a pointer to an array of ``Py_ssize_t`` of length ``ndims``.  If
-    a suboffset value is >= 0, then the value stored along the
-    indicated dimension is a pointer and the suboffset value dictates
-    how many bytes to add to the pointer after de-referencing.  A
-    suboffset value that is negative indicates that no de-referencing
-    should occur (striding in a contiguous memory block).  If all
-    suboffsets are negative (i.e. no de-referencing is needed), then
-    this must be NULL.
-
-    For clarity, here is a function that returns a pointer to the
-    element in an N-D array pointed to by an N-dimensional index when
-    there are both strides and suboffsets::
-
-      void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
-                           Py_ssize_t* suboffsets, Py_ssize_t *indices) {
-          char* pointer = (char*)buf;
-          int i;
-          for (i = 0; i < ndim; i++) {
-              pointer += strides[i]*indices[i];
-              if (suboffsets[i] >= 0) {
-                  pointer = *((char**)pointer) + suboffsets[i];
-              }
-          }
-          return (void*)pointer;
-      }
-
-    Notice the suboffset is added "after" the dereferencing occurs.
-    Thus slicing in the ith dimension would add to the suboffsets in
-    the (i-1)st dimension.  Slicing in the first dimension would change
-    the location of the starting pointer directly (i.e. buf would
-    be modified).
-
-itemsize
-    This is a storage for the itemsize of each element of the shared
-    memory.  It can be obtained using PyBuffer_SizeFromFormat but an
-    exporter may know it without making this call and thus storing it
-    is more convenient and faster.
-
-internal
-    This is for use internally by the exporting object.  For example,
-    this might be re-cast as an integer by the exporter and used to
-    store flags about whether or not the shape, strides, and suboffsets
-    arrays must be freed when the buffer is released.   The consumer
-    should never touch this value.
-
-
-The exporter is responsible for making sure the memory pointed to by
-buf, format, shape, strides, and suboffsets is valid until
-releasebuffer is called.  If the exporter wants to be able to change
-shape, strides, and/or suboffsets before releasebuffer is called then
-it should allocate those arrays when getbuffer is called (pointing to
-them in the buffer-info structure provided) and free them when
-releasebuffer is called.
-
-
-The same bufferinfo struct should be used in the release-buffer
-interface call. The caller is responsible for the memory of the
-bufferinfo structure itself.
-
-``typedef int (*releasebufferproc)(PyObject *obj, PyBuffer *view)``
-    Callers of getbufferproc must make sure that this function is
-    called when memory previously acquired from the object is no
-    longer needed.  The exporter of the interface must make sure that
-    any memory pointed to in the bufferinfo structure remains valid
-    until releasebuffer is called.
-
-    Both of these routines are optional for a type object.
-
-    If the releasebuffer function is not provided, then it never
-    needs to be called.
-
-Exporters will need to define a releasebuffer function if they can
-re-allocate their memory, strides, shape, suboffsets, or format
-variables which they might share through the struct bufferinfo.
-Several mechanisms could be used to keep track of how many getbuffer
-calls have been made and shared.  Either a single variable could be
-used to keep track of how many "views" have been exported, or a
-linked-list of bufferinfo structures filled in could be maintained in
-each object.
-
-All that is specifically required by the exporter, however, is to
-ensure that any memory shared through the bufferinfo structure remains
-valid until releasebuffer is called on the bufferinfo structure.
-
-
-Proposed new C-API calls
-========================
-
-::
-
-    int PyObject_CheckBuffer(PyObject *obj)
-
-Return 1 if the getbuffer function is available, otherwise 0.
-
-::
-
-    int PyObject_GetBuffer(PyObject *obj, PyBuffer *view,
-                           int flags)
-
-This is a C-API version of the getbuffer function call.  It checks to
-make sure the object has the required function pointer and issues the
-call.  Returns -1 and raises an error on failure and returns 0 on
-success.
-
-::
-
-    int PyObject_ReleaseBuffer(PyObject *obj, PyBuffer *view)
-
-This is a C-API version of the releasebuffer function call.  It checks
-to make sure the object has the required function pointer and issues
-the call.  Returns 0 on success and -1 (with an error raised) on
-failure. This function always succeeds if there is no releasebuffer
-function for the object.
-
-::
-
-    PyObject *PyObject_GetMemoryView(PyObject *obj)
-
-Return a memory-view object from an object that defines the buffer interface.
-
-A memory-view object is an extended buffer object that could replace
-the buffer object (but doesn't have to).  Its C-structure is
-
-::
-
-  typedef struct {
-      PyObject_HEAD
-      PyObject *base;
-      int ndims;
-      Py_ssize_t *starts;  /* slice starts */
-      Py_ssize_t *stops;   /* slice stops */
-      Py_ssize_t *steps;   /* slice steps */
-  } PyMemoryViewObject;
-
-This is functionally similar to the current buffer object except only
-a reference to base is kept.  The actual memory for base must be
-re-grabbed using the buffer protocol whenever it is needed.
-
-The getbuffer and releasebuffer for this object use the underlying
-base object (adjusted using the slice information).  If the number of
-dimensions of the base object (or the strides or the size) has changed
-when a new view is requested, then the getbuffer will trigger an error.
-
-This memory-view object will support multi-dimensional slicing.  Slices
-of the memory-view object are other memory-view objects.  When an
-"element" from the memory-view is returned, it is always a tuple of a
-bytes object and a format string, which can then be interpreted using
-the struct module if desired.
-
-::
-
-    int PyBuffer_SizeFromFormat(const char *)
-
-Return the implied itemsize of the data-format area from a struct-style
-description.
-
-::
-
-    int PyObject_GetContiguous(PyObject *obj, void **buf, Py_ssize_t *len,
-                               char **format, char fortran)
-
-Return a contiguous chunk of memory representing the buffer.  If a
-copy is made then return 1.  If no copy was needed return 0.  If an
-error occurred in probing the buffer interface, then return -1.  The
-contiguous chunk of memory is pointed to by ``*buf`` and the length of
-that memory is ``*len``.  If the object is multi-dimensional, then if
-fortran is 'F', the first dimension of the underlying array will vary
-the fastest in the buffer.  If fortran is 'C', then the last dimension
-will vary the fastest (C-style contiguous). If fortran is 'A', then it
-does not matter and you will get whatever the object decides is more
-efficient.
-
-::
-
-    int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len,
-                              char fortran)
-
-Copy ``len`` bytes of data pointed to by the contiguous chunk of
-memory pointed to by ``buf`` into the buffer exported by obj.  Return
-0 on success and return -1 and raise an error on failure.  If the
-object does not have a writeable buffer, then an error is raised.  If
-fortran is 'F', then if the object is multi-dimensional, then the data
-will be copied into the array in Fortran-style (first dimension varies
-the fastest).  If fortran is 'C', then the data will be copied into the
-array in C-style (last dimension varies the fastest).  If fortran is 'A', then
-it does not matter and the copy will be made in whatever way is more
-efficient.
-
-::
-
-    void PyBuffer_FreeMem(void *buf)
-
-This function frees the memory returned by PyObject_GetContiguous if a
-copy was made.  Do not call this function unless
-PyObject_GetContiguous returns 1, indicating that new memory was
-created.
-
-
-These last three C-API calls allow a standard way of getting data in and
-out of Python objects into contiguous memory areas no matter how it is
-actually stored.  These calls use the extended buffer interface to perform
-their work.
-
-::
-
-    int PyBuffer_IsContiguous(PyBuffer *view, char fortran);
-
-Return 1 if the memory defined by the view object is C-style (fortran = 'C')
-or Fortran-style (fortran = 'F') contiguous.  Return 0 otherwise.
-
-::
-
-    void PyBuffer_FillContiguousStrides(int ndims, Py_ssize_t *shape,
-                                        int itemsize,
-                                        Py_ssize_t *strides, char fortran)
-
-Fill the strides array with byte-strides of a contiguous (C-style if
-fortran is 'C' or Fortran-style if fortran is 'F') array of the given
-shape with the given number of bytes per element.
-
-::
-
-    int PyBuffer_FillInfo(PyBuffer *view, void *buf,
-                          Py_ssize_t len, int readonly, int infoflags)
-
-Fills in a buffer-info structure correctly for an exporter that can
-only share a contiguous chunk of memory of "unsigned bytes" of the
-given length.  Returns 0 on success and -1 (with raising an error) on
-error.
-
-
-Additions to the struct string-syntax
-=====================================
-
-The struct string-syntax is missing some characters to fully
-implement data-format descriptions already available elsewhere (in
-ctypes and NumPy for example).  The Python 2.5 specification is
-at http://docs.python.org/lib/module-struct.html
-
-Here are the proposed additions:
-
-
-================  ===========
-Character         Description
-================  ===========
-'t'               bit (number before states how many bits)
-'?'               platform _Bool type
-'g'               long double
-'c'               ucs-1 (latin-1) encoding
-'u'               ucs-2
-'w'               ucs-4
-'O'               pointer to Python Object
-'Z'               complex (whatever the next specifier is)
-'&'               specific pointer (prefix before another character)
-'T{}'             structure (detailed layout inside {})
-'(k1,k2,...,kn)'  multi-dimensional array of whatever follows
-':name:'          optional name of the preceding element
-'X{}'             pointer to a function (optional function
-                                         signature inside {})
-' \n\t'           ignored (allow better readability)
-                             -- this may already be true
-================  ===========
-
-The struct module will be changed to understand these as well and
-return appropriate Python objects on unpacking.  Unpacking a
-long-double will return a decimal object or a ctypes long-double.
-Unpacking 'u' or 'w' will return Python unicode.  Unpacking a
-multi-dimensional array will return a list (of lists if >1d).
-Unpacking a pointer will return a ctypes pointer object. Unpacking a
-function pointer will return a ctypes call-object (perhaps). Unpacking
-a bit will return a Python Bool.  White-space in the struct-string
-syntax will be ignored if it isn't already.  Unpacking a named-object
-will return some kind of named-tuple-like object that acts like a
-tuple but whose entries can also be accessed by name. Unpacking a
-nested structure will return a nested tuple.
-
-Endian-specification ('!', '@','=','>','<', '^') is also allowed
-inside the string so that it can change if needed.  The
-previously-specified endian string is in force until changed.  The
-default endian is '@' which means native data-types and alignment.  If
-un-aligned, native data-types are requested, then the endian
-specification is '^'.
-
-According to the struct-module, a number can precede a character
-code to specify how many of that type there are.  The
-(k1,k2,...,kn) extension also allows specifying if the data is
-supposed to be viewed as a (C-style contiguous, last-dimension
-varies the fastest) multi-dimensional array of a particular format.
-
-Functions should be added to ctypes to create a ctypes object from
-a struct description, and long-double and ucs-2 types should be
-added to ctypes.
-
-Examples of Data-Format Descriptions
-====================================
-
-Here are some examples of C-structures and how they would be
-represented using the struct-style syntax.
-
-<named> is the constructor for a named-tuple (not specified yet).
-
-float
-    'f' <--> Python float
-complex double
-    'Zd' <--> Python complex
-RGB Pixel data
-    'BBB' <--> (int, int, int)
-    'B:r: B:g: B:b:' <--> <named>((int, int, int), ('r','g','b'))
-
-Mixed endian (weird but possible)
-    '>i:big: <i:little:' <--> <named>((int, int), ('big', 'little'))
-
-Nested structure
-    ::
-
-        struct {
-             int ival;
-             struct {
-                 unsigned short sval;
-                 unsigned char bval;
-                 unsigned char cval;
-             } sub;
-        }
-        """i:ival:
-           T{
-              H:sval:
-              B:bval:
-              B:cval:
-            }:sub:
-        """
-Nested array
-    ::
-
-        struct {
-             int ival;
-             double data[16*4];
-        }
-        """i:ival:
-           (16,4)d:data:
-        """
-
-
-Code to be affected
-===================
-
-All objects and modules in Python that export or consume the old
-buffer interface will be modified.  Here is a partial list.
-
-* buffer object
-* bytes object
-* string object
-* array module
-* struct module
-* mmap module
-* ctypes module
-
-Anything else using the buffer API.
-
-
-Issues and Details
-==================
-
-It is intended that this PEP will be back-ported to Python 2.6 by
-adding the C-API and the two functions to the existing buffer
-protocol.
-
-The proposed locking mechanism relies entirely on the exporter object
-to not invalidate any of the memory pointed to by the buffer structure
-until a corresponding releasebuffer is called.  If it wants to be able
-to change its own shape and/or strides arrays, then it needs to create
-memory for these in the bufferinfo structure and copy information
-over.
-
-The sharing of strided memory and suboffsets is new and can be seen as
-a modification of the multiple-segment interface.  It is motivated by
-NumPy and the PIL.  NumPy objects should be able to share their
-strided memory with code that understands how to manage strided memory
-because strided memory is very common when interfacing with compute
-libraries.
-
-Also, with this approach it should be possible to write generic code
-that works with both kinds of memory.
-
-Memory management of the format string, the shape array, the strides
-array, and the suboffsets array in the bufferinfo structure is always
-the responsibility of the exporting object.  The consumer should not
-set these pointers to any other memory or try to free them.
-
-Several ideas were discussed and rejected:
-
-    Having a "releaser" object whose release-buffer was called.  This
-    was deemed unacceptable because it caused the protocol to be
-    asymmetric (you called release on something different than you
-    "got" the buffer from).  It also complicated the protocol without
-    providing a real benefit.
-
-    Passing all the struct variables separately into the function.
-    This had the advantage that it allowed one to set NULL to
-    variables that were not of interest, but it also made the function
-    call more difficult.  The flags variable allows the same
-    ability of consumers to be "simple" in how they call the protocol.
-
-Code
-========
-
-The authors of the PEP promise to contribute and maintain the code for
-this proposal but will welcome any help.
-
-
-
-
-Examples
-=========
-
-Ex. 1
------------
-
-This example shows how an image object that uses contiguous lines might expose its buffer.
-
-::
-
-  struct rgba {
-      unsigned char r, g, b, a;
-  };
-
-  struct ImageObject {
-      PyObject_HEAD;
-      ...
-      struct rgba** lines;
-      Py_ssize_t height;
-      Py_ssize_t width;
-      Py_ssize_t shape_array[2];
-      Py_ssize_t stride_array[2];
-      Py_ssize_t view_count;
-  };
-
-"lines" points to a malloced 1-D array of (struct rgba*).  Each pointer
-in THAT block points to a separately malloced array of (struct rgba).
-
-In order to access, say, the red value of the pixel at x=30, y=50, you'd use "lines[50][30].r".
-
-So what does ImageObject's getbuffer do?  Leaving error checking out::
-
-  int Image_getbuffer(PyObject *self, PyBuffer *view, int flags) {
-
-      static Py_ssize_t suboffsets[2] = { 0, -1 };
-
-      struct ImageObject *image = (struct ImageObject *)self;
-
-      view->buf = image->lines;
-      view->len = image->height * image->width * sizeof(struct rgba);
-      view->readonly = 0;
-      view->ndims = 2;
-      image->shape_array[0] = image->height;
-      image->shape_array[1] = image->width;
-      view->shape = image->shape_array;
-      image->stride_array[0] = sizeof(struct rgba*);
-      image->stride_array[1] = sizeof(struct rgba);
-      view->strides = image->stride_array;
-      view->suboffsets = suboffsets;
-
-      image->view_count++;
-
-      return 0;
-  }
-
-
-  int Image_releasebuffer(PyObject *self, PyBuffer *view) {
-      ((struct ImageObject *)self)->view_count--;
-      return 0;
-  }
-
-
-Ex. 2
------------
-
-This example shows how an object that wants to expose a contiguous
-chunk of memory (which will never be re-allocated while the object is
-alive) would do that.
-
-::
-
-  int myobject_getbuffer(PyObject *self, PyBuffer *view, int flags) {
-
-      void *buf;
-      Py_ssize_t len;
-      int readonly=0;
-
-      buf = /* pointer to the buffer */;
-      len = /* size of the buffer in bytes */;
-      readonly = /* 1 if read-only */;
-
-      return PyBuffer_FillInfo(view, buf, len, readonly, flags);
-  }
-
-No releasebuffer is necessary because the memory will never
-be re-allocated so the locking mechanism is not needed.
-
-Ex. 3
------------
-
-A consumer that wants to only get a simple contiguous chunk of bytes
-from a Python object, obj, would do the following:
-
-::
-
-  PyBuffer view;
-  int ret;
-
-  if (PyObject_GetBuffer(obj, &view, Py_BUF_SIMPLE) < 0) {
-       /* error return */
-  }
-
-  /* Now, view.buf is the pointer to memory
-          view.len is the length
-          view.readonly is whether or not the memory is read-only.
-   */
-
-
-  /* When the information is no longer needed, release the buffer */
-
-  if (PyObject_ReleaseBuffer(obj, &view) < 0) {
-          /* error return */
-  }
-
-
-Ex. 4
------------
-
-A consumer that wants to be able to use any object's memory but is
-writing an algorithm that only handles contiguous memory could do the
-following:
-
-::
-
-    void *buf;
-    Py_ssize_t len;
-    char *format;
-
-    if (PyObject_GetContiguous(obj, &buf, &len, &format, 'A') < 0) {
-       /* error return */
-    }
-
-    /* process memory pointed to by buffer if format is correct */
-
-    /* Optional: if, after processing, we want to copy data from the
-       buffer back into the object, we could do the following. */
-
-    if (PyObject_CopyToObject(obj, buf, len, 'A') < 0) {
-           /*        error return */
-    }
-
-
-Copyright
-=========
-
-This PEP is placed in the public domain.


