[SciPy-dev] Changing return types in C API

David M. Cooke cookedm at physics.mcmaster.ca
Wed Feb 1 16:35:30 CST 2006


Sasha <ndarray at mac.com> writes:

> This probably belongs to numpy-discussion.  More below.
>
> On 1/31/06, David M. Cooke <cookedm at physics.mcmaster.ca> wrote:
>>  ...
>> Could we change the API so that PyArrayObject * is used as the return
>> type?
>>
>
> -1
>
> Exposing the definition of the object structs is discouraged in Python
> code.  See for example a comment in intobject.h: /* The type PyIntObject
> is (unfortunately) exposed here so we can declare _Py_TrueStruct and
> _Py_ZeroStruct in boolobject.h; don't use this. */ In numpy this is
> necessary for performance reasons so that accessors can be implemented
> as macros.  Making return types PyArrayObject * will likely lead
> people to use direct access to struct members instead of macros.

Well, I *already* use direct access :-). That's the way you do it with
Numeric...

But I'll concede your point: the preferred way should be PyArray_DATA
and friends. I notice that CAPI.txt in numpy/doc just mentions the
macros. Changing the data member of PyArrayObject to void* would make
those easier to use.

>> On a similar note: the 'data' member of PyArrayObject's is a char *,
>> where it really should be a void *. Being a void pointer would have
>> the advantage of not needing the cast in cases like this:
>>
>> double *adata = (double)(a->data);
>
> I think you meant (double*)

Yep.

>> It would also mean that accidentally dereferencing the pointer would
>> be a compiler error: currently, a->data[0] is valid, but if a->data
>> was a void *, it wouldn't be.
>>
>
> +1
>
> Note, however, that the implicit cast from void * is illegal in C++
> and may generate warnings on some compilers (I think gcc -pedantic
> would, but I'm not sure).

As long as it's fine in the headers, C++ can use the explicit cast.
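
A rough sketch of what that explicit cast looks like on the C++ side.
MockArray is a made-up stand-in with a void * data member, not the real
PyArrayObject layout, and sum_doubles is just an illustrative name:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an array struct whose data member is void *.
// This is NOT the real PyArrayObject layout.
struct MockArray {
    void *data;
    std::size_t n;
};

// Sum the elements, assuming the buffer holds n contiguous doubles.
double sum_doubles(const MockArray &a)
{
    assert(a.data != nullptr);
    // C++ needs the explicit cast; plain C would accept
    //   double *p = a.data;
    const double *p = static_cast<const double *>(a.data);
    double total = 0.0;
    for (std::size_t i = 0; i < a.n; ++i)
        total += p[i];
    // Writing a.data[0] here would not compile: void cannot be dereferenced.
    return total;
}
```

The cast appears once, at the point where the element type is known, and
the accidental a.data[0] is rejected by the compiler.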

However, for C++ I'd suggest wrapping using templates. I've got
something like this that I used for Numeric:

template <typename T>
struct PyArray_type_info
{
  static PyArray_TYPES type() { return PyArray_NOTYPE; }
  static const char* name() { return "notype"; }
};
template <>
struct PyArray_type_info<double>
{
  static PyArray_TYPES type() { return PyArray_DOUBLE; }
  static const char* name() { return "double"; }
};

Then, something like:

template <typename T>
struct ArrayObject
{
  typedef PyArray_type_info<T> type_info;
  PyObject *a;
  T* data() { return static_cast<T *>(PyArray_DATA(a)); }
  // Add strides, flags, [] and () operators, etc.
};

template <typename T>
ArrayObject<T>
contiguous_from_any(PyObject *obj, int min, int max)
{
  // type() is a static member function, so it is called with ::, not .
  PyObject *ao = PyArray_ContiguousFromAny(obj, PyArray_type_info<T>::type(), min, max);
  ArrayObject<T> a = {ao};
  return a;
}

Then your code becomes simply:

ArrayObject<double> a = contiguous_from_any<double>(obj, 1, 1);
double *d = a.data();

Or something like that; this is untested code. But using template
specialization like this could cut down a lot of casting, and make
things safer in C++. Plus you could add refcount handling and all that :-)
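
For anyone who wants to try the idea without the Numeric/numpy headers,
here is a small sketch of the same traits pattern. Everything in it
(MockTypes, type_info_for, MockBuffer, typed_data) is a hypothetical
stand-in invented for illustration, not part of any real API:

```cpp
#include <cassert>

// Hypothetical stand-ins for the array type codes, so the sketch
// compiles without the real headers.
enum MockTypes { MOCK_NOTYPE, MOCK_DOUBLE, MOCK_INT };

// Primary template: unknown element types map to "notype".
template <typename T>
struct type_info_for
{
    static MockTypes type() { return MOCK_NOTYPE; }
    static const char *name() { return "notype"; }
};

// Specializations record the type code for each supported element type.
template <>
struct type_info_for<double>
{
    static MockTypes type() { return MOCK_DOUBLE; }
    static const char *name() { return "double"; }
};

template <>
struct type_info_for<int>
{
    static MockTypes type() { return MOCK_INT; }
    static const char *name() { return "int"; }
};

// Hypothetical untyped buffer: a type code plus a void * payload.
struct MockBuffer
{
    MockTypes type;
    void *data;
};

// Typed view: the single cast lives here, guarded by a type-code check.
template <typename T>
T *typed_data(const MockBuffer &b)
{
    assert(b.type == type_info_for<T>::type());
    return static_cast<T *>(b.data);
}
```

The point is that the cast and the type-code lookup each appear in one
place, so calling code never has to spell out either.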

> Another problem with void* is that a->data + n will become
> invalid, which is used in many places if I recall correctly.
> (a->data + n code is actually suboptimal when n is not constant
> because most CPUs have special opcodes for pointer arithmetic that
> compilers can generate if n is a compile-time constant)

I'll look for those. One trouble is the strides member of
PyArrayObject: it's a bit harder to apply a stride given in bytes
to a double * (say), although this is the same problem that you'd have
after casting from the char * to the double *.
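
One way to handle byte strides, sketched under the assumption that the
buffer really holds doubles and that the stride is a multiple of
sizeof(double) so every access is aligned (strided_get is a made-up
helper name): do the arithmetic on a char * and cast back at the end.

```cpp
#include <cassert>
#include <cstddef>

// Fetch element i of a 1-D array of doubles whose stride is given in bytes.
// The pointer arithmetic is done on char *, then the result is cast back.
double strided_get(const void *data, std::ptrdiff_t stride_bytes, std::ptrdiff_t i)
{
    assert(data != nullptr);
    const char *base = static_cast<const char *>(data);
    return *reinterpret_cast<const double *>(base + i * stride_bytes);
}
```

This works the same whether the data member is declared char * or
void *; with void * the first cast simply becomes mandatory.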

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca



