[SciPy-dev] Changing return types in C API
David M. Cooke
cookedm at physics.mcmaster.ca
Wed Feb 1 16:35:30 CST 2006
Sasha <ndarray at mac.com> writes:
> This probably belongs to numpy-discussion. More below.
>
> On 1/31/06, David M. Cooke <cookedm at physics.mcmaster.ca> wrote:
>> ...
>> Could we change the API so that PyArrayObject * is used as the return
>> type?
>>
>
> -1
>
> Exposing the definition of the object structs is discouraged in python
> code. See for example a comment in intobject.h /*The type PyIntObject
> is (unfortunately) exposed here so we can declare _Py_TrueStruct and
> _Py_ZeroStruct in boolobject.h; don't use this. */ In numpy this is
> necessary for performance reasons so that accessors can be implemented
> as macros. Making return types PyArrayObject * will likely lead
> people to use direct access to struct members instead of macros.
Well, I *already* use direct access :-). That's the way you do it with
Numeric...
But I'll concede your point: the preferred way should be PyArray_DATA
and friends. I notice that CAPI.txt in numpy/doc just mentions the
macros. Changing the data member of PyArrayObject to void* would make
those easier to use.
>> On a similar note: the 'data' member of PyArrayObject's is a char *,
>> where it really should be a void *. Being a void pointer would have
>> the advantage of not needing the cast in cases like this:
>>
>> double *adata = (double)(a->data);
>
> I think you meant (double*)
Yep.
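A minimal sketch of the cast in question, written so it compiles without the NumPy headers. `CharDataArray`, `VoidDataArray`, `as_doubles` and `first_element_sum` are hypothetical stand-ins invented for this illustration; only the type of the `data` member mirrors the discussion.

```cpp
#include <cassert>

// Hypothetical stand-ins for PyArrayObject; only the 'data' member matters here.
struct CharDataArray { char *data; };   // current layout: data is a char *
struct VoidDataArray { void *data; };   // proposed layout: data is a void *

// With char *, C++ needs reinterpret_cast (or a C-style cast).
inline double *as_doubles(CharDataArray &a)
{
    return reinterpret_cast<double *>(a.data);
}

// With void *, static_cast is enough in C++; in C no cast is needed at all.
inline double *as_doubles(VoidDataArray &a)
{
    return static_cast<double *>(a.data);
}

// Reads the first element through both views of the same buffer and adds them.
inline double first_element_sum(double *buffer)
{
    CharDataArray c = { reinterpret_cast<char *>(buffer) };
    VoidDataArray v = { buffer };   // double * converts to void * implicitly
    // c.data[0] compiles (and reads a single byte); v.data[0] would not compile.
    assert(as_doubles(c) == as_doubles(v));
    return as_doubles(c)[0] + as_doubles(v)[0];
}
```

With the `void *` member, an accidental `v.data[0]` is rejected by the compiler, while the `char *` member silently yields one byte of the first double.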
>> It would also mean that accidentally dereferencing the pointer would
>> be a compiler error: currently, a->data[0] is valid, but if a->data
>> was a void *, it wouldn't be.
>>
>
> +1
>
> Note, however that implicit cast is illegal in C++ and may generate
> warnings on some compilers (I think gcc -pedantic would, but not
> sure).
As long as it's fine in the headers, C++ can use the explicit cast.
However, for C++ I'd suggest wrapping using templates. I've got
something like this that I used for Numeric:
    // Fallback for element types that have no specialization.
    template <typename T>
    struct PyArray_type_info
    {
        static PyArray_TYPES type() { return PyArray_NOTYPE; }
        static const char* name() { return "notype"; }
    };

    // One specialization per supported element type.
    template <>
    struct PyArray_type_info<double>
    {
        static PyArray_TYPES type() { return PyArray_DOUBLE; }
        static const char* name() { return "double"; }
    };
Then, something like:
    template <typename T>
    struct ArrayObject
    {
        typedef PyArray_type_info<T> type_info;
        PyObject *a;
        T* data() { return static_cast<T *>(PyArray_DATA(a)); }
        // Add strides, flags, [] and () operators, etc.
    };
    template <typename T>
    ArrayObject<T>
    contiguous_from_any(PyObject *obj, int min, int max)
    {
        // type() is a static member, so it is reached with ::, not with .
        PyObject *ao = PyArray_ContiguousFromAny(obj, PyArray_type_info<T>::type(), min, max);
        ArrayObject<T> a = {ao};
        return a;
    }
Then your code becomes the simple
    ArrayObject<double> a = contiguous_from_any<double>(obj, 1, 1);
    double *d = a.data();
Or something like that; this is untested code. But using template
specialization like this could cut down a lot of casting, and make
things safer in C++. Plus you could add refcount handling and all that :-)
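For readers without the NumPy headers at hand, the same specialization idea can be tried in isolation. This is only a sketch under stated assumptions: `FakeTypeCode`, `FakeArray`, `type_info_sketch` and `ArrayView` are made-up stand-ins for `PyArray_TYPES`, `PyArrayObject`, `PyArray_type_info` and `ArrayObject`, and the type check inside `data()` is an illustrative addition rather than part of the code above.

```cpp
#include <cassert>

// Stand-in for the PyArray_TYPES enum; the values are illustrative only.
enum FakeTypeCode { FAKE_NOTYPE, FAKE_DOUBLE };

// Primary template: unknown element types map to FAKE_NOTYPE.
template <typename T>
struct type_info_sketch
{
    static FakeTypeCode type() { return FAKE_NOTYPE; }
    static const char *name() { return "notype"; }
};

// Specialization for double.
template <>
struct type_info_sketch<double>
{
    static FakeTypeCode type() { return FAKE_DOUBLE; }
    static const char *name() { return "double"; }
};

// Stand-in for an array object: an untyped buffer plus its type code.
struct FakeArray
{
    void *data;
    FakeTypeCode code;
};

// Typed view over a FakeArray; the cast lives in exactly one place.
template <typename T>
struct ArrayView
{
    FakeArray *a;

    // True when the stored type code matches the requested element type.
    bool type_matches() const { return a->code == type_info_sketch<T>::type(); }

    // Checks the type code (in debug builds) before handing out a typed pointer.
    T *data()
    {
        assert(type_matches());
        return static_cast<T *>(a->data);
    }
};
```

Calling code then names the element type once, in the template argument, and never writes a cast itself.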
> Another problem with void* is that a->data + n will become
> invalid, which is used in many places if I recall correctly.
> (a->data + n code is actually suboptimal when n is not constant
> because most CPUs have special opcodes for pointer arithmetics that
> compilers can generate if n is compile time constant)
I'll look for those. One trouble is the strides member of
PyArrayObject; it's a bit harder to apply a stride given in bytes
to a double * (say), although this is the same problem that you'd have
after casting from the char * to the double *.
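The byte-stride issue can be sketched as follows: step through the buffer as bytes, then cast back to the element type. `element_at` and `sum_every_other` are hypothetical helpers written for this illustration, not part of the NumPy C API, and the sketch assumes a one-dimensional array whose stride keeps every element properly aligned.

```cpp
#include <cassert>
#include <cstddef>

// Returns a pointer to element i of a 1-d array of T whose stride is in bytes.
// The arithmetic is done on a char * because strides are byte counts.
template <typename T>
inline T *element_at(void *data, std::ptrdiff_t stride_bytes, std::ptrdiff_t i)
{
    char *bytes = static_cast<char *>(data);
    return reinterpret_cast<T *>(bytes + i * stride_bytes);
}

// Sums `count` elements taken from every other slot of a contiguous buffer,
// which is the same as walking it with a stride of two doubles.
inline double sum_every_other(double *buffer, std::ptrdiff_t count)
{
    assert(count >= 0);
    const std::ptrdiff_t stride =
        static_cast<std::ptrdiff_t>(2 * sizeof(double));
    double total = 0.0;
    for (std::ptrdiff_t i = 0; i < count; ++i)
        total += *element_at<double>(buffer, stride, i);
    return total;
}
```

Whether `data` is declared as `char *` or `void *`, the cast to a byte pointer and back is still needed whenever the stride is not simply `sizeof(T)`.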
--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca