[Numpy-discussion] How to copy data from a C array to a numpy array efficiently?

Dag Sverre Seljebotn d.s.seljebotn@astro.uio...
Sun Oct 7 01:49:07 CDT 2012


On 10/07/2012 08:48 AM, Dag Sverre Seljebotn wrote:
> On 10/07/2012 08:41 AM, Jianbao Tao wrote:
>> Hi,
>>
>> I am developing a Python wrapper of the NASA CDF C library in Cython. I
>> have got a working version of it now, but it is slower than the
>> counterpart in IDL. For loading the same file, mine takes about 400 ms,
>> whereas the IDL version takes about 290 ms.
>>
>> The main overhead in my code is caused by a for-loop of
>> element-by-element copying. Here is the relevant code in cython:
>> #-------------------------------- code --------------------------------
>>      #-- double
>>          realData = numpy.zeros(lenData, np_dtype)
>>
>>          dblEntry = <double *>malloc(lenData * sizeof(double))
>>          status = CDFlib(
>>                         SELECT_, zVAR_RECCOUNT_, numRecs,
>>                         NULL_)
>>          status = CDFlib(
>>                         GET_, zVAR_HYPERDATA_, dblEntry,
>>                         NULL_)
>>          for ii in range(lenData):
>>              realData[ii] = dblEntry[ii]
>>          realData.shape = np_shape
>>          free(dblEntry)
>
> You don't say what np_dtype is here (or the Cython variable declaration
> for it).
>
> Assuming it is np.double and "cdef np.ndarray[double] realData", what
> you should do is simply pass the buffer of realData to the CDFlib function:
>
> status = CDFlib(GET_, ..., &realData[0], NULL_)
>
> Then there's no need for copying.
>
> This is what you should do in any case. If the dtype is different, let
> CDFlib fill an array of its native type and leave the conversion to the
> "astype" method (but then comparisons with IDL should take the dtype
> conversion into account).
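
As a rough illustration of the no-copy approach quoted above, here is a minimal sketch in plain Python with ctypes rather than Cython, so it runs without a compile step. The function fill_doubles is a hypothetical stand-in for the CDFlib GET_ call (it is not part of the CDF library), and the sizes and shape are made up for the example.

```python
import ctypes
import numpy as np


def fill_doubles(dest_ptr, count):
    """Hypothetical stand-in for a C routine that writes `count` doubles to dest_ptr."""
    source = (ctypes.c_double * count)(*[0.5 * i for i in range(count)])
    ctypes.memmove(dest_ptr, source, count * ctypes.sizeof(ctypes.c_double))


len_data = 6          # assumed element count
np_shape = (2, 3)     # assumed final shape

# Allocate the result once; np.empty skips the zero-fill that np.zeros does.
real_data = np.empty(len_data, dtype=np.float64)

# Hand the array's own buffer to the "C" routine, so no intermediate
# malloc'd block and no element-by-element loop are needed.
fill_doubles(real_data.ctypes.data_as(ctypes.POINTER(ctypes.c_double)), len_data)
real_data.shape = np_shape

# If a different dtype is wanted, convert afterwards (this step does copy).
as_float32 = real_data.astype(np.float32)
```

In Cython the same idea is the &realData[0] call shown in the quoted reply; the array must be contiguous and of the dtype the C library writes.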

To really answer your question (though in this case you should use a 
different approach), what you should use to copy data efficiently is the 
C memcpy function.
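
For the case where the data really does sit in a separate C buffer, a block copy might look like the sketch below. It is plain Python, using ctypes.memmove as a stand-in for C memcpy, and c_buffer is a hypothetical substitute for the malloc'd dblEntry array. In Cython, the corresponding call would be along the lines of memcpy(&realData[0], dblEntry, lenData * sizeof(double)), with memcpy cimported from libc.string.

```python
import ctypes
import numpy as np

len_data = 5  # assumed element count

# Hypothetical stand-in for the malloc'd dblEntry buffer filled by the C library.
c_buffer = (ctypes.c_double * len_data)(1.0, 2.0, 3.0, 4.0, 5.0)

real_data = np.empty(len_data, dtype=np.float64)
n_bytes = len_data * ctypes.sizeof(ctypes.c_double)

# A raw byte copy is only valid if the destination is contiguous and
# exactly the same size as the source.
if not (real_data.flags["C_CONTIGUOUS"] and real_data.nbytes == n_bytes):
    raise ValueError("destination must be contiguous and match the source size")

# One block copy replaces the per-element Python loop.
ctypes.memmove(real_data.ctypes.data, c_buffer, n_bytes)
```

The copy is byte-for-byte, so it does no dtype conversion; source and destination must share the same element type.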

Dag Sverre

