[Numpy-discussion] How to copy data from a C array to a numpy array efficiently?

Jianbao Tao jianbao.tao@gmail....
Sun Oct 7 01:41:12 CDT 2012


Hi,

I am developing a Python wrapper of the NASA CDF C library in Cython. I
have got a working version of it now, but it is slower than the counterpart
in IDL. For loading the same file, mine takes about 400 ms, whereas the IDL
version takes about 290 ms.

The main overhead in my code comes from a for-loop that copies the data
element by element. Here is the relevant code in Cython:
#-------------------------------- code --------------------------------
        #-- double
        realData = numpy.zeros(lenData, np_dtype)

        dblEntry = <double *>malloc(lenData * sizeof(double))
        status = CDFlib(
                       SELECT_, zVAR_RECCOUNT_, numRecs,
                       NULL_)
        status = CDFlib(
                       GET_, zVAR_HYPERDATA_, dblEntry,
                       NULL_)
        for ii in range(lenData):
            realData[ii] = dblEntry[ii]
        realData.shape = np_shape
        free(dblEntry)
#---------------------------- end of code -----------------------------
The time-consuming part is the for-loop over range(lenData). If I change
range(lenData) to range(lenData/2), the CPU time drops from 400 ms to
230 ms for the case I mentioned above. Because the element-by-element
copying loop seems pretty naive to me, I am wondering whether there is a
better way to copy data from the C array, dblEntry, to the numpy array,
realData.
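
For comparison, here is a minimal sketch of the two copying strategies in
plain Python rather than Cython. It is only an illustration under stated
assumptions: c_buf is a hypothetical ctypes array standing in for dblEntry,
and the helper names copy_elementwise and copy_bulk are made up for this
example. The bulk version moves all the bytes in one call; in Cython the
analogous call would be memcpy from libc.string.

```python
import ctypes
import numpy as np

def copy_elementwise(c_buf, n):
    """Naive copy: one Python-level assignment per element."""
    out = np.zeros(n, dtype=np.float64)
    for i in range(n):
        out[i] = c_buf[i]
    return out

def copy_bulk(c_buf, n):
    """Bulk copy: a single memmove of n * sizeof(double) bytes."""
    out = np.empty(n, dtype=np.float64)
    ctypes.memmove(out.ctypes.data, c_buf, n * ctypes.sizeof(ctypes.c_double))
    return out

# Hypothetical stand-in for the buffer that CDFlib fills in the real code.
n = 1000
c_buf = (ctypes.c_double * n)(*[0.5 * i for i in range(n)])

a = copy_elementwise(c_buf, n)
b = copy_bulk(c_buf, n)
```

Both functions return an array that owns its memory, so the C buffer can be
released afterwards. Another option, not shown here, would be to allocate
the numpy array first and let the library write straight into its data
buffer, which avoids the copy entirely.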

I tried the numpy C API function PyArray_NewFromDescr with the flag
NPY_ENSURECOPY, but had no luck. First, the flag did not seem to work as I
expected: I got memory deallocation error messages when I quit ipython,
where I tested my code, which I don't get with the naive for-loop. Second,
I can't figure out how to use PyArray_NewFromDescr correctly, because the
loaded data were not correct. Anyway, here is how I used
PyArray_NewFromDescr:
#-------------------------------- code --------------------------------
        cdef np.npy_intp dims[1]
        dims[0] = lenData
        realData = PyArray_NewFromDescr(numpy.ndarray, numpy.dtype(np_dtype),
                                        1, dims, NULL, <void *>dblEntry,
                                        NPY_CARRAY|NPY_ENSURECOPY, None)
        free(dblEntry)
#---------------------------- end of code -----------------------------
BTW, it compiles successfully with Cython, in case you are wondering
whether the code has all the necessary pieces.
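
One possible explanation for the wrong data, offered as an assumption
rather than a diagnosis: an array built on top of an existing buffer is a
view that does not own that memory, so it becomes invalid once the buffer
is freed. The sketch below shows the difference between wrapping and
copying in plain Python. Here c_buf is a hypothetical ctypes stand-in for
dblEntry, and overwriting it stands in for the memory being freed or
reused.

```python
import ctypes
import numpy as np

n = 4
# Hypothetical stand-in for dblEntry.
c_buf = (ctypes.c_double * n)(1.0, 2.0, 3.0, 4.0)

view = np.ctypeslib.as_array(c_buf)  # wraps the buffer, no copy is made
owned = view.copy()                  # independent copy with its own memory

# Stands in for the buffer being reused or freed after the array is built.
c_buf[0] = -1.0

print(view[0])   # follows the buffer
print(owned[0])  # unaffected by the change
```

Only the copied array is safe to keep after the original buffer goes away.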

Thank you very much for reading. :-)

Cheers,
Jianbao