[Numpy-discussion] reading big-endian uint16 into array on little-endian machine

Sturla Molden sturla@molden...
Fri Jun 18 06:15:29 CDT 2010


On 17.06.2010 16:29, greg whittier wrote:
> I have files (from an external source) that contain ~10 GB of
> big-endian uint16's that I need to read into a series of arrays.  What
> I'm doing now is
>
> import numpy as np
> import struct
>
> fd = open('file.raw', 'rb')
>
> for n in range(10000):
>      count = 1024*1024
>      a = np.array([struct.unpack('>H', fd.read(2)) for i in range(count)])
>      # do something with a
>
> It doesn't seem very efficient to call struct.unpack one element at a
> time, but struct doesn't have an unpack_farray version like xdrlib
> does.  I also thought of using the array module and .byteswap() but
> the help says it only works on 4 and 8 byte arrays.
>
> Any ideas?
>
>    

It's just a matter of swapping the bytes:

arr = np.fromfile(fd, dtype=np.uint16, count=count)  # 1D array of uint16
raw = arr.view(dtype=np.uint8)   # the same buffer, viewed as bytes
tmp = raw[::2].copy()
raw[::2] = raw[1::2]
raw[1::2] = tmp


Or like this:

arr = np.fromfile(fd, dtype=np.uint16, count=count)  # 1D array of uint16
arr = (arr >> 8) | (arr << 8)


The latter generates three temporary arrays; the first generates only one.
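
Putting it together with a reading loop like the one in the original post, a
minimal sketch (the file name, chunk size and use of np.fromfile are my
assumptions, just to make it self-contained):

import numpy as np

count = 1024*1024                     # uint16 values per chunk (assumed)

with open('file.raw', 'rb') as fd:    # assumed file name
    while True:
        arr = np.fromfile(fd, dtype=np.uint16, count=count)
        if arr.size == 0:
            break
        raw = arr.view(np.uint8)      # same buffer, viewed as bytes
        tmp = raw[::2].copy()         # the single temporary
        raw[::2] = raw[1::2]
        raw[1::2] = tmp
        # arr now holds native-order uint16 values; do something with arr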

You can avoid the temporary arrays altogether with a small C function:

/* Swap the two bytes of each uint16 in place. */
__declspec(dllexport)   /* MSVC export; adjust for other compilers */
void byteswap(unsigned short *arr, int n)
{
    for (int i=0; i<n; i++) {
        arr[i] = (arr[i] >> 8) | (arr[i] << 8);
    }
}
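
If you go that route, here is a sketch of the Python side using ctypes (the
library name and the wrapper code below are my assumptions, not part of the
C function above):

import ctypes
import numpy as np
from numpy.ctypeslib import ndpointer

lib = ctypes.CDLL('./byteswap.dll')   # assumed name; .so on Linux, .dylib on Mac
lib.byteswap.restype = None
lib.byteswap.argtypes = [ndpointer(dtype=np.uint16, flags='C_CONTIGUOUS'),
                         ctypes.c_int]

arr = np.fromfile('file.raw', dtype=np.uint16, count=1024*1024)
lib.byteswap(arr, arr.size)           # swaps each uint16 in place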



Sturla