[Numpy-discussion] Efficient reading of binary data

Robert Kern robert.kern@gmail....
Thu Apr 3 19:00:34 CDT 2008

On Thu, Apr 3, 2008 at 6:53 PM, Nicolas Bigaouette
<nbigaouette@gmail.com> wrote:
> Thanx for the fast response Robert ;)
> I changed my code to use the slice:
>   E = data[6::9]
> It is indeed faster and eats less memory. Great.
> Thanx for the endianness! I knew there was something like this ;) I suspect
> that, in '>f8', "f" means float and "8" means 8 bytes?

Yes, and the '>' means big-endian. '<' is little-endian, and '=' is
the native byte order of whatever machine you are running on.
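
As a small illustration of the byte-order prefixes, the sketch below decodes the same eight bytes with '>f8' and '<f8'. The byte string is made up for the example (it is the big-endian IEEE 754 encoding of 1.0), and the variable names are only illustrative.

```python
import numpy as np

# Eight raw bytes: 1.0 encoded as a big-endian IEEE 754 double
# (0x3FF0000000000000). Chosen for illustration only.
raw = b"\x3f\xf0\x00\x00\x00\x00\x00\x00"

big = np.frombuffer(raw, dtype=">f8")     # interpret as big-endian
little = np.frombuffer(raw, dtype="<f8")  # same bytes, little-endian

print(big[0])     # the intended value
print(little[0])  # a very different number: wrong byte order

# The "8" in 'f8' is the item size in bytes.
print(np.dtype(">f8").itemsize)
```

Reading with the wrong prefix does not raise an error; it silently produces wrong values, so it is worth checking a known value from the file.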

> From some benchmarks, I see that the slowest thing is disk access. It can
> slow the displaying of data from around 1sec (when data is in os cache or
> buffer) to 8sec.
> So the next step would be to only read the needed data from the binary
> file... Is it possible to read from a file with a slice? So instead of:
> data = numpy.fromfile(file=f, dtype=float_dtype, count=9*Stot)
> E = data[6::9]
> maybe something like:
> E = numpy.fromfile(file=f, dtype=float_dtype, count=9*Stot, slice=6::9)

Instead of reading using fromfile(), you can try memory-mapping the array.

  from numpy import memmap
  E = memmap(f, dtype=float_dtype, mode='r')[6::9]

That may or may not help. At least, it should decrease the latency
before you start pulling out frames.
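
A minimal, self-contained sketch of the memory-mapping approach, assuming big-endian doubles and nine values per record as in the thread. The file name, the record count (standing in for Stot), and the test data are hypothetical.

```python
import os
import tempfile
import numpy as np

float_dtype = np.dtype(">f8")  # assumed: big-endian 8-byte floats
n_records = 1000               # hypothetical stand-in for Stot

# Write a small test file with 9 values per record.
values = np.arange(9 * n_records, dtype=float_dtype)
path = os.path.join(tempfile.mkdtemp(), "records.bin")
values.tofile(path)

# Map the file read-only. Nothing is read from disk until
# elements are actually accessed.
mapped = np.memmap(path, dtype=float_dtype, mode="r")
E_view = mapped[6::9]  # strided view, still backed by the file
E = np.array(E_view)   # copy only the selected values into memory

# Same result as reading everything with fromfile() and slicing.
full = np.fromfile(path, dtype=float_dtype, count=9 * n_records)
print(np.array_equal(E, full[6::9]))
```

Keep in mind that the operating system reads mapped files a page at a time. With 72-byte records, every page still contains some of the selected values, so the total amount of disk I/O may not drop much even though the array in memory is smaller.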

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
