[Numpy-discussion] How to start at line # x when using numpy.memmap

Brent Pedersen bpederse@gmail....
Fri Aug 19 09:01:06 CDT 2011

On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin <jlconlin@gmail.com> wrote:
> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen <pav@iki.fi> wrote:
>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote:
>>> I would like to use numpy's memmap on some data files I have. The first
>>> 12 or so lines of the files contain text (header information) and the
>>> remainder has the numerical data. Is there a way I can tell memmap to
>>> skip a specified number of lines instead of a number of bytes?
>> First use standard Python I/O functions to determine the number of
>> bytes to skip at the beginning and the number of data items. Then pass
>> in `offset` and `shape` parameters to numpy.memmap.
> Thanks for that suggestion. However, I'm unfamiliar with the I/O
> functions you are referring to. Can you point me to the
> documentation?
> Thanks again,
> Jeremy

This might get you started:

import numpy as np

# Make some fake data: 12 header lines followed by raw binary values.
# Binary mode lets the text header and the array bytes share one stream.
with open('test.mm', 'wb') as fhw:
    fhw.write(b"\n".join(b'header' for i in range(12)) + b"\n")
    np.arange(100, dtype=np.uint).tofile(fhw)

# Use normal Python I/O to determine the byte offset after 12 lines.
with open('test.mm', 'rb') as fhr:
    for i in range(12):
        fhr.readline()
    offset = fhr.tell()

# Use the offset in your call to np.memmap.
a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset)

assert np.array_equal(a, np.arange(100))
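
If you want to pass both `offset` and `shape`, as Pauli suggested, something along these lines might work. It is only a sketch: the helper names `header_offset` and `memmap_after_header` and the file name `demo.mm` are made up for illustration, and it assumes everything after the header is raw binary of a single dtype. If the data section is text, memmap won't help and np.loadtxt with `skiprows` is probably the better fit.

```python
import os
import tempfile

import numpy as np


def header_offset(path, n_lines):
    """Return the byte offset just past the first n_lines lines of path."""
    with open(path, 'rb') as fh:
        for _ in range(n_lines):
            if not fh.readline():
                raise ValueError("file has fewer than %d lines" % n_lines)
        return fh.tell()


def memmap_after_header(path, n_lines, dtype):
    """Memory-map the binary data that follows n_lines header lines."""
    offset = header_offset(path, n_lines)
    itemsize = np.dtype(dtype).itemsize
    # Number of whole items between the end of the header and end of file.
    n_items = (os.path.getsize(path) - offset) // itemsize
    return np.memmap(path, mode='r', dtype=dtype,
                     offset=offset, shape=(n_items,))


# Build a small demo file: 12 header lines, then 100 float64 values.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.mm')
with open(demo_path, 'wb') as fh:
    fh.write(b'header\n' * 12)
    np.arange(100, dtype=np.float64).tofile(fh)

data = memmap_after_header(demo_path, 12, np.float64)
print(data.shape, data[:3])
```

Computing the shape from the file size means a trailing partial item is silently ignored; pass an explicit shape instead if the header tells you how many values to expect.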
