[Numpy-discussion] How to start at line # x when using numpy.memmap

Jeremy Conlin jlconlin@gmail....
Fri Aug 19 10:26:02 CDT 2011


On Fri, Aug 19, 2011 at 9:23 AM, Warren Weckesser
<warren.weckesser@enthought.com> wrote:
>
>
> On Fri, Aug 19, 2011 at 10:09 AM, Jeremy Conlin <jlconlin@gmail.com> wrote:
>>
>> On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen <bpederse@gmail.com>
>> wrote:
>> > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin <jlconlin@gmail.com>
>> > wrote:
>> >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen <pav@iki.fi> wrote:
>> >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote:
>> >>>> I would like to use numpy's memmap on some data files I have. The first
>> >>>> 12 or so lines of the files contain text (header information) and the
>> >>>> remainder has the numerical data. Is there a way I can tell memmap to
>> >>>> skip a specified number of lines instead of a number of bytes?
>> >>>
>> >>> First use standard Python I/O functions to determine the number of
>> >>> bytes to skip at the beginning and the number of data items. Then pass
>> >>> in `offset` and `shape` parameters to numpy.memmap.
>> >>
>> >> Thanks for that suggestion. However, I'm unfamiliar with the I/O
>> >> functions you are referring to. Can you point me to the
>> >> documentation?
>> >>
>> >> Thanks again,
>> >> Jeremy
>> >> _______________________________________________
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion@scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>
>> >
>> > this might get you started:
>> >
>> >
>> > import numpy as np
>> >
>> > # make some fake data: 12 header lines followed by raw binary values.
>> > # binary mode keeps the newlines and the array bytes exactly as written.
>> > with open('test.mm', 'wb') as fhw:
>> >     fhw.write(b"header\n" * 12)
>> >     np.arange(100, dtype=np.uint).tofile(fhw)
>> >
>> > # use normal python io to determine the byte offset after 12 lines.
>> > with open('test.mm', 'rb') as fhr:
>> >     for i in range(12):
>> >         fhr.readline()
>> >     offset = fhr.tell()
>> >
>> > # use the offset in your call to np.memmap.
>> > a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset)
>>
>> Thanks, that looks good. I tried it, but it doesn't return the correct
>> data, and I don't understand why. A short script and sample data are
>> attached if anyone has a chance to look at them.
>
>
> Your data file is all text.  memmap is generally for binary data; it won't
> work with this file.
>
> Warren

Yikes! I missed the "binary" in the first line of the documentation. Sorry!
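
For a file that is text throughout, a text reader that skips lines by count is probably the closer fit. The following is only a minimal sketch: the file name, the 12-line header, and the two whitespace-separated numeric columns are made-up stand-ins, not the attached data. It uses `numpy.loadtxt` with `skiprows`, then optionally saves the parsed array as a binary `.npy` file so it can be memory-mapped with `numpy.load(..., mmap_mode='r')`.

```python
import os
import tempfile

import numpy as np

# Build a small stand-in for the text data file: 12 header lines, then
# whitespace-separated numbers (hypothetical layout, not the attached file).
workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "sample.txt")
with open(text_path, "w") as fh:
    fh.write("header\n" * 12)
    for i in range(5):
        fh.write("%d %f\n" % (i, i * 0.5))

# Parse the text, skipping the header by line count rather than byte count.
data = np.loadtxt(text_path, skiprows=12)

# Optional: store the parsed array in binary .npy form and memory-map that.
npy_path = os.path.join(workdir, "sample.npy")
np.save(npy_path, data)
mapped = np.load(npy_path, mmap_mode="r")
```

The conversion step only pays off if the data is read many times or is too large to hold in memory comfortably; for a one-off read, `loadtxt` alone should be enough.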

Jeremy
