[Numpy-discussion] memory-mapped Numeric arrays: arrayfrombuffer version 2
mathew at fugue.jpl.nasa.gov
Mon Feb 18 12:26:58 CST 2002
Has anyone checked out VMaps at http://snafu.freedom.org/Vmaps/ ??
This might be what you're looking for.
> (I thought I had sent this mail on January 30, but I guess I was
> Eric Nodwell writes:
> > Since I have a 2.4GB data file handy, I thought I'd try this
> > package with it. (Normally I process this data file by reading
> > it in a chunk at a time, which is perfectly adequate.) Not
> > surprisingly, it chokes:
> Yep, that's pretty much what I expected. I think that adding code to
> support mapping some arbitrary part of a file should be fairly
> straightforward --- do you want to run the tests if I write the code?
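[Note: a minimal sketch of mapping an arbitrary part of a file, under stated assumptions. It assumes a modern Python 3, whose mmap.mmap() accepts an `offset` argument; the Python 2.2 discussed in this thread did not have it. The helper name `map_window` and the scratch file are hypothetical, and this is not the code from maparray.py.]

```python
import mmap
import os
import tempfile


def map_window(path, start, nbytes):
    """Map `nbytes` of the file at `path`, beginning at byte offset `start`.

    Returns (mapping, view); `view` covers exactly the requested bytes.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (start // gran) * gran   # mmap offsets must be granularity-aligned
    lead = start - aligned             # extra bytes mapped in front of `start`
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), lead + nbytes,
                      access=mmap.ACCESS_READ, offset=aligned)
    view = memoryview(m)[lead:lead + nbytes]
    return m, view


# Demo on a small scratch file; real use would target a much larger file.
gran = mmap.ALLOCATIONGRANULARITY
data = bytes(range(256)) * ((2 * gran) // 256 + 1)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)

start = gran + 10
m, view = map_window(tmp.name, start, 16)
window = bytes(view)   # copy the bytes out before tearing the mapping down
view.release()         # the view must be released before the mmap can close
m.close()
os.remove(tmp.name)
```

Only the requested window (plus alignment padding) is mapped, so the mapped length stays small however large the file is.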
> > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15,
> > in maparray
> > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE])
> > OverflowError: memory mapped size is too large (limited by C int)
> This error message's wording led me to something that was *not* what I
> That's a sort of alarming message --- it suggests that it won't work
> on >2G files even on LP64 systems, where longs and pointers are 64
> bits but ints are 32 bits. The comments in the mmap module say:
> The map size is restricted to [0, INT_MAX] because this is the current
> Python limitation on object sizes. Although the mmap object *could* handle
> a larger map size, there is no point because all the useful operations
> (len(), slicing(), sequence indexing) are limited by a C int.
> Horrifyingly, this is true. Even the buffer interface function
> arrayfrombuffer uses to get the size of the buffer returns int sizes,
> not size_t sizes. This is a serious bug in the buffer interface, IMO,
> and I doubt it will be fixed --- the buffer interface is apparently
> due for a revamp soon at any rate, so little changes won't be
> welcomed, especially if they break binary backwards compatibility, as
> this one would on LP64 platforms.
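[Note: a small sketch of the size limit described above. It uses ctypes to report the local C type widths and checks whether a byte count fits in a C int. The helper name `fits_in_c_int` is hypothetical; the check only mirrors the [0, INT_MAX] restriction quoted from the mmap module comments and is not taken from the mmap source.]

```python
import ctypes

# Widths of the C types on the platform running this script. On LP64
# systems long and pointers are 64 bits while int is 32 bits.
INT_BITS = ctypes.sizeof(ctypes.c_int) * 8
LONG_BITS = ctypes.sizeof(ctypes.c_long) * 8
POINTER_BITS = ctypes.sizeof(ctypes.c_void_p) * 8

# Largest value a signed C int can hold on this platform.
INT_MAX = 2 ** (INT_BITS - 1) - 1


def fits_in_c_int(nbytes):
    """Return True if a size of `nbytes` lies in [0, INT_MAX]."""
    return 0 <= nbytes <= INT_MAX


print("int:", INT_BITS, "long:", LONG_BITS, "pointer:", POINTER_BITS)
print("2.4e9 bytes fits in a C int:", fits_in_c_int(2_400_000_000))
```

With a 32-bit int, INT_MAX is 2**31 - 1, just under 2 GiB, which is why a 2.4GB file is rejected no matter how wide pointers are.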
> Fixing this, so that LP64 Pythons can mmap >2G files (their
> birthright!), is a bit of work --- probably a matter of writing a
> modified mmap() module that supports a saner version of the buffer
> interface (with named methods instead of a type object slot), and
> can't be close()d, to boot.
> Until then, this module only lets you memory-map files up to two gigs.
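[Note: for files over the limit, the chunk-at-a-time approach Eric mentions still applies. This is a minimal sketch of that idea, not his actual code. The function name `iter_chunks`, the chunk size, and the scratch file are assumptions made for illustration.]

```python
import os
import tempfile


def iter_chunks(path, chunk_bytes=1 << 20):
    """Yield successive blocks of at most `chunk_bytes` bytes from `path`."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:          # empty read means end of file
                break
            yield chunk


# Demo: a scratch file two and a half chunks long.
chunk_bytes = 4096
payload = b"\x01" * (2 * chunk_bytes + chunk_bytes // 2)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)

sizes = [len(c) for c in iter_chunks(tmp.name, chunk_bytes)]
total = sum(sizes)
os.remove(tmp.name)
```

Memory use is bounded by the chunk size, so the file size never has to fit in a C int.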
> > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux
> > kernel 2.4.13, gcc 2.95.4)
> My kernel is 2.4.13 too, but I don't have any large files, and I don't
> know whether any of my kernel, my libc, or my Python even support
> > I'm not a big C programmer, but I wonder if there is some way for
> > this package to overcome the 2GB limit on 32-bit systems. That
> > could be useful in some situations.
> I don't know, but I think it would probably require extensive code
> changes throughout Numpy.
> <kragen at pobox.com> Kragen Sitaker <http://www.pobox.com/~kragen/>
> The sages do not believe that making no mistakes is a blessing. They believe,
> rather, that the great virtue of man lies in his ability to correct his
> mistakes and continually make a new man of himself. -- Wang Yang-Ming