[Numpy-discussion] Loading a > GB file into array
Fri Dec 21 07:14:25 CST 2007
On Friday, 21 December 2007 13:23:49, David Cournapeau wrote:
> > Instead of saying "memmap is ALL about disk access" I would rather
> > like to say that "memmap is all about SMART disk access" -- what I mean
> > is that memmap should run as fast as a normal ndarray if it works on
> > the cached part of an array. Maybe there is a way of telling memmap
> > when and what to cache and when to sync that cache to the disk.
> > In other words, memmap should perform just like an in-physical-memory
> > array -- only that once in a while it saves/loads to/from the disk.
> > Or is this just wishful thinking?
> > Is there a way of "preloading" a given part into cache
> > (physical memory) or of preventing disk writes at "bad times"?
> > How about doing the sync from a different thread ;-)
> mmap is using the OS IO caches, that's kind of the point of using mmap
> (at least in this case). Instead of doing the caching yourself, the OS
> does it for you, and the OS is supposed to be smart about this :)
AFAICS this is what Sebastian wanted to say, but as the OP indicated,
preloading, e.g. by reading the whole array once, did not work for him.
Thus, I understand Sebastian's questions as "is it possible to help the OS
when it is not smart enough?". Maybe something along the lines of mlock,
only not quite as aggressive.
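
To make "helping the OS" a bit more concrete, here is a minimal sketch of a
hint that is gentler than mlock: madvise with MADV_WILLNEED, which only asks
the kernel to read pages ahead and does not pin them. It assumes Python 3.8
or later on a platform that provides madvise, and it uses the stdlib mmap
module rather than numpy's memmap to stay self-contained. The helper names
advise_willneed and demo, and the scratch file, are made up for illustration.
The kernel is free to ignore the hint, so this may well not fix the OP's case.

```python
import mmap
import os
import tempfile


def advise_willneed(mm, offset, length):
    """Hint that bytes [offset, offset + length) will be needed soon.

    Returns True if the hint was issued, False if madvise is unavailable.
    """
    if not (hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED")):
        return False
    # madvise wants a page-aligned start: round down and extend the length.
    page = mmap.PAGESIZE
    start = (offset // page) * page
    mm.madvise(mmap.MADV_WILLNEED, start, length + (offset - start))
    return True


def demo():
    # Hypothetical scratch file standing in for the large data file.
    size = 4 * mmap.PAGESIZE
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x01" * size)
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mm:
            lo = mmap.PAGESIZE + 10
            hinted = advise_willneed(mm, lo, 100)
            # Touch the hinted range; it reads the same with or without the hint.
            total = sum(mm[lo:lo + 100])
        return hinted, total
    finally:
        os.close(fd)
        os.remove(path)
```

The hint is purely advisory: the data read back is identical either way, and
only the timing of the disk access may change.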
Ciao, / /
/ / ANS