[Numpy-discussion] Large symmetrical matrix

Robert Kern robert.kern@gmail....
Wed Jun 11 15:34:12 CDT 2008


On Wed, Jun 11, 2008 at 15:14, Simon Palmer <simon.palmer@gmail.com> wrote:
> Actually that is very close to what I currently do, which is why I want to
> throw it away.
>
> The factor of two memory hit is not really the problem; it is the O(N^2)
> scaling which is the limitation.
>
> I suspect I may end up using the 1-d array plus arithmetic as it is at least
> efficient for retrieval.  I have just started working on a stack which sits
> between the vector array and a lookup for a minimum; I think that could
> improve performance a great deal.  Next up is a disk-based memory map of
> some kind which might free me from the constraints of physical memory.
> Does anyone know of an implementation of array() which automatically
> reads/writes to disk?

Well, if you have a decent OS, you should already be free of the
constraints of physical memory. The OS should swap data in and out as
necessary. But of course, we do have mmap support with numpy.memmap.
If you are using numpy 1.1.0, I would also recommend using
numpy.lib.format.open_memmap() to use the new NPY file format in order
to (hopefully) future-proof your data. Be careful when using mmap'ed
arrays, though. With scattered access patterns, disk seek time could
kill your performance.
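
As a rough sketch of how the two ideas might combine, the following keeps
the "1-d array plus arithmetic" layout in an NPY file mapped with
numpy.lib.format.open_memmap(). The helper packed_index(), the matrix
order n, and the file name are assumptions made up for illustration; they
are not part of numpy and not taken from the original code.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap


def packed_index(i, j):
    """Offset of element (i, j) in row-major packed lower-triangle storage.

    Hypothetical helper: it is not a numpy function.
    """
    if i < j:
        i, j = j, i  # symmetric: (i, j) and (j, i) share one slot
    return i * (i + 1) // 2 + j


n = 1000                 # hypothetical matrix order
size = n * (n + 1) // 2  # entries in the lower triangle, diagonal included
path = os.path.join(tempfile.mkdtemp(), "sym_packed.npy")  # hypothetical path

# mode="w+" creates the NPY file on disk and maps it into memory.
tri = open_memmap(path, mode="w+", dtype=np.float64, shape=(size,))
tri[packed_index(3, 7)] = 2.5
tri.flush()  # push pending changes out to the file
del tri

# Reopen read-only; dtype and shape are read back from the NPY header.
ro = open_memmap(path, mode="r")
value = ro[packed_index(7, 3)]  # same slot as (3, 7)
```

This still stores about N^2/2 values, so it only moves the O(N^2) cost
from RAM to disk. Lookups that jump around the triangle will touch
scattered pages of the file.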

If this is the big hotspot in your program, you might want to consider
coding up exactly the behavior you need in C. It doesn't appea

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

