[Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

Daniel Lenski dlenski@gmail....
Wed Aug 13 23:40:16 CDT 2008


On Wed, 13 Aug 2008 22:11:07 -0400, Zachary Pincus wrote:
> Try profiling the code just to make sure that it is the list append
> that's slow, and not something else happening on that line, e.g..

From what you and others have pointed out, I'm pretty sure I must have 
been doing something else wrong, although my code wasn't in SVN yet, so 
I'm not sure exactly what.
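
For reference, a minimal sketch of the kind of profiling run suggested above, using the standard-library cProfile module. The parse_lines function and the generated input are made up for illustration; they are not my original code.

```python
import cProfile
import io
import pstats

def parse_lines(lines):
    """Hypothetical parser: split each line and append the floats to a list."""
    rows = []
    for line in lines:
        rows.append([float(f) for f in line.split()])
    return rows

# Synthetic input standing in for a real text file.
lines = ["%d %d %d" % (i, i + 1, i + 2) for i in range(50000)]

profiler = cProfile.Profile()
profiler.enable()
rows = parse_lines(lines)
profiler.disable()

# Collect the report in a string and show the entries with the
# largest cumulative time first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

If list.append really were the bottleneck, its entry would be expected to account for a large share of the total time in a report like this.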

> It appears to be the general consensus on this mailing list that the
> best solution when an expandable array is required is to append to a
> python list, and then once you've built it up completely, convert it to
> an array. So I'm at least surprised that this is turning out to be so
> slow for you... But if the profiler says that's where the trouble is,
> then so it is...

That does seem to be the standard idiom used by NumPy, such as in 
loadtxt.  And loadtxt is usually fast enough for me.
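
As a rough sketch of that idiom (not the actual loadtxt code), assuming a whitespace-separated file where only rows with the expected number of fields are kept. The read_rows name and the sample data are hypothetical.

```python
import io
import numpy as np

def read_rows(fileobj, ncols):
    """Append parsed rows to a Python list, then convert once at the end."""
    rows = []
    for line in fileobj:
        fields = line.split()
        if len(fields) != ncols:
            # Skip lines that don't fit the expected layout.
            continue
        rows.append([float(f) for f in fields])
    # Single conversion to a 2d array after the list is complete.
    return np.array(rows, dtype=float).reshape(-1, ncols)

# In-memory stand-in for a real file.
sample = io.StringIO("1 2 3\n4 5 6\n# comment\n7 8 9\n")
data = read_rows(sample, 3)
print(data.shape)
```

The point is that the array is allocated once, at the end, rather than being resized for every row.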

> Actually, my suggestion was to compare building up a list-of-lists and
> then converting that to a 2d array versus building up a list-of-arrays,
> and then converting that to a 2d array... one might wind up being faster
> or more memory-efficient than the other...

I assume that list-of-arrays is more memory-efficient since array 
elements don't have the overhead of full-blown Python objects.  But 
list-of-lists is probably more time-efficient since I think it's faster 
to convert the whole array at once than do it row-by-row.
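
Those are guesses, so here is a small sketch for timing the two approaches with timeit. The matrix size and function names are arbitrary choices for illustration, and the result will depend on row length, since each small array object carries its own fixed overhead.

```python
import timeit
import numpy as np

nrows, ncols = 2000, 20
# Synthetic "already split" text fields standing in for a parsed file.
text_rows = [[str(float(i * ncols + j)) for j in range(ncols)]
             for i in range(nrows)]

def via_list_of_lists():
    # Keep plain Python lists and convert everything in one call.
    rows = [[float(f) for f in fields] for fields in text_rows]
    return np.array(rows)

def via_list_of_arrays():
    # Convert each row to an array as it is read, then stack them.
    rows = [np.array([float(f) for f in fields]) for fields in text_rows]
    return np.array(rows)

# Both approaches should produce the same 2d array.
assert np.array_equal(via_list_of_lists(), via_list_of_arrays())

for func in (via_list_of_lists, via_list_of_arrays):
    seconds = timeit.timeit(func, number=5)
    print("%s: %.3f s for 5 runs" % (func.__name__, seconds))
```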

Dan


