[Numpy-discussion] reading *big* inhomogenous text matrices *fast*?
Wed Aug 13 23:40:16 CDT 2008
On Wed, 13 Aug 2008 22:11:07 -0400, Zachary Pincus wrote:
> Try profiling the code just to make sure that it is the list append
> that's slow, and not something else happening on that line, e.g.
From what you and others have pointed out, I'm pretty sure I must have
been doing something else wrong, although my code wasn't in SVN yet, so
I'm not sure exactly what.
> It appears to be the general consensus on this mailing list that the
> best solution when an expandable array is required is to append to a
> python list, and then once you've built it up completely, convert it to
> an array. So I'm at least surprised that this is turning out to be so
> slow for you... But if the profiler says that's where the trouble is,
> then so it is...
That does seem to be the standard idiom used by NumPy, such as in
loadtxt. And loadtxt is usually fast enough for me.
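The idiom being discussed can be sketched like this (file contents and parsing details here are illustrative, not from the original post): append each parsed row to a plain Python list, then convert to an array once at the end.

```python
import numpy as np

def read_matrix(lines):
    # Accumulate rows in a Python list; list.append is amortized O(1),
    # whereas growing an ndarray row-by-row would reallocate each time.
    rows = []
    for line in lines:
        rows.append([float(tok) for tok in line.split()])
    # Single conversion at the end, as loadtxt-style readers do.
    return np.array(rows)

data = read_matrix(["1 2 3", "4 5 6"])
```

This produces a (2, 3) float array; the single `np.array` call at the end copies the data into one contiguous block.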
> Actually, my suggestion was to compare building up a list-of-lists and
> then converting that to a 2d array versus building up a list-of-arrays,
> and then converting that to a 2d array... one might wind up being faster
> or more memory-efficient than the other...
I assume that list-of-arrays is more memory-efficient, since array
elements don't carry the overhead of full-blown Python objects. But
list-of-lists is probably more time-efficient, since I think it's faster
to convert the whole structure at once than to do it row by row.
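A minimal sketch of the two approaches being compared (the data here is made up for illustration): the same rows held as a list of Python lists versus a list of 1-D NumPy arrays, each converted to a single 2-D array. Both yield identical results; only the intermediate representation differs.

```python
import numpy as np

rows = [[1.0, 2.0], [3.0, 4.0]]

# Approach 1: list-of-lists, converted in one shot.
# Each row is a Python list of boxed float objects until the end.
as_lists = np.array(rows)

# Approach 2: list-of-arrays, stacked at the end.
# Each row is already a compact float64 buffer, so the intermediate
# storage is smaller, at the cost of one small-array creation per row.
as_arrays = np.vstack([np.array(r) for r in rows])
```

For real measurements one would time both with `timeit` on data of the actual size, as the profiling advice above suggests.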