[Numpy-discussion] load from text files Pull Request Review

Chris.Barker Chris.Barker@noaa....
Tue Sep 13 13:41:46 CDT 2011


On 9/12/11 4:38 PM, Christopher Jordan-Squire wrote:
> I did some timings to see what the advantage would be, in the simplest
> case possible, of taking multiple lines from the file to process at a
> time.

Nice work. Only one minor comment:
> f6 and f7 use stripped down versions of Chris
> Barker's accumulator idea. The difference is that f6 uses resize when
> expanding the array while f7 uses np.empty followed by np.append. This
> avoids the penalty from copying data that np.resize imposes.

I don't think it does:

"""
In [3]: np.append?
----------
arr : array_like
     Values are appended to a copy of this array.

Returns
-------
out : ndarray
     A copy of `arr` with `values` appended to `axis`.  Note that `append`
     does not occur in-place: a new array is allocated and filled.
"""

There is no getting around the copying with np.append. However, I think
ndarray.resize() uses the OS memory reallocation call (realloc), which may,
in some instances, have over-allocated the memory in the first place, and
thus not require a copy. So I'm pretty sure ndarray.resize is as good as it
gets.
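
To make the distinction concrete, here is a minimal sketch of the two calls side by side. The array contents are made up for illustration, and refcheck=False is only there so the snippet also runs in interactive sessions that hold extra references to the array.

```python
import numpy as np

# np.append: the result is a brand-new array; the input is left untouched.
a = np.arange(4)
b = np.append(a, [4, 5])
print(b is a)                    # a different object
print(np.shares_memory(a, b))    # no shared buffer, so the data was copied
print(a.shape, b.shape)

# ndarray.resize: the same array object is grown in place.
# Whether the underlying buffer has to move is up to the allocator.
c = np.arange(4)
c.resize(6, refcheck=False)      # new slots are zero-filled
print(c)
```

Note that this only shows the semantics (new array vs. same array); it says nothing about which one is faster for a given growth pattern.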

> f6 : 3.26ms
> f7 : 2.77ms (Apparently it's a lot cheaper to do np.empty followed by
> append than to do resize)

Darn that profiling, proving my expectations wrong again! Though I'm
really confused as to how that could be!
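
One way to dig into a surprise like this is to time the two growth strategies in isolation. The sketch below is not the f6/f7 code from the quoted message (that code is not shown here); fill_with_resize, fill_with_append, the chunk size, and the test data are all hypothetical stand-ins, so the numbers it prints may not match the ones above.

```python
import timeit
import numpy as np

def fill_with_resize(values, chunk=128):
    """Hypothetical accumulator that grows its buffer with ndarray.resize."""
    buf = np.empty(chunk, dtype=float)
    n = 0
    for v in values:
        if n == buf.shape[0]:
            # Grow in place by one chunk.
            buf.resize(buf.shape[0] + chunk, refcheck=False)
        buf[n] = v
        n += 1
    buf.resize(n, refcheck=False)   # trim the unused tail
    return buf

def fill_with_append(values, chunk=128):
    """Hypothetical accumulator that grows with np.empty + np.append."""
    buf = np.empty(chunk, dtype=float)
    n = 0
    for v in values:
        if n == buf.shape[0]:
            # np.append allocates a new array and copies the old data over.
            buf = np.append(buf, np.empty(chunk, dtype=float))
        buf[n] = v
        n += 1
    return buf[:n].copy()

data = list(range(10_000))
for fn in (fill_with_resize, fill_with_append):
    t = timeit.timeit(lambda: fn(data), number=20)
    print(f"{fn.__name__}: {t / 20 * 1e3:.2f} ms per call")
```

With a per-element Python loop like this, the loop overhead may well dominate, so varying the chunk size and the data length is probably needed before drawing conclusions about resize vs. append.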

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

