[Numpy-discussion] fromiter

Tim Hochberg tim.hochberg at cox.net
Fri Jun 2 22:15:33 CDT 2006


Some time ago some people, myself included, were making some noise 
about having 'array' iterate over iterable objects, producing ndarrays in 
a manner analogous to the way sequences are treated. I finally got 
around to looking at it seriously and came to the following three 
conclusions:

   1. All I really care about is the 1D case where dtype is specified.
      This case should be relatively easy to implement so that it's
      fast. Most other cases are not likely to be particularly faster
      than converting the iterators to lists at the Python level and
      then passing those lists to array.
   2. 'array' already has plenty of special cases. I'm reluctant to add
      more.
   3. Adding this to 'array' would be non-trivial. The more cases we
      tried to make fast, the more likely that some of the paths would
      be buggy. Regardless of how we did it though, some cases would be
      much slower than others, which would probably be surprising.
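
For the cases outside the 1D one, the list-based fallback from point 1 might look roughly like the sketch below. The nested iterator 'rows' is a hypothetical example made up for illustration, and the snippet assumes numpy is importable as np.

```python
import numpy as np

def rows(n_rows, n_cols):
    """Hypothetical nested iterator: yields one generator per row."""
    for r in range(n_rows):
        yield (r * n_cols + c for c in range(n_cols))

# Materialize every level as a list at the Python level,
# then hand the nested lists to array() as usual.
nested = [list(row) for row in rows(2, 3)]
m = np.array(nested, dtype=float)
```

Here all the iteration happens in Python before 'array' ever sees the data, which is why a special fast path would probably not buy much for this kind of input.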

So, with that in mind, I retreated a little and implemented the simplest 
thing that did the stuff that I cared about:

    fromiter(iterable, dtype, count) => ndarray of type dtype and length
    count

This is essentially the same interface as fromstring except that the 
values of dtype and count are always required. Some primitive 
benchmarking indicates that 'fromiter(generator, dtype, count)' is about 
twice as fast as 'array(list(generator))' for medium to large arrays. 
When producing very large arrays, the advantage of fromiter is larger, 
presumably because 'list(generator)' causes things to start swapping.
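
As a rough illustration of the two routes being compared, here is a minimal sketch. It assumes the fromiter that ships in NumPy today (np.fromiter), whose details may differ from the version described in this message; the generator 'squares' is a made-up example.

```python
import numpy as np

def squares(n):
    """Yield the first n squares one at a time, without building a list."""
    for i in range(n):
        yield i * i

n = 10

# The 1D case with dtype and count given: the array is filled
# directly from the iterator.
a = np.fromiter(squares(n), dtype=np.int64, count=n)

# The list-based route: build a full Python list first, then convert.
b = np.array(list(squares(n)), dtype=np.int64)

# Both routes should produce the same array.
same = bool((a == b).all())
```

The difference is that the second route holds an intermediate list of Python objects in memory alongside the final array, which is the likely source of the swapping mentioned above for very large inputs.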

Anyway I'm about to bail out of town till the middle of next week, so 
it'll be a while till I can get it clean enough to submit in some form 
or another. Plenty of time for people to think of why it's a terrible 
idea ;-)

-tim





