[Numpy-discussion] bad generator behaviour with sum
oliphant.travis at ieee.org
Mon Aug 28 01:17:59 CDT 2006
Tom Denniston wrote:
> I was thinking about this in the context of Guido's comments at SciPy
> 2006 that much of the language is moving away from lists toward
> iterators. He gave the keys of a dict as an example.
> Numpy treats iterators, generators, etc. as 0x0 PyObjects rather than
> lazy generators of n-dimensional data. I guess my question for Travis
> (or any others much more expert than I in numpy) is: is this intentional,
> or is it something that was never implemented because of the obvious
> subtleties of defining the correct semantics to make this work?
It's not intentional, it's just that iterators came later and I did not
try to figure out how to "do the right thing" in the array function.
Thanks to Tim Hochberg, there is a separate fromiter function that
creates arrays from iterators.
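
As a minimal sketch of the difference (the variable names are only illustrative, and this assumes a NumPy recent enough to provide fromiter): passing a generator to array wraps the generator object itself, while fromiter consumes it into a 1-d array of the dtype you name.

```python
import numpy as np

# array() does not iterate the generator; it stores the generator
# object itself in a 0-dimensional object array.
wrapped = np.array(x * x for x in range(5))
print(wrapped.shape, wrapped.dtype)

# fromiter() consumes the iterator and builds a 1-d array.
# The dtype must be given explicitly because it cannot be inferred
# without first consuming the iterator.
squares = np.fromiter((x * x for x in range(5)), dtype=float)
print(squares.shape, squares.dtype)
```

Note that fromiter only builds 1-d arrays of the requested type.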
> Personally I find it no big deal to use array(list(iter)) in the 1d
> case and the list function combined with a list comprehension for the
> 2d case. I usually know how many dimensions I expect, so I find this
> easy, and I know about this peculiar behavior. I find, however, that
> this behavior is very surprising and confusing to the new user, and I
> don't usually have a good justification for it to answer them.
The problem is that NumPy arrays need to know both how big they are and
what data-type they are. With iterators you basically have to construct
the whole thing before you can even answer those questions.
Iterators were not part of the language when Numeric (from which NumPy
got its code base) was created.
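
To make the size and data-type point concrete, here is a rough sketch of the workarounds mentioned in this thread. The rows() helper is hypothetical and stands in for any iterator of iterators. Each level is materialized into a list so that array can inspect the shape and element type, and in the 1-d case the count argument of fromiter lets the result be allocated once when the length is known in advance.

```python
import numpy as np

def rows():
    # Hypothetical iterator of iterators: 3 rows of 4 integers each.
    for i in range(3):
        yield (i * 10 + j for j in range(4))

# 2-d case: materialize every level into lists so that array() can
# discover the shape and the element type.
matrix = np.array([list(row) for row in rows()])
print(matrix.shape)

# 1-d case with a known length: count tells fromiter how many items
# to read, so the output can be allocated up front.
values = np.fromiter((i * 0.5 for i in range(6)), dtype=np.float64, count=6)
print(values)
```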
> The ideal semantics, in my mind, would be if an iterator of iterators
> of iterators, etc. was no different in numpy than a list of lists of
> lists, etc. But I have no doubt that there are subtleties I am not
> considering. Has anyone more familiar than I with the bowels of numpy
> thought about this problem and seen reasons why this is a bad idea or
> just prohibitively difficult to implement?
It's been discussed before and ideas have been considered. Right now,
the fromiter function carries the load. Whether or not to bring that
functionality into the array function itself has been met with hesitancy
because of how bulky the array function already is.