[Numpy-discussion] numpy.array does not take generators

Timothy Hochberg tim.hochberg@ieee....
Fri Aug 17 19:00:24 CDT 2007


On 8/17/07, Barry Wark <barrywark@gmail.com> wrote:
>
> Is there a reason not to add an argument to fromiter that specifies
> the final size of the n-d array? Reading this discussion, I realized
> that there are several places in my code where I create 2-D arrays
> like this:
>
> arr = N.array([d.data() for d in list_of_data_containers]),
>
> where d.data() returns a buffer object.
>
> I would guess that this paradigm causes lots of memory copying. The
> more efficient solution, I think, would be to preallocate the array
> and then assign each row in a loop. It's so much clearer this way,
> however, that I've kept it as is in the code.
>
> So, what if I could do something like
>
> arr = N.fromiter(d.data() for d in list_of_data_containers, shape=(x,y)),


I don't know that there's any theoretical problem in terms of doing
something like this. There are a couple of practical issues though. One is
that it would significantly increase the implementation complexity of
fromiter, which right now is about as simple as it can reasonably be.
Someone would need to step forward and write and test the code. The second
issue is with the interface. The interface that you propose isn't really
right. The current interface is:

   fromiter(iterable, dtype, count=-1)

where count indicates how many items to extract from the iterable (-1
iterates until the iterable is exhausted). 'shape' as you propose would
couple to this in an unnatural way. Adding another keyword argument that
indicates just the shape of the elements would make more sense, but it
starts to seem a bit clunky.

  fromiter(iterable, dtype, count=-1, itemshape=())
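
For reference, here is a small sketch of how the existing 1-D interface
behaves with and without count. The variable names are only illustrative,
and the itemshape keyword above is a proposal, so it does not appear here.

```python
import numpy

# With count given, fromiter can allocate the result once up front.
squares = (i * i for i in range(5))
a = numpy.fromiter(squares, dtype=float, count=5)

# With the default count=-1, fromiter consumes the iterable until it is
# exhausted and has to grow the result as it goes.
b = numpy.fromiter((i * i for i in range(5)), dtype=float)

# Both results are 1-D arrays with the same contents.
print(a.shape, b.shape)
```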

For this particular application, there doesn't seem to be any problem simply
defining yourself a little utility function to do this for you.

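import numpy

def from_shaped_iter(iterable, dtype, shape):
    # Preallocate once, then fill row by row; a[i] = x copies x into row i.
    # Rows are left uninitialized if the iterable yields fewer than shape[0] items.
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x
    return a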

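I expect this would have decent performance if the y dimension is reasonably
large, since the per-row overhead of the Python loop is then small compared
to the cost of copying each row.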
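
If you also want the contract you describe (an exception when a row has the
wrong size or when there are too many rows), a sketch along these lines
might do. The name from_shaped_iter_checked is made up for illustration and
is not part of numpy, and I am assuming each item can be converted with
numpy.asarray; a raw buffer object would need its own conversion step.

```python
import numpy

def from_shaped_iter_checked(iterable, dtype, shape):
    # Hypothetical helper, not a numpy function.
    a = numpy.empty(shape, dtype)
    n = 0
    for x in iterable:
        if n >= shape[0]:
            raise ValueError("iterable yielded more than %d rows" % shape[0])
        # Assumption: each item is a sequence or array, not a raw buffer.
        row = numpy.asarray(x, dtype=dtype)
        if row.shape != a.shape[1:]:
            raise ValueError("row %d has shape %r, expected %r"
                             % (n, row.shape, a.shape[1:]))
        a[n] = row
        n += 1
    if n != shape[0]:
        # Without this check the trailing rows would hold uninitialized data.
        raise ValueError("iterable yielded only %d of %d rows" % (n, shape[0]))
    return a

# Usage: four rows of three values each, produced lazily.
rows = ([i, i + 1, i + 2] for i in range(4))
arr = from_shaped_iter_checked(rows, float, (4, 3))
```

The extra asarray call costs a temporary per row, so this trades a little
speed for the error checking.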

regards,


-tim

> with the contract that fromiter will throw an exception if any of the
> d.data() are not of size y or if there are more than x elements in
> list_of_data_containers?
>
> Just a thought for discussion.
>
> barry
>
> On 8/16/07, Robert Kern <robert.kern@gmail.com> wrote:
> > Geoffrey Zhu wrote:
> > > Hi All,
> > >
> > > I want to construct a numpy array based on Python objects. In the
> > > below code, opts is a list of tuples.
> > >
> > > For example,
> > >
> > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> > >
> > > If I use a generator like the following:
> > >
> > > K=numpy.array(o[2]/1000.0 for o in opts)
> > >
> > > It does not work.
> > >
> > > I have to use:
> > >
> > > numpy.array([o[2]/1000.0 for o in opts])
> > >
> > > Is this behavior intended?
> >
> > Yes. With arbitrary generators, there is no good way to do the kind of
> > mind-reading that numpy.array() usually does with sequences. It would have to
> > unroll the whole generator anyways. fromiter() works for this, but you are
> > restricted to 1-D arrays which is a lot easier to implement the mind-reading for.
> >
> > --
> > Robert Kern
> >
> > "I have come to believe that the whole world is an enigma, a harmless enigma
> >  that is made terrible by our own mad attempt to interpret it as though it had
> >  an underlying truth."
> >   -- Umberto Eco
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >



-- 
.  __
.   |-\
.
.  tim.hochberg@ieee.org