[Numpy-discussion] How to avoid extra copying when forming an array from an iterator

Robert Kern robert.kern@gmail....
Fri Jun 24 15:07:44 CDT 2011


On Fri, Jun 24, 2011 at 14:38, srean <srean.list@gmail.com> wrote:
> A valiant exercise in hope:
>
> Is it possible to do this without a loop or extra copying? What I have is an
> iterator that yields a fixed-width string on every call to next(). Now I want
> to create a numpy array of ints out of the last 4 chars of that string.
>
> My plan was to pass the iterator through a generator that returned an
> iterator over the last 4 chars. (Sub-question: given that strings are
> immutable, is it possible to yield a view of the last 4 chars rather than a
> copy?)

Yes, but there isn't much point to it.
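
As a rough illustration only (a Python 3 sketch, not something from the
original exchange; `record` is a made-up sample value): a memoryview gives
a zero-copy view of the tail of a bytes object. The view object itself is
larger than the four bytes it avoids copying, which is presumably why
there is little to gain.

```python
record = b"abcd1234"  # made-up fixed-width record (assumption)

# Slicing a memoryview does not copy the underlying bytes.
tail = memoryview(record)[-4:]

# The bytes are only copied when they are materialized explicitly.
tail_bytes = tail.tobytes()
```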

> Then apply StringIO.writelines() on the 4-char iterator returned.
> After it's done, create a numpy.array from the StringIO's buffer.
>
> This does not work. The other option is to use an array.array in place of a
> StringIO object. But is it possible to fill an array.array using a lazy
> iterator without an explicit loop in Python? Something like the writelines()
> call.

Your generator that is yielding the last four bytes of the string
fragments is already a Python loop. Adding another is not much
additional overhead. But you can also be clever:

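  import array
  import numpy as np

  a = array.array('c')  # raw char buffer ('c' typecode is Python 2 only)
  map(a.extend, your_generator_of_4strings)  # Python 2 map() runs eagerly
  b = np.frombuffer(a, dtype=np.int32)  # reinterpret each 4 bytes as one int32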
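
The snippet above relies on Python 2 behaviour: the 'c' typecode was
removed in Python 3, and map() is lazy there, so the extend calls would
never run. A minimal Python 3 sketch of the same idea follows. It assumes
the fragments are bytes objects and that the last four bytes are a raw
little-endian int32. The `last4` helper and the sample `records` are
hypothetical, introduced here only for illustration.

```python
import numpy as np

def last4(fragments):
    """Hypothetical helper: yield the last four bytes of each record."""
    for s in fragments:
        yield s[-4:]

# Made-up sample data: 4 bytes of padding plus a little-endian int32.
records = [b"abcd\x01\x00\x00\x00", b"efgh\x02\x00\x00\x00"]

# join() consumes the generator and builds one contiguous buffer.
buf = b"".join(last4(iter(records)))

# Reinterpret the buffer as int32 without another copy ("<i4" fixes the
# byte order so the result does not depend on the host machine).
b = np.frombuffer(buf, dtype="<i4")
```

Because `buf` is an immutable bytes object, `b` is a read-only array. If
the fragments hold ASCII digits rather than raw binary integers, they
would need to be parsed instead of reinterpreted.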

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
