[Numpy-discussion] [ANN] carray 0.2: an in-memory compressed data container

Francesc Alted faltet@pytables....
Fri Aug 27 10:22:53 CDT 2010

Announcing carray 0.2

What it is

carray is a container for numerical data that can be compressed
in-memory.  The compresion process is carried out internally by Blosc,
a high-performance compressor that is optimized for binary data.

Having data compressed in-memory can reduce the stress of the memory
subsystem.  The net result is that carray operations can be faster
than using a traditional ndarray object from NumPy.

What's new

Two new `__iter__()` and `iter(start, stop, step)` iterators that allows
to perform potentially complex operations much faster than using
plain ndarrays.  For example::

  In [3]: a = np.arange(1e6)

  In [4]: time sum((v for v in a if v < 4))
  CPU times: user 6.51 s, sys: 0.00 s, total: 6.51 s
  Wall time: 6.52 s
  Out[5]: 6.0

  In [6]: b = ca.carray(a)

  In [7]: time sum((v for v in b if v < 4))
  CPU times: user 0.73 s, sys: 0.04 s, total: 0.78 s
  Wall time: 0.75 s  # 8.7x faster than ndarray
  Out[8]: 6.0

The `iter(start, stop, step)` iterator also allows to select slices
specified by the `start`, `stop` and `step` parameters.  Example::

  In [9]: time sum((v for v in a[2::3] if v < 10))
  CPU times: user 2.18 s, sys: 0.00 s, total: 2.18 s
  Wall time: 2.19 s
  Out[10]: 15.0

  In [11]: time sum((v for v in b.iter(start=2, step=3) if v < 10))
  CPU times: user 0.26 s, sys: 0.03 s, total: 0.30 s
  Wall time: 0.30 s  # 7.3x faster than ndarray
  Out[12]: 15.0

The main advantage of these iterators is that you can use them in
generators and hence, you don't need to waste memory for creating
temporaries, which can be important when dealing with large arrays.

For more info, see the new ``Using iterators`` section in USAGE.txt.


Visit the main carray site repository at:

You can download a source package from:

Short tutorial:

Home of Blosc compressor:

Share your experience

Let us know of any bugs, suggestions, gripes, kudos, etc. you may



Francesc Alted
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20100827/6cc405a4/attachment.html 

More information about the NumPy-Discussion mailing list