[Numpy-discussion] [ANN] PyTables 1.3.3 released

Francesc Altet faltet at carabos.com
Fri Aug 25 05:11:25 CDT 2006


===========================
 Announcing PyTables 1.3.3
===========================

I'm happy to announce a new minor release of PyTables. In this one, we
have focused on improving compatibility with the latest beta versions of
NumPy (0.9.8, 1.0b2, 1.0b3 and higher), adding some improvements and the
usual bunch of fixes (some of them important, like the possibility of
re-using the same nested class in the declaration of table records; see
below).

Go to the PyTables web site for downloading the beast:
http://www.pytables.org/

or keep reading for more info about the new features and bugs fixed.


Changes in more depth
=====================

Improvements:

- Added some workarounds on a couple of 'features' of recent versions of
  NumPy. Now, PyTables should work with a broad range of NumPy versions,
  ranging from 0.9.8 up to 1.0b3 (and hopefully beyond, but let's see).

- When a loop appending rows to a table is not flushed before the node
  is unbound (and hence becomes ``killed`` in PyTables slang), as in::

    import tables as T

    class Item(T.IsDescription):
        name = T.StringCol(length=16)
        vals = T.Float32Col(0.0)

    fileh = T.openFile("/tmp/test.h5", "w")
    table = fileh.createTable(fileh.root, 'table', Item)
    for i in range(100):
        table.row.append()
    #table.flush()  # uncomment this to prevent the warning
    table = None  # Unbinding the table node!


  a ``PerformanceWarning`` is issued telling the user that it is
  *highly* recommended to flush the buffers of a table before unbinding
  it. Hopefully, this will also prevent other scary errors (like
  ``Illegal Instruction``, ``Malloc(): trying to call free() twice``,
  ``Bus Error`` or ``Segmentation fault``) that some people have been
  seeing lately and which are most probably related to this issue.
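
The mechanism described above can be sketched in plain Python. This is
only an illustration and not the PyTables implementation: the
``BufferedAppender`` class, the ``UnflushedBufferWarning`` category and
the ``append_rows`` helper are hypothetical names invented for this
example, and the sketch relies on CPython finalizing an object as soon
as its last reference goes away.

```python
import warnings


class UnflushedBufferWarning(UserWarning):
    """Hypothetical warning category used only in this sketch."""


class BufferedAppender:
    """Toy stand-in for a buffered table; not a PyTables class."""

    def __init__(self):
        self.buffer = []   # rows still waiting in memory
        self.stored = []   # rows already written out

    def append(self, row):
        self.buffer.append(row)

    def flush(self):
        # Move every pending row to storage and empty the buffer.
        self.stored.extend(self.buffer)
        self.buffer = []

    def __del__(self):
        # Runs when the last reference disappears (immediately in CPython).
        if self.buffer:
            warnings.warn(
                "%d rows were still unflushed when the object was unbound"
                % len(self.buffer),
                UnflushedBufferWarning,
            )


def append_rows(n, flush_first):
    """Append n rows, optionally flush, unbind, and report the warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        table = BufferedAppender()
        for i in range(n):
            table.append(i)
        if flush_first:
            table.flush()
        table = None  # unbind the object, as in the example above
    return [w.category for w in caught]
```

Calling ``append_rows(100, flush_first=False)`` reports the warning,
while flushing first leaves nothing pending and so nothing to warn
about.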


Bug fixes:

- In situations where the same nested class is used for declaring
  several columns in a table, as in::

    class Nested(IsDescription):
        uid = IntCol()
        data = FloatCol()

    class B_Candidate(IsDescription):
        nested1 = Nested()
        nested2 = Nested()

  the columns were sharing the same metadata behind the scenes, which
  introduced several inconsistencies in it. This has been fixed.

- More work on the different padding conventions of NumPy and
  numarray. Now, all trailing spaces in chararrays are stripped off
  during write/read operations. This means that spurious trailing
  spaces should no longer appear when retrieving NumPy chararrays (not
  even in the context of recarrays). The drawback is that you will lose
  *all* the trailing spaces, whether you want them there or not. This
  is not a very comfortable situation to deal with, but hopefully
  things will get better once NumPy is at the core of PyTables. In the
  meantime, I hope that the current behaviour will be the lesser evil
  in most situations. This closes ticket #13 (again).

- Solved a problem with conversions from numarray chararrays to numpy
  objects. Before, when saving numpy chararrays with a declared length
  of N in which none of the components reached that length, the
  retrieved numpy chararray had a dtype whose length was the maximum
  length of the component strings instead of N. This has been
  corrected.

- Fixed a minor glitch in detection of signedness in IntAtom
  classes. Thanks to Norbert Nemec for reporting this one and providing
  the fix.
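
The nested-class fix in the first item of this list concerns a familiar
Python pitfall, which the following sketch illustrates. It is not the
PyTables code: ``SharedMeta`` and ``OwnMeta`` are hypothetical classes,
and the per-instance dictionary is just one assumed way of keeping
metadata apart, since the actual fix is not described here.

```python
class SharedMeta:
    """Buggy pattern: metadata lives in a single class-level dict."""

    meta = {}  # one dict shared by every instance of the class

    def __init__(self, position):
        # Mutates the shared dict, overwriting what earlier instances set.
        self.meta["position"] = position


class OwnMeta:
    """Fixed pattern: each instance gets its own metadata dict."""

    def __init__(self, position):
        self.meta = {"position": position}


# Two "columns" declared from the same class, as nested1 and nested2 above.
shared1, shared2 = SharedMeta(0), SharedMeta(1)
own1, own2 = OwnMeta(0), OwnMeta(1)

print(shared1.meta is shared2.meta)  # the buggy pair shares one dict
print(own1.meta is own2.meta)        # the fixed pair does not
```

With the shared dictionary, the position recorded for the first column
is silently replaced by that of the second, which is the kind of
inconsistency the item describes.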


Known bugs:

- Using ``Row.update()`` in tables with some columns marked as indexed
  gives a ``NotImplemented`` error although it should not. This is fixed
  in SVN trunk and the functionality will be available in the 1.4.x
  series. Meanwhile, a workaround is to refrain from declaring columns
  as indexed and to index them *after* the update process (with
  ``Col.createIndex()``, for example).


Deprecated features:

- None


Backward-incompatible changes:

- Please see the ``RELEASE-NOTES.txt`` file.


Important note for Windows users
================================

If you want to use PyTables with Python 2.4 on Windows platforms, you
will need to get the HDF5 library compiled for MSVC 7.1, aka .NET
2003.  It can be found at:
ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP

Users of Python 2.3 on Windows will have to download the version of HDF5
compiled with MSVC 6.0, available at:
ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP


What it is
==========

PyTables is a package for managing hierarchical datasets, designed to
efficiently cope with extremely large amounts of data (with support for
full 64-bit file addressing).  It features an object-oriented interface
that, combined with C extensions for the performance-critical parts of
the code, makes it a very easy-to-use tool for high performance data
storage and retrieval.

PyTables runs on top of the HDF5 library and the numarray package (but
NumPy and Numeric are also supported) for achieving maximum throughput
and convenient use.

Besides, PyTables I/O for table objects is buffered, implemented in C
and carefully tuned so that you can reach much better performance with
PyTables than with your own home-grown wrappings to the HDF5 library.
PyTables sports indexing capabilities as well, allowing selections in
tables exceeding one billion rows to be done in just seconds.


Platforms
=========

This version has been extensively checked on quite a few platforms, like
Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64
(Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64)
and MacOSX on PowerPC.  For other platforms, chances are that the code
can be easily compiled and run without further issues.  Please contact
us if you experience problems.


Resources
=========

Go to the PyTables web site for more details:

http://www.pytables.org

About the HDF5 library:

http://hdf.ncsa.uiuc.edu/HDF5/

About numarray:

http://www.stsci.edu/resources/software_hardware/numarray

To know more about the company behind the PyTables development, see:

http://www.carabos.com/


Acknowledgments
===============

Thanks to the various users who provided feature improvements, patches,
bug reports, support and suggestions.  See the ``THANKS`` file in the
distribution package for an (incomplete) list of contributors.  Many
thanks also to SourceForge who have helped to make and distribute this
package!  And last but not least, a big thank you to THG
(http://www.hdfgroup.org/) for sponsoring many of the new features
recently introduced in PyTables.


Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may
have.


----

  **Enjoy data!**

  -- The PyTables Team



