[Numpy-discussion] NPZ format

Robert Kern robert.kern@gmail....
Fri Oct 1 18:22:14 CDT 2010


On Fri, Oct 1, 2010 at 02:13, Francesc Alted <faltet@pytables.org> wrote:
> On Thursday 30 September 2010 18:20:16 Robert Kern wrote:
>> On Wed, Sep 29, 2010 at 03:17, Francesc Alted <faltet@pytables.org> wrote:
>> > Hi,
>> >
>> > I'm going to give a seminar about serialization, and I'd like to
>> > describe the .npy format.  I noticed that there is a variant of it
>> > called .npz that can pack several arrays in one single file.
>> >
>> > However, .npz does not use compression at all and I'm wondering
>> > what's the reason.  I suppose that this is because you don't want
>> > to lose the possibility to memmap saved arrays, but can someone
>> > confirm this?
>>
>> While I suspect it's possible, I'm certain we don't have any code
>> that actually does it. Most likely the author assumed that it would
>> be faster (or tested it to be faster with their CPU/hard disk
>> configuration) to not compress.
>
> Thanks, that's good to know.  And yes, I'd say that compressing with zip
> (zlib) would reduce I/O performance, but most probably decompressing
> from disk media would still be an improvement in terms of time.  At any
> rate, adding compression capability to .npy should be just one parameter
> away, so perhaps adding it is a good idea.

Also some design, documentation, format version bump, and (not least)
code away. ;-)
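
As a rough illustration of the "one parameter away" point: an .npz file is an
ordinary zip archive with one .npy member per array, so whether a member is
stored or deflated can be checked with the standard zipfile module. This is
only a sketch, assuming a current NumPy in which np.savez_compressed is
available (it is not something this thread relies on). The helper name
npz_member_info and the sample arrays are made up for the example.

```python
import io
import zipfile

import numpy as np


def npz_member_info(buf):
    """Map each archive member to (compress_type, compressed size, original size).

    Hypothetical helper for this sketch; it only reads the zip directory.
    """
    with zipfile.ZipFile(buf) as zf:
        return {
            info.filename: (info.compress_type, info.compress_size, info.file_size)
            for info in zf.infolist()
        }


# Sample data (assumed for illustration): zeros compress very well.
a = np.zeros(10000, dtype=np.float64)
b = np.arange(10, dtype=np.int32)

# np.savez writes the members uncompressed (zipfile.ZIP_STORED).
plain = io.BytesIO()
np.savez(plain, a=a, b=b)

# np.savez_compressed writes the same members with zipfile.ZIP_DEFLATED.
packed = io.BytesIO()
np.savez_compressed(packed, a=a, b=b)

plain_info = npz_member_info(plain)
packed_info = npz_member_info(packed)

# Loading is the same call in both cases.
restored = np.load(io.BytesIO(packed.getvalue()))
restored_a = restored["a"]
```

With these sample arrays the deflated archive is much smaller, but that is a
property of the all-zero data; real arrays may compress far less.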

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

