[Numpy-discussion] saving groups of numpy arrays to disk
Wed Aug 24 11:22:10 CDT 2011
On Sun, Aug 21, 2011 at 7:24 AM, Pauli Virtanen <email@example.com> wrote:
> On Sat, 20 Aug 2011 16:18:55 -0700, Chris Withers wrote:
> > I've got a tree of nested dicts that at their leaves end in numpy arrays
> > of identical sizes.
> > What's the easiest way to persist these to disk so that I can pick up
> > with them where I left off?
> Depends on your requirements.
> If you do *not* have a requirement for:
> - real persistence, i.e., being able to easily read the data years later
> - a standard data format
> - access from non-Python programs
> - safety against malicious parties (unpickling can execute some code
> in the input -- although this is possible to control)
> then you can use Python pickling:
> import pickle
> # Write the tree: the object comes first, then the open file.
> with open('out.pck', 'wb') as f:
>     pickle.dump(tree, f, protocol=pickle.HIGHEST_PROTOCOL)
> # Later, read it back.
> with open('out.pck', 'rb') as f:
>     tree = pickle.load(f)
> This should just work (TM) directly with your tree-of-dicts-and-arrays.
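On the "possible to control" point above: a minimal sketch of the restricted-unpickler pattern described in the Python pickle documentation, where find_class() is overridden so that only allow-listed globals can be loaded. The names ALLOWED_GLOBALS, RestrictedUnpickler and restricted_loads are illustrative, not part of any library, and the allow-list here covers plain containers only. A tree holding NumPy arrays would also need NumPy's own reconstruction helpers on the list, and which names those are should be checked against the installed NumPy version rather than assumed.

```python
import io
import pickle

# Illustrative allow-list of (module, name) pairs that may be loaded.
# Dicts, lists, ints and strings need no entry: they use pickle opcodes.
ALLOWED_GLOBALS = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but only allow-listed globals are resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain nested containers round-trip as usual.
tree = {"a": {"b": [1, 2, 3]}, "c": {4, 5}}
restored = restricted_loads(pickle.dumps(tree))
```

A pickle that refers to anything else, for example a function, raises pickle.UnpicklingError instead of being loaded.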
> > What's the most "correct" way to do so?
> > I'm using IPython if that makes things easier...
> > I had wondered about PyTables, but that seems a bit too heavyweight for
> > this, unless I'm missing something?
> If I had one or more of the requirements listed above, I'd use the HDF5
> format, via either PyTables or h5py. If I just needed to cache the trees,
> then I'd use pickling.
> I think the only reason to consider heavyweightness is distribution:
> does your target audience have these libraries already installed
> (they are pre-installed in several Python-for-science distributions),
> and how difficult would it be for you to ship them with your stuff,
> or to require the users to install them.
+1 to PyTables or h5py.
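A minimal sketch of the h5py route for a tree like the one described, assuming every key is a string and every leaf is a NumPy array. The helpers save_tree and load_tree, the example tree and the temporary file path are made up for illustration; they are not part of h5py. Each nested dict becomes an HDF5 group and each array becomes a dataset.

```python
import os
import tempfile

import h5py
import numpy as np


def save_tree(group, tree):
    """Write a nested dict of arrays into an open HDF5 group."""
    for key, value in tree.items():
        if isinstance(value, dict):
            # A nested dict maps onto a sub-group.
            save_tree(group.create_group(key), value)
        else:
            # A leaf array maps onto a dataset.
            group.create_dataset(key, data=np.asarray(value))


def load_tree(group):
    """Rebuild the nested dict, reading every dataset into memory."""
    tree = {}
    for key, item in group.items():
        if isinstance(item, h5py.Group):
            tree[key] = load_tree(item)
        else:
            tree[key] = item[()]  # read the whole dataset as an array
    return tree


# Hypothetical example tree with leaves of identical size.
tree = {"run1": {"x": np.arange(5.0), "y": np.ones(5)},
        "run2": {"x": np.zeros(5)}}

path = os.path.join(tempfile.mkdtemp(), "out.h5")
with h5py.File(path, "w") as f:
    save_tree(f, tree)
with h5py.File(path, "r") as f:
    restored = load_tree(f)
```

The resulting file can be opened by any HDF5-aware tool, which addresses the standard-format and non-Python-access requirements listed earlier. Key order in the restored dicts follows HDF5's iteration order, which need not match the original insertion order.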
> Pauli Virtanen