[SciPy-user] Fast saving/loading of huge matrices
Thu Apr 19 14:42:32 CDT 2007
On Thu, 19 Apr 2007 at 14:19 -0500, Ryan Krauss wrote:
> I just changed from simply reading a text file using io.read_array to
> cPickle and got a factor of 4 or 5 speed up for my medium sized array.
> But the cPickle file is quite large (about twice the size of the
> ascii file - I don't think the ascii has very many digits).
Yeah. This is expected, because pickle stores each value in full binary
form (8 bytes for a double-precision float), while if you keep only 2
digits (plus the decimal point and a space) in a text file, you need
only about 4 bytes per value, hence the space savings.
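The size difference above can be checked directly. A minimal sketch (using
the modern `pickle` module, which replaced `cPickle` in Python 3, and a
made-up array of values rounded to 2 digits):

```python
import pickle
import numpy as np

# A double-precision array with only 2 significant decimal digits.
a = np.round(np.random.rand(1000), 2)

# Binary pickle: roughly 8 bytes per value plus a small header.
binary = pickle.dumps(a, protocol=2)

# Text representation: "0.42"-style, about 5 bytes per value.
text = " ".join("%.2f" % x for x in a)

print(len(binary), len(text))  # binary is larger despite being "compact"
```

The binary pickle comes out larger than the short-digit text file, which
matches the factor-of-two observation in the quoted message.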
> I thought there used to be some built in functions called something
> like shelve that stored dictionaries fairly quickly and compactly.
> Are those functions still around and I am just remembering the name
> wrong? Or have they been done away with? I remember vaguely that
> they stored data in 3 separate files - a python file that could later
> be imported, a dat file (I think) and something else.
> The cPickle approach seems fast, I just wish there was some way to
> make the files smaller. Is there a good way to do this that doesn't
> slow down the read time too much?
Try using compression. If your data doesn't have many decimals, chances
are that it can be easily compressed up to 3x. There are many
compressors that have a Python interface (your best bet is to use the
zlib module included in Python). Or try PyTables for transparent
compression of your data on disk.
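A sketch of the zlib suggestion: pickle the array to a binary string,
compress it, and reverse both steps on load. The array contents here are
made up; data with few effective digits is highly redundant at the byte
level, so it compresses well.

```python
import pickle
import zlib
import numpy as np

# Data with only 2 decimal digits has lots of repeated byte patterns.
a = np.round(np.random.rand(100000), 2)

# Save: binary pickle, then zlib-compress the byte string.
raw = pickle.dumps(a, protocol=2)
compressed = zlib.compress(raw, 6)  # level 6 trades speed vs. ratio

# Load: decompress, then unpickle.
restored = pickle.loads(zlib.decompress(compressed))

print(len(raw), len(compressed))
```

Decompression with zlib is fast, so read times should not suffer much
compared to reading the uncompressed pickle from disk.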
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth
More information about the SciPy-user mailing list