[NumPy-Tickets] [NumPy] #1540: numpy.savez has race condition
NumPy Trac
numpy-tickets@scipy....
Fri Jul 9 06:38:06 CDT 2010
#1540: numpy.savez has race condition
-----------------------+----------------------------------------------------
 Reporter:  Koen       |       Owner:  somebody
     Type:  defect     |      Status:  new
 Priority:  high       |   Milestone:  2.0.0
Component:  numpy.lib  |     Version:  1.4.0
 Keywords:             |
-----------------------+----------------------------------------------------
When saving a variable named "test", numpy.savez always writes it to a
temporary file called /tmp/test.npy first. When multiple programs save a
variable with the same name at the same time, they overwrite each other's
temporary file, which is a race condition.
I have modified the code (below) to use a proper, unique temporary filename
and to reuse that one temporary file for all variables within a single
savez call.
{{{
def savez(file, *args, **kwds):
    """
    Save several arrays into a single, compressed file with extension ".npz"

    If keyword arguments are given, the names for variables assigned to the
    keywords are the keyword names (not the variable names in the caller).
    If arguments are passed in with no keywords, the corresponding variable
    names are arr_0, arr_1, etc.

    Parameters
    ----------
    file : Either the filename (string) or an open file (file-like object)
        If file is a string, it names the output file. ".npz" will be appended
        if it is not already there.
    args : Arguments
        Any function arguments other than the file name are variables to save.
        Since it is not possible for Python to know their names outside the
        savez function, they will be saved with names "arr_0", "arr_1", and so
        on. These arguments can be any expression.
    kwds : Keyword arguments
        All keyword=value pairs cause the value to be saved with the name of
        the keyword.

    See Also
    --------
    save : Save a single array to a binary file in NumPy format
    savetxt : Save an array to a file as plain text

    Notes
    -----
    The .npz file format is a zipped archive of files named after the variables
    they contain. Each file contains one variable in .npy format.
    """
    # Import is postponed to here since zipfile depends on gzip, an optional
    # component of the so-called standard library.
    import zipfile

    if isinstance(file, basestring):
        if not file.endswith('.npz'):
            file = file + '.npz'

    namedict = kwds
    for i, val in enumerate(args):
        key = 'arr_%d' % i
        if key in namedict.keys():
            raise ValueError, "Cannot use un-named variables and keyword %s" % key
        namedict[key] = val

    zip = zipfile.ZipFile(file, mode="w")

    # Place to write temporary .npy files
    # before storing them in the zip
    import tempfile
    (fid, tmpname) = tempfile.mkstemp('.npy', 'numpy')
    os.close(fid)

    for key, val in namedict.iteritems():
        fname = key + '.npy'
        fid = open(tmpname,'wb')
        format.write_array(fid, np.asanyarray(val))
        fid.close()
        zip.write(tmpname, arcname=fname)

    zip.close()
    os.remove(tmpname)
}}}
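The difference between the two naming schemes can be illustrated with a
small standalone sketch. It is not part of the patch and uses only the
standard library; the helper names fixed_tmp_path and unique_tmp_path are
hypothetical, and the fixed scheme is assumed to join the system temporary
directory with the variable name, as described in the report.

```python
import os
import tempfile


def fixed_tmp_path(key):
    # Scheme described in the report: the path depends only on the variable
    # name, so every process saving "test" targets the same file.
    return os.path.join(tempfile.gettempdir(), key + '.npy')


def unique_tmp_path():
    # mkstemp atomically creates a new file, so two callers never get the
    # same path while both files exist. The caller must delete the file.
    fd, path = tempfile.mkstemp(suffix='.npy', prefix='numpy')
    os.close(fd)
    return path


# Two savers of the same variable name collide under the fixed scheme...
collide = fixed_tmp_path('test') == fixed_tmp_path('test')

# ...but receive distinct files from mkstemp.
first = unique_tmp_path()
second = unique_tmp_path()
try:
    distinct = first != second
finally:
    # Remove the temporary files even if something above fails.
    os.remove(first)
    os.remove(second)

print(collide, distinct)  # prints: True True
```

The try/finally in the sketch is a design choice rather than something the
patch does: as written, the patch leaves its temporary file behind if
format.write_array raises, and a similar try/finally around the loop would
probably avoid that.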
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1540>
NumPy <http://projects.scipy.org/numpy>