[SciPy-user] Automatically making a dtype

Leo Trottier trottier+pylist@gmail....
Wed Jul 8 19:31:42 CDT 2009


So, perhaps I'm the only one, but I find using numpy dtypes can be a bit
more troublesome than I typically expect from Python libraries.  So I've gone
ahead and written a little function that, given an "exemplar" (e.g., a
row from your data set) will create a dtype based on it.

Anyone think that something like this should make it into the
numpy/scipy distribution?  Also, anyone want to improve the function
so it can handle tuples, sub-arrays, etc.?

Anyway, here it is:

def makeDType(exemplar):
   '''Return a dtype object based on the given list or dict *exemplar*

   This is a convenience function -- if you want to do anything sophisticated
   it's best to compose the dtype "by hand".

   If given a list, this will return a dtype with fields ordered in the same
   sequence as in exemplar.

   If given a dict, the field ordering will be alphabetical, based on the
   names of the fields.

   NB: any str example you give it should be the longest you can imagine,
   as the function will return a field based on that length.

   >>> makeDType(['a string', 4, 3.0, 3j, True, None, eval, u'asdf'])
   dtype([('f0', '|S8'), ('f1', '<i4'), ('f2', '<f8'), ('f3', '<c16'),
   ('f4', '|b1'), ('f5', '|O4'), ('f6', '|O4'), ('f7', '<U4')])

   >>> makeDType(dict(a='0123',b=3.,c=4,d=True,e=3j,f=eval,g=None,h=u'asdf'))
   dtype([('a', '|S4'), ('b', '<f8'), ('c', '<i4'), ('d', '|b1'), ('e',
   '<c16'), ('f', '|O4'), ('g', '|O4'), ('h', '<U4')])

   This can't yet handle tuples or sub-arrays, and it's not
   smart enough to figure out how big to make each float, int, etc.

   This code is put into the public domain.
   '''
   import numpy as np
   if isinstance(exemplar, dict):
       # dict: sort the field names so the ordering is deterministic
       names = sorted(exemplar)
       formats = [np.array(exemplar[key]).dtype for key in names]
       return np.dtype({'names': names, 'formats': formats})
   else:
       # list: fields are auto-named f0, f1, ... in the given order
       formats = ','.join([np.array(val).dtype.str for val in exemplar])
       return np.dtype(formats)
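
To illustrate the intended use and the string-length caveat from the docstring, here is a minimal sketch. The name make_dtype_sketch and the sample rows are hypothetical, not part of the original post. It assumes Python 3 and a recent NumPy, so str values map to unicode ('U') fields rather than the '|S8' shown in the doctests above, and it builds explicit (name, dtype) pairs instead of joining format strings.

```python
import numpy as np

def make_dtype_sketch(exemplar):
    """Hypothetical Python 3 variant of makeDType, for illustration only."""
    if isinstance(exemplar, dict):
        # dict: alphabetical field order, as in the original function
        names = sorted(exemplar)
        formats = [np.array(exemplar[key]).dtype for key in names]
        return np.dtype({'names': names, 'formats': formats})
    # list: name the fields f0, f1, ... in the order given
    return np.dtype([('f%d' % i, np.array(val).dtype)
                     for i, val in enumerate(exemplar)])

# Made-up sample data; the first row serves as the exemplar.
rows = [('alice', 30, 1.5), ('bob', 25, 2.25), ('charlotte', 41, 0.5)]
dt = make_dtype_sketch(list(rows[0]))

# Each tuple becomes one record of the structured array.
table = np.array(rows, dtype=dt)

# The string field is only as wide as the exemplar ('alice', 5 characters),
# so longer strings are silently truncated when stored.
names_stored = table['f0'].tolist()
```

Here 'charlotte' is stored as 'charl', which is why the exemplar string should be the longest one you expect to see.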
