[Numpy-discussion] numpy and dtype

humufr at yahoo.fr humufr at yahoo.fr
Thu Aug 31 10:43:59 CDT 2006


Hi,

Sorry to bother you with what are probably basic questions, but I would like
to clarify my understanding of dtype.

I have a problem displaying a recarray. I'm not sure, but I suspect a bug, or
at least a problem.

I have an array with some data. The array is very big, but I have no problem
with it in numpy.

In [44]: data_end
Out[44]:
array([[  2.66000000e+02,   5.16300000e+04,   1.00000000e+00, ...,
         -1.04130435e+00,   1.47304565e+02,   4.27402449e+00],
       [  2.66000000e+02,   5.16300000e+04,   2.00000000e+00, ...,
         -6.52190626e-01,   1.64214981e+02,   1.58334379e+01],
       [  2.66000000e+02,   5.16300000e+04,   4.00000000e+00, ...,
         -7.65136838e-01,   1.33340195e+02,   9.84033298e+00],
       ...,
       [  9.78000000e+02,   5.24310000e+04,   6.32000000e+02, ...,
          3.06083832e+01,   6.71210251e+01,   1.18813887e+01],
       [  9.78000000e+02,   5.24310000e+04,   6.36000000e+02, ...,
          3.05993423e+01,   1.10403000e+02,   5.81539488e+00],
       [  9.78000000e+02,   5.24310000e+04,   6.40000000e+02, ...,
          3.05382938e+01,   1.26916304e+01,   3.25683937e+01]])

In [45]: data_end.shape
Out[45]: (567486, 7)

In [46]: data_end.dtype
Out[46]: dtype('<f8')


I want this array to have a specific type for each column, so I converted it
to a recarray. (I have been unable to give each column of a plain array a
different type, but I'm just beginning to play with dtype.)

So I did:

In [47]: fields = ['PLATEID', 'MJD', 'FIBERID', 'RA', 'DEC','V_DISP','V_DISP_ERR']

In [48]: type_descr = numpy.dtype({'names':fields,'formats':['>i2','>i4','>i2','>f4','>f4','>f4','>f4']})

In [49]: b = numpy.rec.fromarrays(data_end.transpose(),type_descr)

In [50]: b[:1]
Out[50]:
recarray([ (266, 51630, 1, 146.71420288085938, -1.041304349899292, 
147.3045654296875, 4.274024486541748)],
      dtype=[('PLATEID', '>i2'), ('MJD', '>i4'), ('FIBERID', '>i2'), 
('RA', '>f4'), ('DEC', '>f4'), ('V_DISP', '>f4'), ('V_DISP_ERR', '>f4')])

In [51]: b[1]
Out[51]: (266, 51630, 2, 146.74412536621094, -0.65219062566757202, 
164.21498107910156, 15.833437919616699)

But I get an error when I try to print the whole recarray b (the traceback is
at the end of this message). It works for smaller arrays:

In [54]: b[:10]
Out[54]:
recarray([ (266, 51630, 1, 146.71420288085938, -1.041304349899292, 
147.3045654296875, 4.274024486541748),
       (266, 51630, 2, 146.74412536621094, -0.65219062566757202, 
164.21498107910156, 15.833437919616699),
       (266, 51630, 4, 146.62857055664062, -0.76513683795928955, 
133.34019470214844, 9.8403329849243164),
       (266, 51630, 6, 146.63166809082031, -0.98827779293060303, 
146.91035461425781, 30.08709716796875),
       (266, 51630, 7, 146.91944885253906, -0.99049174785614014, 
152.96893310546875, 12.429832458496094),
       (266, 51630, 9, 146.76339721679688, -0.81043314933776855, 
347.72918701171875, 41.387767791748047),
       (266, 51630, 10, 146.62281799316406, -0.9513852596282959, 
162.53567504882812, 24.676788330078125),
       (266, 51630, 11, 146.93409729003906, -0.67040395736694336, 
266.56011962890625, 10.875675201416016),
       (266, 51630, 12, 146.96389770507812, -0.54500257968902588, 
92.040328979492188, 18.999214172363281),
       (266, 51630, 13, 146.9635009765625, -0.75935173034667969, 
72.828048706054688, 13.028598785400391)],
      dtype=[('PLATEID', '>i2'), ('MJD', '>i4'), ('FIBERID', '>i2'), 
('RA', '>f4'), ('DEC', '>f4'), ('V_DISP', '>f4'), ('V_DISP_ERR', '>f4')])


So I would like to know whether this is normal. Another question: is it
possible, in theory, to do something like:

b = numpy.array(data_end,dtype=type_descr) 

or must all elements of an array have the same dtype?
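
For concreteness, here is a minimal sketch, in present-day NumPy, of two ways
to turn a plain 2-D float array into a record array with one type per column.
The two sample rows are copied from the output above. The names `data`,
`via_columns` and `via_rows` are made up for the illustration, and this is
only one possible approach, not a statement of how numpy.array itself treats
a 2-D input.

```python
import numpy

fields = ['PLATEID', 'MJD', 'FIBERID', 'RA', 'DEC', 'V_DISP', 'V_DISP_ERR']
type_descr = numpy.dtype({'names': fields,
                          'formats': ['>i2', '>i4', '>i2', '>f4', '>f4', '>f4', '>f4']})

# Two rows taken from the b[:10] output above, stored as a plain float64 array.
data = numpy.array([
    [266, 51630, 1, 146.71420288085938, -1.041304349899292,
     147.3045654296875, 4.274024486541748],
    [266, 51630, 2, 146.74412536621094, -0.65219062566757202,
     164.21498107910156, 15.833437919616699],
])

# Option 1: one 1-D array per field (the columns of the 2-D array).
via_columns = numpy.rec.fromarrays(data.transpose(), dtype=type_descr)

# Option 2: one tuple per record; numpy.array fills each field from the
# matching tuple element and casts it to that field's type.
via_rows = numpy.array([tuple(row) for row in data.tolist()], dtype=type_descr)

print(via_columns.PLATEID, via_rows['MJD'])
```

Both options produce a 1-D array of records (one per original row) rather
than a 2-D array, which is the main difference from the float array they
start from.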

To give some context: I have a big FITS table and I want to use only some of
its columns, so I did:

# pyfits can't read the gzipped file, at least for now
table = pyfits.getdata('gal_info_dr4_v5_1b.fit')

#the file is a fits table file so we look in the pyfits doc to read it!

fields = ['PLATEID', 'MJD', 'FIBERID', 'RA', 'DEC','V_DISP','V_DISP_ERR']

type_descr = numpy.dtype({'names':fields,'formats':
['<i2','<i4','<i2','<f8','<f8','<f8','<f8']})

data_end = numpy.zeros((table.shape[0],len(fields)))
for i in range(len(fields)):
    data_end[:,i] = table[:].field(fields[i])


But I want to keep the type from the FITS file for each field. Perhaps there
is a better way to do it.
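
One way to keep each column's own type might be to build the reduced record
array field by field, reusing the dtypes the table already carries. The sketch
below is for present-day NumPy and uses a small made-up structured array
(`table`, with a hypothetical extra column `Z`) as a stand-in for what pyfits
returns, so that it runs without the FITS file. It accesses columns with
`table[name]` rather than with pyfits's `.field(name)`.

```python
import numpy

# Made-up stand-in for the FITS table: a structured array with mixed types.
table = numpy.zeros(3, dtype=[('PLATEID', '<i2'), ('MJD', '<i4'),
                              ('RA', '<f4'), ('Z', '<f8')])
table['PLATEID'] = [266, 266, 978]
table['MJD'] = [51630, 51630, 52431]

fields = ['PLATEID', 'MJD', 'RA']  # the columns to keep

# Reuse the dtype each selected column already has in the table.
type_descr = numpy.dtype([(name, table.dtype[name]) for name in fields])

# Copy column by column, so no value passes through float64.
data_end = numpy.empty(table.shape, dtype=type_descr)
for name in fields:
    data_end[name] = table[name]

print(data_end.dtype)
```

Compared with filling a float64 array first, this avoids the intermediate
conversion of the integer columns to floating point.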

Thank you very much. 


N.




The error when I try to print the big recarray:

In [53]: b
---------------------------------------------------------------------------
exceptions.TypeError                                 Traceback (most recent 
call last)

/home/gruel/Desktop/SDSS/DR4/<ipython console>

/home/gruel/usr/lib/python2.4/site-packages/IPython/Prompts.py in 
__call__(self, arg)
    514
    515             # and now call a possibly user-defined print mechanism
--> 516             manipulated_val = self.display(arg)
    517
    518             # user display hooks can change the variable to be stored 
in

/home/gruel/usr/lib/python2.4/site-packages/IPython/Prompts.py in 
_display(self, arg)
    538         """
    539
--> 540         return self.shell.hooks.result_display(arg)
    541
    542     # Assign the default display method:

/home/gruel/usr/lib/python2.4/site-packages/IPython/hooks.py in __call__(self, 
*args, **kw)
    132             #print "prio",prio,"cmd",cmd #dbg
    133             try:
--> 134                 ret = cmd(*args, **kw)
    135                 return ret
    136             except ipapi.TryNext, exc:

/home/gruel/usr/lib/python2.4/site-packages/IPython/hooks.py in 
result_display(self, arg)
    153
    154     if self.rc.pprint:
--> 155         out = pformat(arg)
    156         if '\n' in out:
    157             # So that multi-line strings line up with the left column 
of

/usr/lib/python2.4/pprint.py in pformat(self, object)
    108     def pformat(self, object):
    109         sio = _StringIO()
--> 110         self._format(object, sio, 0, 0, {}, 0)
    111         return sio.getvalue()
    112

/usr/lib/python2.4/pprint.py in _format(self, object, stream, indent, 
allowance, context, level)
    126             self._readable = False
    127             return
--> 128         rep = self._repr(object, context, level - 1)
    129         typ = _type(object)
    130         sepLines = _len(rep) > (self._width - 1 - indent - allowance)

/usr/lib/python2.4/pprint.py in _repr(self, object, context, level)
    192     def _repr(self, object, context, level):
    193         repr, readable, recursive = self.format(object, 
context.copy(),
--> 194                                                 self._depth, level)
    195         if not readable:
    196             self._readable = False

/usr/lib/python2.4/pprint.py in format(self, object, context, maxlevels, 
level)
    204         and whether the object represents a recursive construct.
    205         """
--> 206         return _safe_repr(object, context, maxlevels, level)
    207
    208

/usr/lib/python2.4/pprint.py in _safe_repr(object, context, maxlevels, level)
    289         return format % _commajoin(components), readable, recursive
    290
--> 291     rep = repr(object)
    292     return rep, (rep and not rep.startswith('<')), False
    293

/home/gruel/usr/lib/python2.4/site-packages/numpy/core/numeric.py in 
array_repr(arr, max_line_width, precision, suppress_small)
    389     if arr.size > 0 or arr.shape==(0,):
    390         lst = array2string(arr, max_line_width, precision, 
suppress_small,
--> 391                            ', ', "array(")
    392     else: # show zero-length shape unless it is (0,)
    393         lst = "[], shape=%s" % (repr(arr.shape),)

/home/gruel/usr/lib/python2.4/site-packages/numpy/core/arrayprint.py in 
array2string(a, max_line_width, precision, suppress_small, separator, prefix, 
style)
    202     else:
    203         lst = _array2string(a, max_line_width, precision, 
suppress_small,
--> 204                             separator, prefix)
    205     return lst
    206

/home/gruel/usr/lib/python2.4/site-packages/numpy/core/arrayprint.py in 
_array2string(a, max_line_width, precision, suppress_small, separator, 
prefix)
    137     if a.size > _summaryThreshold:
    138         summary_insert = "..., "
--> 139         data = _leading_trailing(a)
    140     else:
    141         summary_insert = ""

/home/gruel/usr/lib/python2.4/site-packages/numpy/core/arrayprint.py in 
_leading_trailing(a)
    108     if a.ndim == 1:
    109         if len(a) > 2*_summaryEdgeItems:
--> 110             b = _gen.concatenate((a[:_summaryEdgeItems],
    111                                      a[-_summaryEdgeItems:]))
    112         else:

TypeError: expected a readable buffer object
Out[53]:



