[AstroPy] question on matplotlib's loadtxt
Derek Homeier
derek@astro.physik.uni-goettingen...
Fri Nov 4 10:14:15 CDT 2011
Hi Grigoris,
> Hello to all! I would like to ask about an error I face with
> matplotlib's loadtxt...
>
> Suppose that we have a file (called "test.test") with 6 columns like:
>
> 1496548. 862.235400937 14.008 0 20110523.201416 tth*g
> 1690289. 919.424007603 13.875 0 20110523.221527 hgf4
> 1667241. 996.262754039 13.890 0 20110524.001639 nb.mj
> 739181.6 881.1753527 14.774 0 20110524.010203 vbfhg
>
> When I use
> x, y = loadtxt('test.test', unpack=True, usecols=(0,1), dtype=(float,float))
>
> it prints normally the first and second column.
> If I add another column like:
>
> x, y, z = loadtxt('test.test', unpack=True,usecols=(0,1,3),
> dtype=(float,float,float))
>
> the output is this:
> File "/usr/lib/python2.7/site-packages/numpy/lib/io.py", line 584, in
> loadtxt
> dtype = np.dtype(dtype)
> TypeError: data type not understood
>
I don't quite understand that part, I admit: dtype only takes a single type or pairs of names and
types, so it would seem the first case actually only worked by accident:
In [193]: np.dtype((float,float))
Out[193]: dtype('float64')
I have no idea how it interprets the second float, but as you see it has no effect anyway.
You could in fact get the same result with 'usecols=[0,1,3], dtype=float'
which will simply apply the dtype to all columns (and is the default already...).
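
As a minimal sketch of that suggestion (assuming a reasonably recent numpy, and using StringIO with the rows from the question as a stand-in for 'test.test'), a single dtype covers all three selected columns:

```python
from io import StringIO

import numpy as np

# Rows copied from the question; StringIO is only a stand-in for 'test.test'.
data = """\
1496548. 862.235400937 14.008 0 20110523.201416 tth*g
1690289. 919.424007603 13.875 0 20110523.221527 hgf4
1667241. 996.262754039 13.890 0 20110524.001639 nb.mj
739181.6 881.1753527 14.774 0 20110524.010203 vbfhg
"""

# One plain dtype is applied to every column selected by usecols.
# dtype=float is also the default, so it could be left out entirely.
x, y, z = np.loadtxt(StringIO(data), usecols=(0, 1, 3), unpack=True, dtype=float)

print(x)
print(y)
print(z)
```

The text column is never parsed here, because usecols skips it.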
>
> Even if I use a dictionary, like:
> dt = dtype({'names':['n1','n2','n3'],'formats':[float,float,float]})
> x, y, z = loadtxt('test.test', unpack=True, usecols=(0,1,2) , dtype=dt)
>
> the output is the same as previous.
>
Now things are getting strange - the above is the correct (and I think, only) way to
define dtypes with different formats -
np.dtype([('n1',float),('n2',float),('n3',float)])
would be equivalent. As there is no error on the first line (i.e., dt is a valid dtype),
this seems to be a bug in loadtxt. Could you check what happens if you try the
above without the 'unpack' option? Should return a structured array with the 3
fields 'n1','n2','n3'...
Also, as loadtxt is actually part of numpy, which version of numpy do you have
installed? Check numpy.__version__ or np.__version__ (you may have to re-import
numpy, as pylab unfortunately imports everything into one namespace). I recall that
there have been issues with unpacking structured arrays prior to 1.6 or 1.6.1.
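
A short sketch of that check, assuming a recent numpy where loadtxt handles structured dtypes (StringIO again stands in for 'test.test'): it builds the dtype both ways, loads without 'unpack', and prints the version.

```python
from io import StringIO

import numpy as np

# Rows copied from the question; StringIO is only a stand-in for 'test.test'.
data = """\
1496548. 862.235400937 14.008 0 20110523.201416 tth*g
1690289. 919.424007603 13.875 0 20110523.221527 hgf4
1667241. 996.262754039 13.890 0 20110524.001639 nb.mj
739181.6 881.1753527 14.774 0 20110524.010203 vbfhg
"""

# The dictionary form from the question and the list-of-pairs form
# describe the same structured dtype.
dt_dict = np.dtype({'names': ['n1', 'n2', 'n3'], 'formats': [float, float, float]})
dt = np.dtype([('n1', float), ('n2', float), ('n3', float)])
print(dt == dt_dict)

# Without unpack, loadtxt returns one structured array with named fields.
arr = np.loadtxt(StringIO(data), usecols=(0, 1, 2), dtype=dt)
print(arr.dtype.names)
print(arr['n1'])

# Version check, using numpy imported under its own name.
print(np.__version__)
```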
> Additionally, when I am trying to load the last column as strings:
> x, y = loadtxt('test.test', delimiter=' ', unpack=True,usecols=(0,5),
> dtype=(float,'S5'))
>
> I get this error:
> File "/usr/lib/python2.7/site-packages/numpy/lib/io.py", line 584, in
> loadtxt
> dtype = np.dtype(dtype)
> ValueError: mismatch in size of old and new data-descriptor
>
>
> What am I doing wrong ? I am sorry if it is something really obvious ...
> but I am unable to find it...
Similar problem: this ought to work with dtype=[('n1', 'f'), ('n6','S5')], but it might
be broken in the same way as above. If you can update to numpy 1.6.x, you
should hopefully be set.
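
For reference, a minimal sketch of the mixed float/string case, assuming a recent numpy (StringIO stands in for 'test.test', and the field names 'n1' and 'n6' are just labels). Note that 'f' is single precision, and that on current versions I would expect the 'S5' field to come back as byte strings:

```python
from io import StringIO

import numpy as np

# Rows copied from the question; StringIO is only a stand-in for 'test.test'.
data = """\
1496548. 862.235400937 14.008 0 20110523.201416 tth*g
1690289. 919.424007603 13.875 0 20110523.221527 hgf4
1667241. 996.262754039 13.890 0 20110524.001639 nb.mj
739181.6 881.1753527 14.774 0 20110524.010203 vbfhg
"""

# Each field gets its own format: 'f' is a 32-bit float, 'S5' a 5-byte string.
dt = np.dtype([('n1', 'f'), ('n6', 'S5')])

# Columns 0 and 5 are read into the two fields of a structured array.
arr = np.loadtxt(StringIO(data), usecols=(0, 5), dtype=dt)

# Access the columns by field name instead of unpacking.
x = arr['n1']
names = arr['n6']
print(x)
print(names)
```

Using dtype=[('n1', float), ('n6', 'S5')] instead would keep the first column in double precision.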
HTH,
Derek
