[Numpy-discussion] genfromtxt behaviour

Pierre GM pgmdevlist@gmail....
Fri Oct 29 07:44:58 CDT 2010


On Oct 29, 2010, at 2:35 PM, Matt Studley wrote:

> Hi all
> 
> first, please forgive me for my ignorance - I am taking my first
> stumbling steps with numpy and scipy.

No problem, it's educational.

> I am having some difficulty with the behaviour of genfromtxt.
> 
> s = SIO.StringIO("""1, 2, 3
> 4, 5, 6
> 7, 8, 9""")
> g= genfromtxt(s, delimiter=', ', dtype=None)
> print g[:,0]
> 
> 
> This produces the output I expected, a slice for column 0...
> 
> array([1, 4, 7])
> 
> 
> BUT....
> 
> s = SIO.StringIO("""a, 2, 3
> b, 5, 6
> c, 8, 9""")
> g= genfromtxt(s, delimiter=', ', dtype=None)
> g[:,0]
> 
> Produces the following error:
> 
> IndexError: invalid index
> 
> 
> ....
> 
> In the first case, genfromtxt returns me a 2d array (list of lists),

Well, not exactly.
When you use dtype=None, genfromtxt tries to guess the type of each column.
Because in this case all your values can be safely cast to integers, genfromtxt considers the dtype uniform and outputs a 2D array (an ndarray, not a list of lists).
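
A minimal sketch of the uniform case, for illustration only. It assumes Python 3 and NumPy 1.14 or later (for the `encoding` argument), uses `io.StringIO` in place of your `SIO.StringIO`, and passes `autostrip=True` with a plain "," delimiter, which is my choice rather than what your script did:

```python
import io

import numpy as np

# Every field can be read as an integer, so the guessed dtype is uniform.
text = "1, 2, 3\n4, 5, 6\n7, 8, 9"
g = np.genfromtxt(
    io.StringIO(text),
    delimiter=",",
    autostrip=True,    # drop the blank after each comma
    dtype=None,        # let genfromtxt guess the column types
    encoding="utf-8",  # assumption: NumPy >= 1.14
)

print(g.shape)   # two dimensions: rows and columns
print(g[:, 0])   # ordinary 2D slicing works here
```

Since there is a single integer dtype for the whole table, the result has a real second axis to slice along.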



> in the second it returns a list of tuples with an associated dtype
> list of tuples.

Well, in this case, the first column is detected as a string column, while the other columns are integers (as the dtype shows you).
Therefore, genfromtxt considers the dtype to be structured, and the output is a 1D array: each line is a record of 3 elements, the first a '|S1' and the 2nd and 3rd ints. That is why g[:,0] fails: a 1D array accepts only one index.
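
A rough sketch of the mixed case, under the same assumptions as before (Python 3, NumPy 1.14 or later, `autostrip=True` and `encoding="utf-8"` added by me). With a text encoding the string column comes back as unicode rather than the '|S1' bytes shown in your output:

```python
import io

import numpy as np

# The first column is text, so no single scalar dtype fits every column.
text = "a, 2, 3\nb, 5, 6\nc, 8, 9"
g = np.genfromtxt(
    io.StringIO(text),
    delimiter=",",
    autostrip=True,
    dtype=None,
    encoding="utf-8",  # assumption: NumPy >= 1.14
)

print(g.shape)        # one dimension: one record per input line
print(g.dtype.names)  # default field names generated by genfromtxt

# Indexing with two indices fails because there is only one axis.
error = None
try:
    g[:, 0]
except IndexError as exc:
    error = exc
print(type(error).__name__)
```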

> How can I do my nice 2d slicing on the latter?
> 
> array([('a', 2, 3), ('b', 5, 6), ('c', 8, 9)],
>      dtype=[('f0', '|S1'), ('f1', '<i4'), ('f2', '<i4')])

Select a column by its name:
yourarray['f0']
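
A small illustration of field access, assuming Python 3. The array is built by hand to mirror the one in your message, with 'U1' standing in for '|S1'. The `np.column_stack` step is only one possible way to get a plain 2D array back from the numeric fields, not something genfromtxt does for you:

```python
import numpy as np

# Hand-built copy of the structured array from the message above.
g = np.array(
    [("a", 2, 3), ("b", 5, 6), ("c", 8, 9)],
    dtype=[("f0", "U1"), ("f1", "<i4"), ("f2", "<i4")],
)

# A field name selects a whole column as an ordinary 1D array.
first_col = g["f0"]
print(first_col)

# Rows are still indexed by position.
print(g[0])

# Stack the integer fields side by side to get a regular 2D array.
numeric = np.column_stack([g["f1"], g["f2"]])
print(numeric[:, 0])
```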

More info:
http://docs.scipy.org/doc/numpy/user/basics.rec.html
http://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html






