[SciPy-User] Suggestion for numpy.genfromtxt documentation
Wed Oct 7 14:20:18 CDT 2009
On 10/07/2009 10:52 AM, Skipper Seabold wrote:
> On Wed, Oct 7, 2009 at 11:25 AM, Dharhas Pothina
> <Dharhas.Pothina@twdb.state.tx.us> wrote:
>> It took me a while and a lot of trial and error to work out why this didn't work as expected.
>> data = np.genfromtxt(fname,usecols=(2,3,4),names='x,y,z')
>> This command runs without any warnings or errors, but returns a numpy array with no field names. If you use:
>> data = np.genfromtxt(fname,usecols=(2,3,4),dtype=None,names='x,y,z')
>> then the command does what I expect and returns a structured numpy array with field names. So essentially, the 'names' argument does not work unless you also specify the 'dtype' argument.
What did you actually expect?
It would be very informative if you could provide a simple example of
this for testing.
There are many combinations of arguments so not all have been tested and
it is not always clear what the expected behavior should be.
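For instance, a self-contained test case could look something like the
sketch below. The in-memory sample data and its column layout are invented
for illustration; only the two genfromtxt calls come from the report above.
What the first call returns depends on the NumPy version in use, so the
sketch only prints it.

```python
import io

import numpy as np

# Made-up sample: five whitespace-separated columns, of which only
# columns 2, 3 and 4 are read (matching usecols in the report).
SAMPLE = "1 10 0.5 1.5 2.5\n2 20 3.5 4.5 5.5\n"

# Call reported as returning an array without field names.
# The result is version dependent, so it is only printed, not checked.
no_dtype = np.genfromtxt(io.StringIO(SAMPLE), usecols=(2, 3, 4),
                         names='x,y,z')
print(no_dtype.dtype)

# Call reported as working: with dtype=None the column types are
# inferred from the data and the given names become the field names.
with_dtype = np.genfromtxt(io.StringIO(SAMPLE), usecols=(2, 3, 4),
                           dtype=None, names='x,y,z')
print(with_dtype.dtype.names)
```

Comparing the two printed dtypes on a given NumPy version shows directly
whether 'names' is honored without 'dtype'.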
>> I think it would be less confusing to new users either to have this explicitly mentioned in the documentation string for the genfromtxt 'names' argument or to have the function default to 'dtype=None' if the 'names' argument is specified without the 'dtype' argument.
>> - dharhas
> I came across this behavior recently and agree with you. There is a
> patch in the works for this.
> See this thread: http://thread.gmane.org/gmane.comp.python.numeric.general/33479
> And this ticket: http://projects.scipy.org/numpy/ticket/1252
From the numpy help, there is this example:
data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
It does not help that the dtype of structured arrays also includes the
actual name. So I do not think we can use the dtype argument without
considering the combination of dtype and names. Perhaps dtype could be
split into names and formats, so that dtype=('name', 'format').
In some sense you are suggesting that we should have something like:
Ignore the use of None and True for dtype and names arguments:
i) If only dtype is specified, then use the specified dtype and add
default names such as col1, col2,... if necessary
ii) If only names is specified, then construct the dtype as ('name',
'default format')
iii) If only formats is specified, then construct the dtype as ('default
name', 'format')
iv) If only names and formats are specified, then construct the
dtype as ('name', 'format')
v) If none of dtype, names and formats is specified, then construct the
dtype as ('default name', 'default format')
vi) If dtype and names or formats are specified, then use dtype if it is
of the form ('name', 'format'), or otherwise fall back to one of the
previous cases.
When dtype is None this implies format is None so the format is obtained
from the data. If names is not True then the names are either from the
argument or default values.
If the names argument is True then the names should be read from the data
and one of the previous cases applies.
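
To make cases i) to vi) concrete, here is a rough pure-Python sketch of the
resolution logic. It is only an illustration of the proposal, not how
genfromtxt is implemented: the function name resolve_fields, the ncols
argument and the 'f8' default format are all hypothetical, and the handling
of None (formats inferred from the data) and names=True is left out. Where
case vi) is ambiguous, the sketch assumes that an explicit formats argument
takes precedence over a dtype that carries no names.

```python
def resolve_fields(ncols, dtype=None, names=None, formats=None,
                   default_format='f8'):
    """Return a list of (name, format) pairs, one per column.

    Hypothetical helper: dtype is either a list of (name, format) pairs,
    a list of formats, or a single format string.
    """
    # Default names follow the col1, col2, ... pattern from case i).
    default_names = ['col%d' % (i + 1) for i in range(ncols)]

    # Case vi), first half: a dtype made of (name, format) pairs wins.
    if (isinstance(dtype, list) and dtype
            and all(isinstance(d, tuple) and len(d) == 2 for d in dtype)):
        return list(dtype)

    # Case i) / case vi), second half: a dtype without names only
    # supplies formats (assumption: explicit formats take precedence).
    if formats is None and dtype is not None:
        if isinstance(dtype, str):
            formats = [dtype] * ncols
        else:
            formats = list(dtype)

    # Cases ii) to v): fill in whichever half is missing with defaults.
    if names is None:
        names = default_names
    elif isinstance(names, str):
        names = [n.strip() for n in names.split(',')]
    if formats is None:
        formats = [default_format] * ncols

    return list(zip(names, formats))


# Case ii): only names given.
print(resolve_fields(3, names='x,y,z'))
# Case iii): only formats given.
print(resolve_fields(2, formats=['i8', 'f8']))
```

The resulting list of pairs is in the same ('name', 'format') form that a
structured dtype specification uses, so every case ends in one code path.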