[SciPy-Dev] np.savetxt: apply patch in enhancement ticket 1079 to add headers?
Thu Jun 3 10:06:40 CDT 2010
On Wed, Jun 2, 2010 at 1:14 PM, Stefan <email@example.com> wrote:
>> Not that I am complaining; rather, I am trying to understand what is
>> expected to happen.
>> Under the patch, it is very much user beware. The header argument can
>> be anything or nothing. There is no check for the contents or if the
>> delimiter used is the same as the rest of the output. Further with the
>> newline option there is no guarantee that the lines in the header will
>> have the same line endings throughout the file.
>> So what should a user be allowed to use as a header?
>> You could write a whole program there or an explanation of the
>> following output - which is very appealing. You could force a list of
>> strings so that you print out newline.join(header) - okay not quite
>> because it should include the comment argument.
>> Should savetxt be restricted to something that loadtxt can read?
>> This is potentially problematic if you want a header line. Although it
>> could return the number of header lines.
>> [savetxt should also be updated to allow bz2 as loadtxt handles those
>> now - not that I have used it]
>> Also note that since that patch was written, savetxt takes a user
>> supplied newline keyword, so you can just append that to the header
>> True, we were not aware of this, but this does not help much for the
>> Entered ~3 months ago: http://projects.scipy.org/numpy/changeset/8180
>> Should this be forced to check for valid options for new lines?
>> Otherwise, from 'np.savetxt('junk.text', [1,2,3,4,5],
>> newline='what')' you get:
>> Which is not going to be read back by loadtxt.
>> As numpy.loadtxt has a default comment character ('#'), the same may be
>> implemented for numpy.savetxt. In this case, numpy.savetxt would get two
>> additional keywords (e.g. header, comment(character)), which bloats the
>> interface, but potentially provides more safety.
>> FWIW, I ended up rolling my own, based on the most recent pre-Python 3
>> changes for savetxt. It accepts a list of names instead of one string,
>> or, if the provided array has the attribute dtype.names (non-nested rec
>> or structured arrays), it uses those. Whatever is done, I think the
>> support for structured arrays is nice, and I think having this
>> functionality is a no-brainer. I need it quite often.
>> Although we have not been using record arrays too often, we see their
>> advantages and agree that it should be possible to use them as you described.
>> We also thought about a solution using the __str__ method of the 'header
>> object'. In this vein, an arbitrary header class (including a plain string)
>> providing a __str__ method may be handed to numpy.savetxt,
>> which can use it to write the header.
> So let us briefly summarize what's on the table. It appears to us that
> there are basically three open issues:
> (1) a csv like header for savetxt written files (first line contains column
> (2) comments (introduced by comment character e.g. '#') at the beginning
> of the file (preceding the data)
> (3) the role of the 'newline' option
> As was noted, the patch (ticket 1079) makes it possible to write both a csv like
> header (1) and comment line(s) introduced by a comment character (e.g. '#') (2).
> Nonetheless, this solution is quite unsatisfactory
> in our opinion, because it may be error prone,
> as the user is in charge of the entire formatting. Despite this, we think
> that it should be up to the user what amount of information is to be put
> at the top of the file, but the format should be checked as far as possible.
> Using either a string or a list/tuple of strings, as proposed by Bruce,
> seems to be a reasonable possibility to implement the desired functionality.
> Maybe two individual keywords ('header' and 'comment') should exist to
> distinguish whether the user requests case (1) or (2). As for loadtxt,
> the default comment character should be '#', but it may be changed by the
> We think that savetxt should not be restricted to output that can be read
> by loadtxt. It should, however, be possible to add comments to the output
> file so that it remains readable by loadtxt (without tweaking it,
> e.g. with the skiprows keyword).
Thanks. This does clear up my confusion and I think having both a
header and a comments keyword makes sense. For the form, as I said, I
went with a list of strings, as I encounter this more often than one
string, but in the end it's all the same to me.
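To make the list-of-strings form concrete, here is a minimal sketch of how the
header text could be built. The name format_header and its keyword names are
hypothetical (they are not part of the patch in ticket 1079), and the default
comment prefix '# ' is an assumption chosen to match loadtxt's default comment
character.

```python
def format_header(header, comments="# ", newline="\n"):
    """Return header text with every line prefixed by the comment string.

    Hypothetical helper, not part of numpy. `header` may be one string or a
    list/tuple of strings; every resulting line ends with `newline`, so the
    header uses the same line ending as the data that follows it.
    """
    if isinstance(header, str):
        lines = header.splitlines()
    else:
        lines = []
        for entry in header:
            # An entry may itself contain line breaks; keep empty entries
            # as empty comment lines instead of dropping them.
            lines.extend(str(entry).splitlines() or [""])
    return "".join(comments + line + newline for line in lines)


print(format_header(["x y z", "units: m m m"]))
```

Writing the returned string to an open file handle and then passing the same
handle to np.savetxt should give a file whose header lines are skipped by
loadtxt with its default comment character.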
Glad this is getting some attention.
> We agree that the newline keyword may cause inconsistencies in the file
> (if ticket 1079 were applied),
> and possibly strange behavior such as when newline='what' is specified.
> Yet, this question does not only concern the header/comments.
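If the newline keyword were to be checked, a rough sketch could look like the
following. The set of accepted line endings and the names check_newline and
join_rows are assumptions made for illustration; join_rows only imitates how
each formatted row is followed by the newline string and is not the savetxt
code itself.

```python
# Assumption: only these line endings count as valid. Which values savetxt
# should accept is exactly the open question.
VALID_NEWLINES = ("\n", "\r\n", "\r")


def check_newline(newline):
    """Return `newline` unchanged, or raise ValueError if it is not a line ending."""
    if newline not in VALID_NEWLINES:
        raise ValueError(
            "newline must be one of %r, got %r" % (VALID_NEWLINES, newline)
        )
    return newline


def join_rows(rows, newline="\n"):
    """Imitate row output: each formatted value is followed by `newline`."""
    return "".join("%d" % value + newline for value in rows)


# Without a check, an arbitrary string ends up between the values.
print(join_rows([1, 2, 3, 4, 5], newline="what"))
```

With such a check in place, newline='what' would be rejected before anything
is written, and the same check could be applied to the header lines.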
> Stefan & Christian