[SciPy-dev] chararray docstrings

Ralf Gommers ralf.gommers@googlemail....
Mon Oct 12 12:09:18 CDT 2009

On Mon, Oct 12, 2009 at 5:40 PM, Michael Droettboom <mdroe@stsci.edu> wrote:

> I was able to make my big chararray commit today.  If I understand
> correctly, I need to wait 24 hours for the doc editor to sync with SVN,
> and then I should mark all the chararray-related docstrings as "needs
> review".

Great! Thanks for all the work.

> The primary change to the docstrings is that all of the methods of the
> chararray class are now free functions.  These free functions represent
> the "primary" entry points, and thus have detailed documentation, and
> the chararray methods now have short "pointer" docstrings to the free
> functions.
> Where the docstring content itself has been updated, it is mainly to
> bring them closer to the Python standard library descriptions of these
> functions, which in most cases were more precise (since we are, in fact,
> calling the stdlib function under the hood) and concise (because the
> stdlib docs have been through a number of revisions and really get it
> right by now).

All sounds very sensible.
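
For anyone following along, here is a rough sketch of what calling the free
functions looks like. It assumes a current NumPy in which they are exposed
under numpy.char; the array contents are made up for illustration and are not
taken from the commit.

```python
import numpy as np

# Illustrative data only (not from the commit under discussion).
names = np.array(["fits", "header"])

# Free functions apply the matching Python str method to every element.
upper = np.char.upper(names)
caps = np.char.capitalize(names)

# Element-wise results agree with calling the str methods directly.
expected_upper = [s.upper() for s in names.tolist()]
expected_caps = [s.capitalize() for s in names.tolist()]

print(upper.tolist())
print(caps.tolist())
```

The free functions take any array of string dtype, so the caller does not
need to construct a chararray first.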

> I do have a concern about one phrase that was used in a number of places
> that probably deserves some discussion:
> "The chararray module exists for backwards compatibility with Numarray,
> it is not recommended for new development. If one needs arrays of
> strings, use arrays of dtype
> <http://docs.scipy.org/numpy/docs/numpy.dtype/#dtype> object."
> There are many use cases (such as handling a binary structured format
> like FITS) where a dtype of 'string_' is more appropriate than a dtype
> of 'object_', and we shouldn't imply that all uses of chararray should
> now use object arrays.  Additionally, fast vectorized string operations
> will perform best on arrays of type 'string_' and 'unicode_'.  While
> 'object_' will work, it requires casting all objects to strings along
> the way, and could fail thousands of items into an operation.  It's a
> "best tool for the job" judgment call, not a "one tool fits all".

I recently added that note (and mentioned that on the list) in three places
based on the email discussions some weeks ago. Now that a lot of issues are
solved and we are not considering deprecation anymore, this note can be
removed again.
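
To make the binary-format case concrete, a minimal sketch follows. The 10-byte
record and its two 5-byte fields are hypothetical (this is not real FITS
parsing), and the dtype code "S5" is used as the modern spelling of a
fixed-width 'string_' field.

```python
import numpy as np

# Hypothetical record: two space-padded 5-byte text fields, back to back.
raw = b"abc  de   "

# A fixed-width bytes dtype maps directly onto the raw buffer, no copying
# or per-item Python objects involved.
fields = np.frombuffer(raw, dtype="S5")

# Vectorized string operation on the fixed-width array.
stripped = np.char.rstrip(fields)

# The same data as an object array holds references to Python bytes objects
# and no longer matches the on-disk layout.
as_objects = fields.astype(object)

print(fields.tolist())
print(stripped.tolist())
print(as_objects.dtype)
```

The fixed-width array keeps the 5-byte layout of the buffer, which is the
property that matters when reading or writing a structured binary format.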

> Perhaps the above should read:
> "If one needs arrays of strings, use arrays of dtype
> <http://docs.scipy.org/numpy/docs/numpy.dtype/#dtype> string_ or
> unicode_.  If one needs arrays of variable-length strings, use arrays of
> dtype object_."

Sounds good.
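
A small sketch of the distinction the proposed wording draws, assuming the
dtype code "U2" as the modern spelling of a fixed-width 'unicode_' array; the
values are made up for illustration.

```python
import numpy as np

# Fixed-width array: every element has room for exactly 2 characters.
fixed = np.array(["ab", "cd"], dtype="U2")
fixed[0] = "abcdef"  # silently truncated to fit the 2-character width

# Object array: each element is an ordinary Python str of any length.
variable = np.array(["ab", "cd"], dtype=object)
variable[0] = "abcdef"  # stored in full

print(fixed[0])
print(variable[0])
```

The silent truncation in the fixed-width case is the reason variable-length
strings are better served by dtype object.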

It would still be nice to have a slightly longer (tutorial-style) discussion
somewhere in the docs. Or otherwise maybe a link to FITS docs that are
illustrative? I still have a hard time understanding why operating on
astronomical images with Python string methods is a good idea.


> Mike
> --
> Michael Droettboom
> Science Software Branch
> Operations and Engineering Division
> Space Telescope Science Institute
> Operated by AURA for NASA