[Numpy-discussion] Simplifying array()

Tim Hochberg tim.hochberg at cox.net
Thu Jan 13 09:01:15 CST 2005

Colin J. Williams wrote:

> Todd Miller wrote:
>> Someone (way to go Rory!) recently posted a patch (woohoo!) for
>> numarray which I think bears a little discussion since it involves
>> the re-write of a fundamental numarray function: array().
>> The patch fixes a number of bugs and deconvolutes the logic of array().
>> The patch is here if you want to look at it yourself:
>> http://sourceforge.net/tracker/?atid=450449&group_id=1369&func=browse
>> One item I thought needed some discussion was the removal of two
>> features:
>>>  * array() does too much. E.g., handling file/memory instances for
>>>    'sequence'. There's fromfile for the former, and users needing
>>>    the latter functionality should be clued up enough to
>>>    instantiate NumArray directly.
>> I agree with this myself.  Does anyone care if they will no longer be
>> able to construct an array from a file or buffer object using array()
>> rather than fromfile() or NumArray(), respectively?  Is a deprecation
>> process necessary to remove them?  
This isn't going to cause me pain, FWIW.

> I would suggest deprecation on the way to removal.  For the newcomer,
> who is not yet "clued up", some advice on the instantiation of
> NumArray would help.  Currently, neither the word "class" nor
> "NumArray" appears in the doc index.
> Rory leaves in type and typecode.  It would be good to eliminate this
> apparent overlap.  Why not deprecate and then drop type?  As a
> compromise, either could be accepted as a NumArray.__init__
> argument, since it is easy to distinguish between them.

I thought typecode was eventually going away, not type. Either way, it 
makes sense to drop one of them eventually. This should definitely go 
through a period of deprecation, though: it will certainly require that 
I fix a bunch of my code.

> It would be good to clarify the acceptable content of a sequence.  A
> list, perhaps with sublists, of numbers is clear enough but what about
> a sequence of NumArray instances or even a sequence of numbers, mixed
> with NumArray instances?

Isn't any sequence that is composed of numbers or subsequences 
acceptable, as long as it has a consistent shape (no ragged edges)?
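
A rough sketch of the "consistent shape" rule, in plain Python. This is 
not numarray's code: infer_shape is a hypothetical helper made up for 
illustration, and it only handles plain numbers and nested sequences 
(not NumArray instances).

```python
def infer_shape(obj):
    """Return the shape of a nested sequence of numbers.

    Hypothetical helper, not part of numarray. Raises ValueError when
    the subsequences do not all share one shape (a "ragged" sequence).
    """
    if isinstance(obj, (int, float, complex)):
        return ()  # a bare number is rank 0
    if isinstance(obj, (str, bytes)):
        # Strings iterate into more strings, so reject them up front.
        raise TypeError("strings are not treated as numeric sequences")
    try:
        items = list(obj)
    except TypeError:
        raise TypeError("not a number or a sequence: %r" % (obj,))
    if not items:
        return (0,)
    shapes = [infer_shape(item) for item in items]
    if any(shape != shapes[0] for shape in shapes[1:]):
        raise ValueError("ragged sequence, sub-shapes: %r" % (shapes,))
    return (len(items),) + shapes[0]
```

With this, infer_shape([[1, 2], [3, 4]]) gives (2, 2), while 
infer_shape([[1, 2], [3]]) raises ValueError.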

> Is the function asarray redundant?

No, the copy=False parameter is redundant ;) Well, as a pair they are 
redundant, but if I were going to get rid of something, I'd get rid of 
copy, because it's lying: copy=False sometimes copies (when the 
sequence is not an array) and sometimes does not (when the sequence is 
an array). A better name would be alwaysCopy, but better still would be 
to just get rid of it altogether and rely on asarray. (asarray may be 
implemented using the copy parameter now, but that would be easy to fix.)
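
To make the complaint about copy concrete, here is a toy model in plain 
Python. ToyArray and these array/asarray functions are stand-ins invented 
for illustration; they only mimic the behaviour described above and are 
not numarray's implementation.

```python
class ToyArray:
    """Stand-in for an array type; it just owns a private list."""

    def __init__(self, data):
        self.data = list(data)  # always takes its own copy of the data


def array(sequence, copy=True):
    """Toy version of array() with the copy parameter under discussion."""
    if isinstance(sequence, ToyArray):
        # Only in this branch does copy=False actually avoid a copy.
        return ToyArray(sequence.data) if copy else sequence
    # Any other sequence has to be converted, so new storage is built
    # no matter what copy says.
    return ToyArray(sequence)


def asarray(sequence):
    """Toy asarray(): return the input itself when it is already an array."""
    return array(sequence, copy=False)


a = ToyArray([1, 2, 3])
same = array(a, copy=False)          # no copy: the same object comes back
fresh = array([1, 2, 3], copy=False) # still a copy, despite copy=False
```

In this model, same is a, whereas fresh never shares storage with the 
list it was built from. That is the sense in which copy=False really 
means "copy only if necessary".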

While we're at it, savespace should get nuked too (all with appropriate 
deprecations I suppose), so the final signature of array would be:

array(sequence=None, type=None, shape=None)

Hmm. That's still too complicated. It really should be

array(sequence, type=None)

I believe that other uses can be more clearly accomplished using zeros 
and reshape.

Of course that has drastic backward compatibility issues and even with 
generous usage of deprecations might not help the transition much. 
Still, that's probably what I'd shoot for if it were an option.

> I suggest that the copy parameter be of the BoolType.  This probably
> has no practical impact but it is consistent with current Python usage
> and makes it clear that this is a Yes/No parameter, rather than
> specifying a number of copies.
>> I think strings.py and records.py also have "over-stuffed" array()
>> functions...  so consistency bids us to streamline those as well. 
>> Regards,
>> Todd
> Thanks to Rory for initiating this.


