kxroberto at googlemail.com
Wed Nov 22 14:28:20 CST 2006
Robert Kern wrote:
> Tim Hochberg wrote:
>> Robert Kern wrote:
>>> One possibility is to check if the object is an ndarray (or subclass) and use
>>> .copy() if so; otherwise, use the current implementation and hope that you
>>> didn't pass it a Numeric or numarray array (or some other view-based object).
>> I think I would invert this test and instead check if the object is a
>> Python list and *not* copy in that case. Otherwise, use copy.copy to
>> copy the object whatever it is. This looks like it would be more robust
>> in that it would work in all sensible cases, and just be a tad slower in
>> some of them.
> I don't want to assume that the only two sequence types are lists and arrays.
> The problem with using copy.copy() on non-arrays is that it, well, makes copies
> of the elements. The objects in the shuffled sequence are not the same objects
> before and after the shuffling. I consider that to be a violation of the spec.
> Views are rare outside of numpy/Numeric/numarray, partially because Guido
> considers them to be evil. I'm beginning to see why.
>> Another possible refinement / complication would be to special case 1D
>> arrays so that they run fastish.
>> A third possibility involves rewriting this in this form:
>> indices = arange(len(x))
>> _shuffle_core(indices) # This just does what current shuffle now does
>> x[:] = take(x, indices, 0)
> That's problematic since the elements all turn into numpy scalar objects:
> In : from numpy import *
> In : a = range(9,-1,-1)
> In : idx = arange(len(a))
> In : a[:] = take(a, idx, 0)
> In : a
> Out: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
> In : type(a)
> Out: <type 'numpy.int32'>
What about a[:] = take(asarray(a, object), idx, 0)? That also works correctly with ndarrays, though I haven't dug into why... all elements will probably be re-cast twice.
I think the take method on shuffled indices is basically the right and natural approach for a numpy shuffler.
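For illustration, a minimal sketch of that take-based idea (the name shuffle_via_take is mine, not numpy's API; the point is that casting through dtype=object before take() lets the original Python objects come back out rather than numpy scalars):

```python
import numpy as np

def shuffle_via_take(x, rng=None):
    """Shuffle a mutable sequence in place by permuting an index array.

    Only the indices are shuffled; the elements are then gathered with
    take() on an object-dtype view of the data, so the very same Python
    objects end up back in x, just reordered.
    """
    rng = np.random.RandomState() if rng is None else rng
    idx = np.arange(len(x))
    rng.shuffle(idx)                                  # permute indices only
    x[:] = np.take(np.asarray(x, dtype=object), idx, 0)

a = list(range(10))
shuffle_via_take(a)
# a now holds the same ten Python ints, in a new order
```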
The example is possibly just another vote against the default behavior of letting numpy scalar types out of arrays that were set up with a "harmless" type.
That
array([ 1., 2., 3.])
hands out numpy.float64 elements is just ill, I think.
In (Guido's) Python, objects should come out of collections as "typy" as they went in. Currently numpy scalars "infect" the whole app almost like a virus (and kill performance, pickles, etc.).
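A small demonstration of the leakage being complained about (current numpy behavior; the exact scalar type is platform dependent, hence the hedged check):

```python
import numpy as np

a = np.array([1, 2, 3])      # built from plain Python ints
elem = a[0]

# What comes out is a numpy scalar, not the Python int that went in:
print(type(elem))            # e.g. <class 'numpy.int64'>, platform dependent
assert isinstance(elem, np.integer) and type(elem) is not int

# Getting the plain Python type back requires an explicit conversion:
assert type(elem.item()) is int
```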
Of course views are essential for an efficient array type, but type alteration possibly is not.
For the rare cases of generalized algorithms (I have to think hard to find even an example) where the array interface is needed on the elements (and an array(obj) cast is too uncomfortable), there could still be a different possibility:
then it is natural that numpy.float64, numpy.int32, ... come out, as the programmer would expect it so.
Thus maybe for array types:
* float != numpy.float64 (but maybe a common base class, or 'float' itself)
* int != numpy.intXX
* complex != numpy.complex128
* default array type is (python.)float
* default array type from a list of ints is (python.)int
* default array type from a list of complex is (python.)complex
* default array type of other lists is always <object>
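For contrast, a quick check of the current default dtype selection (this only shows today's behavior, it does NOT implement the proposal above; the default integer width is platform dependent, hence the hedged assertion):

```python
import numpy as np

assert np.array([1.0, 2.0]).dtype == np.float64          # floats -> float64
assert np.issubdtype(np.array([1, 2]).dtype, np.integer)  # ints -> intXX, platform dependent
assert np.array([1 + 2j]).dtype == np.complex128          # complex -> complex128
assert np.array([{"a": 1}, None]).dtype == object         # mixed non-numeric -> object
```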
Currently this is also problematic:

>>> array([1,2,3,''])
array(['1', '2', '3', ''],
      dtype='|S1')
>>> array([1,2,"3ef",'wefwfewoiwjefo iwjef'])
array(['1', '2', '3ef', 'wefwfewoiwjefo iwjef'],
      dtype='|S20')
>>> _[0] = 'woeifjwo woie pwioef wliuefh lwieufh wleifuh welfiu '
>>> _
array(['woeifjwo woie pwioef', '2', '3ef', 'wefwfewoiwjefo iwjef'],
      dtype='|S20')

This silent truncation is rarely what a Pythoneer would expect. I guess fixed-width string arrays should only be created explicitly.
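One explicit escape hatch that exists today: ask for dtype=object, and the elements stay whatever Python objects they went in as, with no fixed-width field and no truncation (a sketch of current behavior, not a proposal):

```python
import numpy as np

# Explicit object dtype: no coercion to a fixed-width string type.
a = np.array([1, 2, "3ef", "wefwfewoiwjefo iwjef"], dtype=object)

long_s = "a string much longer than the twenty-character field above"
a[0] = long_s

assert a[0] == long_s        # no silent truncation
assert type(a[1]) is int     # elements kept their Python types
assert type(a[2]) is str
```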