[Numpy-discussion] ragged array implementation
Christopher Barker
Chris.Barker@noaa....
Thu Mar 10 11:05:11 CST 2011
On 3/7/11 5:51 PM, Sturla Molden wrote:
> Den 07.03.2011 18:28, skrev Christopher Barker:
>> 1, 2, 3, 4
>> 5, 6
>> 7, 8, 9, 10, 11, 12
>> 13, 14, 15
>> ...
>>
> A ragged array, as implemented in C++, Java or C# is just an array of
> arrays (or 'a pointer to an array of pointers').
Sure, but as a rule I don't find direct translation of C++ or Java code
to Python the best approach ;-)
> Basically, that is an
> ndarray of ndarrays (or a list of ndarrays, whatever you prefer).
>
> >>> ra = np.zeros(4, dtype=np.ndarray)
> >>> ra[0] = np.array([1,2,3,4])
> >>> ra[1] = np.array([5,6])
> >>> ra[2] = np.array([7,8,9,10,11,12])
> >>> ra[3] = np.array([13,14,15])
> >>> ra
> array([[1 2 3 4], [5 6], [ 7 8 9 10 11 12], [13 14 15]], dtype=object)
> >>> ra[1][1]
> 6
> >>> ra[2][:]
> array([ 7, 8, 9, 10, 11, 12])
yup -- or I could use a list to store the rows, which would add the
ability to append rows.
> Slicing in two dimensions does not work as some might expect:
>
> >>> ra[:2][:2]
> array([[1 2 3 4], [5 6]], dtype=object)
yup -- might want to overload indexing to do something smarter about
that, though in my use case, slicing "vertically" isn't really useful
anyway -- the nth element in one row doesn't necessarily have anything
to do with the nth element in another row.
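To illustrate the slicing point, here is a minimal sketch (the names `rows`, `first_two_rows` and `heads` are made up for the example). The object array is 1-d, so a second `[:2]` just slices the row axis again; slicing inside each row has to be spelled out row by row:

```python
import numpy as np

rows = [np.array([1, 2, 3, 4]),
        np.array([5, 6]),
        np.array([7, 8, 9, 10, 11, 12]),
        np.array([13, 14, 15])]

# Fill a 1-d object array one row at a time, as in the quoted example
ra = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    ra[i] = row

# ra[:2] is itself a 1-d object array, so the second [:2] slices the same axis
first_two_rows = ra[:2][:2]

# Taking the first two elements of each row needs an explicit loop over rows
heads = [row[:2] for row in ra[:2]]
```

Overloading `__getitem__` on a wrapper class could hide that loop, but it would still be a Python-level loop over the rows.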
However, aside from the slicing syntax issue, what I lose with this
approach is the ability to get reasonable performance on operations on
the entire array:
ra *= 3.3
I'd like that to be numpy-efficient.
What I need to grapple with is:
1) Is there a point to trying to build a general purpose ragged array?
Or should I just build something that satisfies my use-case at hand?
2) What's the balance I need between performance and flexibility?
Putting the rows in a list gives a lot more flexibility; putting it all
in one 1-d numpy array could give better performance.
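As a rough sketch of the "all in one 1-d array" option, assuming the rows are stored back to back with a second array of row-start offsets (the class name `RaggedSketch` and its attributes are hypothetical, not an existing numpy API):

```python
import numpy as np

class RaggedSketch:
    """Hypothetical ragged array: one flat 1-d data array plus row offsets."""

    def __init__(self, rows, dtype=float):
        lengths = [len(r) for r in rows]
        # offsets[i] is where row i starts; offsets[-1] is the total length
        self.offsets = np.concatenate(([0], np.cumsum(lengths))).astype(np.intp)
        self.data = np.concatenate([np.asarray(r, dtype=dtype) for r in rows])

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError("row index out of range")
        # A view into the flat data, so changes to a row show up in self.data
        return self.data[self.offsets[i]:self.offsets[i + 1]]

    def __imul__(self, other):
        # One vectorized operation over every element, regardless of row layout
        self.data *= other
        return self

ra = RaggedSketch([[1, 2, 3, 4], [5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15]])
ra *= 3.3
second_row = ra[1]
```

Whole-array operations then run at numpy speed, but appending a row means reallocating both arrays, which is the flexibility cost mentioned above.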
NOTE: this looks like it could use a "growable" numpy array, much like
one I've written before -- maybe it's time to revive that project and
use it here, fixing some performance issues while I'm at it.
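For readers unfamiliar with the idea, a "growable" array can be sketched roughly as below. This is only an illustration under the assumption of a doubling over-allocation strategy; the class `GrowableSketch` and its methods are hypothetical and are not the earlier project referred to above:

```python
import numpy as np

class GrowableSketch:
    """Hypothetical growable 1-d array: over-allocates and doubles when full."""

    def __init__(self, dtype=float, capacity=16):
        self._buffer = np.empty(capacity, dtype=dtype)
        self._size = 0  # number of elements actually in use

    def extend(self, values):
        values = np.asarray(values, dtype=self._buffer.dtype)
        needed = self._size + len(values)
        if needed > len(self._buffer):
            # Grow to at least double the capacity so repeated appends stay cheap
            new_capacity = max(needed, 2 * len(self._buffer))
            new_buffer = np.empty(new_capacity, dtype=self._buffer.dtype)
            new_buffer[:self._size] = self._buffer[:self._size]
            self._buffer = new_buffer
        self._buffer[self._size:needed] = values
        self._size = needed

    @property
    def array(self):
        # View of the filled part; it goes stale if a later extend() reallocates
        return self._buffer[:self._size]

g = GrowableSketch(dtype=int, capacity=2)
g.extend([1, 2, 3, 4])
g.extend([5, 6])
```

Used as the flat data store of a ragged array, something like this would let rows be appended without copying everything each time.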
Thanks for all your ideas,
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov