[Numpy-discussion] fast access and normalizing of ndarray slices

eat e.antero.tammi@gmail....
Mon Jun 4 09:13:43 CDT 2012


Hi,

On Mon, Jun 4, 2012 at 12:44 AM, srean <srean.list@gmail.com> wrote:

> Hi Wolfgang,
>
>  I think you are looking for reduceat( ), in particular add.reduceat()
>

Indeed, the OP could use add.reduceat(...), like this:

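# tst.py
import numpy as np


def reduce(data, lengths):
    # Interleave the start and end offset of every segment:
    # ind = [start0, end0, start1, end1, ...]
    ind, ends = np.r_[lengths, lengths], lengths.cumsum()
    ind[::2], ind[1::2] = ends - lengths, ends
    # Pad with a 0 so the last end offset (== data.size) is a valid index;
    # the even slots of the result hold the per-segment sums.
    return np.add.reduceat(np.r_[data, 0], ind)[::2]


def normalize(data, lengths):
    # Divide every element by the sum of its own segment.
    return data / np.repeat(reduce(data, lengths), lengths)


def gen(par):
    # par = (low, high, number of segments) for the random segment lengths.
    lengths = np.random.randint(*par)
    return np.random.randn(lengths.sum()), lengths


if __name__ == '__main__':
    data = np.array([1, 2, 1, 2, 3, 4, 1, 2, 3], dtype=float)
    lengths = np.array([2, 4, 3])
    print(reduce(data, lengths))
    print(normalize(data, lengths).round(2))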


Resulting:
In []: %run tst
[  3.  10.   6.]
[ 0.33  0.67  0.1   0.2   0.3   0.4   0.17  0.33  0.5 ]
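
For what it's worth, when the segments are contiguous and none of them is empty, the interleaved start/end indices are not strictly needed: np.add.reduceat sums from each index up to the next one, and from the last index to the end of the array. A minimal sketch under those assumptions (segment_sums and normalize_segments are hypothetical names, not part of tst.py):

```python
import numpy as np


def segment_sums(data, lengths):
    # Start offset of each segment: [0, l0, l0 + l1, ...]
    starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))
    # Assumption: every length is > 0. For an empty segment reduceat
    # would not return 0, so this shortcut would give wrong sums.
    return np.add.reduceat(data, starts)


def normalize_segments(data, lengths):
    # Divide every element by the sum of its own segment.
    return data / np.repeat(segment_sums(data, lengths), lengths)


data = np.array([1, 2, 1, 2, 3, 4, 1, 2, 3], dtype=float)
lengths = np.array([2, 4, 3])
sums = segment_sums(data, lengths)
normalized = normalize_segments(data, lengths)
print(sums)
print(normalized.round(2))
```

The padded, interleaved version above is the safer choice if the segments may have gaps between them.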

Fast enough:
In []: data, lengths= gen([5, 15, 50000])
In []: data.size
Out[]: 476028
In []: %timeit normalize(data, lengths)
10 loops, best of 3: 29.4 ms per loop
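
To convince oneself that the vectorized version does the same thing as the plain loop the OP describes, a quick check along these lines might help. This is only a sketch: normalize_loop and normalize_reduceat are hypothetical names, "normalize" is taken to mean dividing by the segment sum as in tst.py, and the test data is kept positive so no segment sum is close to zero.

```python
import numpy as np


def normalize_loop(data, starts, lengths):
    # Straightforward per-segment loop, as described by the OP.
    out = data.copy()
    for s, n in zip(starts, lengths):
        out[s:s + n] /= out[s:s + n].sum()
    return out


def normalize_reduceat(data, lengths):
    # Same idea as tst.py: interleaved [start, end] offsets, padded data.
    ends = lengths.cumsum()
    ind = np.empty(2 * len(lengths), dtype=int)
    ind[::2], ind[1::2] = ends - lengths, ends
    sums = np.add.reduceat(np.r_[data, 0], ind)[::2]
    return data / np.repeat(sums, lengths)


rng = np.random.default_rng(0)
lengths = rng.integers(5, 15, size=1000)
starts = lengths.cumsum() - lengths
data = rng.random(lengths.sum()) + 0.1  # strictly positive values
same = np.allclose(normalize_loop(data, starts, lengths),
                   normalize_reduceat(data, lengths))
print(same)
```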


My 2 cents,
-eat

>
> -- srean
>
> On Thu, May 31, 2012 at 12:36 AM, Wolfgang Kerzendorf
> <wkerzendorf@gmail.com> wrote:
> > Dear all,
> >
> > I have an ndarray which consists of many arrays stacked behind each
> > other (only conceptually, in truth it's a normal 1d float64 array).
> > I have a second array which tells me the start of the individual data
> > sets in the 1d float64 array and another one which tells me the length.
> > Example:
> >
> > data_array = (conceptually) [[1,2], [1,2,3,4], [1,2,3]] = in reality
> > [1,2,1,2,3,4,1,2,3, dtype=float64]
> > start_pointer = [0, 2, 6]
> > length_data = [2, 4, 3]
> >
> > I now want to normalize each of the individual data sets. I wrote a
> > simple for loop over the start_pointer and length data grabbed the data and
> > normalized it and wrote it back to the big array. That's slow. Is there an
> > elegant numpy way to do that? Do I have to go the cython way?