[Numpy-discussion] Mean of n values within an array

Phil Ruggera pruggera at gmail.com
Fri Aug 4 10:12:51 CDT 2006


The spook is in t = [1.3*i for i in range(1400)].  It used to be t =
[1.0*i for i in range(1400)] but I changed it to shake out algorithms
that produce differences.  But a max difference of 2.077e-16 is
immaterial for my application.   I should use a less strict compare.

On 8/3/06, Charles R Harris <charlesr.harris at gmail.com> wrote:
> Hi Phil,
>
> Curious. It works fine here in the original form.  I even expected a tiny
> difference because of floating point voodoo but there was none at all. Now
> if I copy your program and run it there *is* a small difference over the
> slice [1:] (to avoid division by zero).
>
> index of max fractional difference: 234
> max fractional difference: 2.077e-16
> reg at max fractional difference: 1.098e+03
>
> Which is just about roundoff error (1.11e-16) for double precision, so it
> lost a bit of precision.
>
> Still, I am not clear why the results should differ at all between the
> original and your new code. Cue spooky music.
>
> Chuck
>
> On 8/3/06, Phil Ruggera <pruggera at gmail.com> wrote:
> > Tweek2 is slightly faster, but does not produce the same result as the
> > regular python baseline:
> >
> > regular python took: 11.997997 sec.
> > numpy convolve took: 0.611996 sec.
> > numpy convolve tweek 1 took: 0.442029 sec.
> > numpy convolve tweek 2 took: 0.418857 sec.
> > Traceback (most recent call last):
> > File "G:\Python\Dev\mean.py", line 57, in ?
> > numpy.testing.assert_equal(reg, np3)
> > File
> "C:\Python24\Lib\site-packages\numpy\testing\utils.py",
> line
> > 130, in assert_equal
> > return assert_array_equal(actual, desired, err_msg)
> > File
> "C:\Python24\Lib\site-packages\numpy\testing\utils.py",
> line
> > 217, in assert_array_equal
> > assert cond,\
> > AssertionError:
> > Arrays are not equal (mismatch 17.1428571429%):
> > Array 1: [ 0.0000000000000000e+00 6.5000000000000002e-01
> 1.3000000000000000e+00
> > ..., 1.7842500000000002e+03 1.785550000...
> > Array 2: [ 0.0000000000000000e+00 6.5000000000000002e-01
> 1.3000000000000000e+00
> > ..., 1.7842500000000002e+03 1.785550000...
>
>
>
> > Code:
> >
> > # mean of n values within an array
> > import numpy, time
> > def nmean(list,n):
> >    a = []
> >    for i in range(1,len(list)+1):
> >        start = i-n
> >        divisor = n
> >        if start < 0:
> >            start = 0
> >            divisor = i
> >        a.append(sum(list[start:i])/divisor)
> >    return a
> >
> > def testNP(code, text):
> >     start = time.clock()
> >     for x in range(1000):
> >         np = code(t,50)
> >     print text, "took: %f sec."%( time.clock() - start)
> >     return np
> >
> > t = [1.3*i for i in range(1400)]
> > reg = testNP(nmean, 'regular python')
> >
> > t = numpy.array(t,dtype=float)
> >
> > def numpy_nmean_conv(list,n):
> >     b = numpy.ones(n,dtype=float)
> >     a = numpy.convolve(list,b,mode="full")
> >     for i in range(n):
> >         a[i] /= i + 1
> >     a[n:] /= n
> >     return a[:len(list)]
> >
> > np1 = testNP(numpy_nmean_conv, 'numpy convolve')
> >
> > def numpy_nmean_conv_nl_tweak1(list,n):
> >    b = numpy.ones(n,dtype=float)
> >    a = numpy.convolve(list,b,mode="full")
> >    a[:n] /= numpy.arange(1, n+1)
> >    a[n:] /= n
> >    return a[:len(list)]
> >
> > np2 = testNP(numpy_nmean_conv_nl_tweak1, 'numpy convolve
> tweek 1')
> >
> > def numpy_nmean_conv_nl_tweak2(list,n):
> >
> >    b = numpy.ones(n,dtype=float)
> >    a = numpy.convolve(list,b,mode="full")
> >    a[:n] /= numpy.arange(1, n + 1)
> >    a[n:] *= 1.0/n
> >    return a[:len(list)]
> >
> > np3 = testNP(numpy_nmean_conv_nl_tweak2, 'numpy convolve
> tweek 2')
> >
> > numpy.testing.assert_equal(reg, np1)
> > numpy.testing.assert_equal(reg, np2)
> > numpy.testing.assert_equal(reg, np3)
> >
> > On 8/3/06, Charles R Harris < charlesr.harris at gmail.com> wrote:
> > > Hi Scott,
> > >
> > >
> > > On 8/3/06, Scott Ransom <sransom at nrao.edu> wrote:
> > > > You should be able to modify the kernel so that you can avoid
> > > > many of the divides at the end.  Something like:
> > > >
> > > > def numpy_nmean_conv_nl2(list,n):
> > > >     b = numpy.ones (n,dtype=float) / n
> > > >     a = numpy.convolve (c,b,mode="full")
> > > >     # Note: something magic in here to fix the first 'n' values
> > > >     return a[:len(list)]
> > >
> > >
> > > Yep, I tried that but it wasn't any faster. It might help for really
> *big*
> > > arrays. The first n-1 values still need to be fixed after.
> > >
> > > Chuck
> > >
> > > > I played with it a bit, but don't have time to figure out exactly
> > > > how convolve is mangling the first n return values...
> > > >
> > > > Scott
> > > >
> > > >
> > > >
> > > > On Thu, Aug 03, 2006 at 09:38:25AM -0600, Charles R Harris wrote:
> > > > > Heh,
> > > > >
> > > > > This is fun. Two more variations with 1000 reps instead of 100 for
> > > better
> > > > > timing:
> > > > >
> > > > > def numpy_nmean_conv_nl_tweak1(list,n):
> > > > >   b = numpy.ones(n,dtype=float)
> > > > >   a = numpy.convolve(list,b,mode="full")
> > > > >   a[:n] /= numpy.arange(1, n + 1)
> > > > >   a[n:] /= n
> > > > >   return a[:len(list)]
> > > > >
> > > > > def numpy_nmean_conv_nl_tweak2(list,n):
> > > > >   b = numpy.ones(n,dtype=float)
> > > > >   a = numpy.convolve(list,b,mode="full")
> > > > >   a[:n] /= numpy.arange(1, n + 1)
> > > > >   a[n:] *= 1.0/n
> > > > >   return a[:len(list)]
> > > > >
> > > > > Which gives
> > > > >
> > > > > numpy convolve took: 2.630000 sec.
> > > > > numpy convolve noloop took: 0.320000 sec.
> > > > > numpy convolve noloop tweak1 took: 0.250000 sec.
> > > > > numpy convolve noloop tweak2 took: 0.240000 sec.
> > > > >
> > > > > Chuck
> > > > >
> > > > > On 8/2/06, Phil Ruggera < pruggera at gmail.com> wrote:
> > > > > >
> > > > > >A variation of the proposed convolve routine is very fast:
> > > > > >
> > > > > >regular python took: 1.150214 sec.
> > > > > >numpy mean slice took: 2.427513 sec.
> > > > > >numpy convolve took: 0.546854 sec.
> > > > > >numpy convolve noloop took: 0.058611 sec.
> > > > > >
> > > > > >Code:
> > > > > >
> > > > > ># mean of n values within an array
> > > > > >import numpy, time
> > > > > >def nmean(list,n):
> > > > > >    a = []
> > > > > >    for i in range(1,len(list)+1):
> > > > > >        start = i-n
> > > > > >        divisor = n
> > > > > >        if start < 0:
> > > > > >            start = 0
> > > > > >            divisor = i
> > > > > >        a.append(sum(list[start:i])/divisor)
> > > > > >    return a
> > > > > >
> > > > > >t = [1.0*i for i in range(1400)]
> > > > > >start = time.clock ()
> > > > > >for x in range(100):
> > > > > >    reg = nmean(t,50)
> > > > > >print "regular python took: %f sec."%(time.clock() - start)
> > > > > >
> > > > > >def numpy_nmean(list,n):
> > > > > >    a = numpy.empty(len(list),dtype=float)
> > > > > >    for i in range(1,len(list)+1):
> > > > > >        start = i-n
> > > > > >        if start < 0:
> > > > > >            start = 0
> > > > > >        a[i-1] = list[start:i].mean(0)
> > > > > >    return a
> > > > > >
> > > > > >t = numpy.arange (0,1400,dtype=float)
> > > > > >start = time.clock()
> > > > > >for x in range(100):
> > > > > >    npm = numpy_nmean(t,50)
> > > > > >print "numpy mean slice took: %f sec."%(time.clock() - start)
> > > > > >
> > > > > >def numpy_nmean_conv(list,n):
> > > > > >    b = numpy.ones(n,dtype=float)
> > > > > >    a = numpy.convolve(list,b,mode="full")
> > > > > >    for i in range(0,len(list)):
> > > > > >        if i < n :
> > > > > >            a[i] /= i + 1
> > > > > >        else :
> > > > > >            a[i] /= n
> > > > > >    return a[:len(list)]
> > > > > >
> > > > > >t = numpy.arange(0,1400,dtype=float)
> > > > > >start = time.clock ()
> > > > > >for x in range(100):
> > > > > >    npc = numpy_nmean_conv(t,50)
> > > > > >print "numpy convolve took: %f sec."%( time.clock() - start)
> > > > > >
> > > > > >def numpy_nmean_conv_nl(list,n):
> > > > > >    b = numpy.ones(n,dtype=float)
> > > > > >    a = numpy.convolve(list,b,mode="full")
> > > > > >    for i in range(n):
> > > > > >        a[i] /= i + 1
> > > > > >    a[n:] /= n
> > > > > >    return a[:len(list)]
> > > > > >
> > > > > >t = numpy.arange(0,1400,dtype=float)
> > > > > >start = time.clock()
> > > > > >for x in range(100):
> > > > > >    npn = numpy_nmean_conv_nl(t,50)
> > > > > >print "numpy convolve noloop took: %f sec."%( time.clock() - start)
> > > > > >
> > > > > >numpy.testing.assert_equal(reg,npm)
> > > > > >numpy.testing.assert_equal(reg,npc)
> > > > > >numpy.testing.assert_equal(reg,npn)
> > > > > >
> > > > > >On 7/29/06, David Grant < davidgrant at gmail.com> wrote:
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On 7/29/06, Charles R Harris <charlesr.harris at gmail.com > wrote:
> > > > > >> >
> > > > > >> > Hmmm,
> > > > > >> >
> > > > > >> > I rewrote the subroutine a bit.
> > > > > >> >
> > > > > >> >
> > > > > >> > def numpy_nmean(list,n):
> > > > > >> >     a = numpy.empty(len(list),dtype=float)
> > > > > >> >
> > > > > >> >     b = numpy.cumsum(list)
> > > > > >> >     for i in range(0,len(list)):
> > > > > >> >         if i < n :
> > > > > >> >             a[i] = b[i]/(i+1)
> > > > > >> >         else :
> > > > > >> >             a[i] = (b[i] - b[i-n])/(i+1)
> > > > > >> >     return a
> > > > > >> >
> > > > > >> > and got
> > > > > >> >
> > > > > >> > regular python took: 0.750000 sec.
> > > > > >> > numpy took: 0.380000 sec.
> > > > > >>
> > > > > >>
> > > > > >> I got rid of the for loop entirely. Usually this is the thing to
> do,
> > > at
> > > > > >> least this will always give speedups in Matlab and also in my
> limited
> > > > > >> experience with Numpy/Numeric:
> > > > > >>
> > > > > >>  def numpy_nmean2(list,n):
> > > > > >>
> > > > > >>     a = numpy.empty(len(list),dtype=float)
> > > > > >>     b = numpy.cumsum(list)
> > > > > >>      c = concatenate((b[n:],b[:n]))
> > > > > >>     a[:n] = b[:n]/(i+1)
> > > > > >>      a[n:] = (b[n:] - c[n:])/(i+1)
> > > > > >>     return a
> > > > > >>
> > > > > >> I got no noticeable speedup from doing this which I thought was
> > > pretty
> > > > > >> amazing. I even profiled all the functions, the original, the one
> > > > > >written by
> > > > > >> Charles, and mine, using hotspot just to make sure nothing funny
> was
> > > > > >going
> > > > > >> on. I guess plain old Python can be better than you'd expect in
> > > certain
> > > > > >> situtations.
> > > > > >>
> > > > > >> --
> > > > > >> David Grant
> > > > > >
> > > > >
> > >
> >-------------------------------------------------------------------------
> > > > > >Take Surveys. Earn Cash. Influence the Future of IT
> > > > > >Join SourceForge.net's Techsay panel and you'll get the chance to
> share
> > > > > >your
> > > > > >opinions on IT & business topics through brief surveys -- and earn
> cash
> > > > > >
> > >
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> > > > > >_______________________________________________
> > > > > >Numpy-discussion mailing list
> > > > > > Numpy-discussion at lists.sourceforge.net
> > > > >
> > > >
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> > > > > >
> > > >
> > > > >
> > >
> -------------------------------------------------------------------------
> > > > > Take Surveys. Earn Cash. Influence the Future of IT
> > > > > Join SourceForge.net's Techsay panel and you'll get the chance to
> share
> > > your
> > > > > opinions on IT & business topics through brief surveys -- and earn
> cash
> > > > >
> > >
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> > > > > _______________________________________________
> > > > > Numpy-discussion mailing list
> > > > > Numpy-discussion at lists.sourceforge.net
> > > > >
> > >
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> > > >
> > > >
> > > > --
> > > > --
> > > > Scott M. Ransom            Address:  NRAO
> > > > Phone:  (434) 296-0320               520 Edgemont Rd.
> > > > email:   sransom at nrao.edu             Charlottesville, VA 22903 USA
> > > > GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989
> >
> >
> -------------------------------------------------------------------------
> > Take Surveys. Earn Cash. Influence the Future of IT
> > Join SourceForge.net's Techsay panel and you'll get the chance to share
> your
> > opinions on IT & business topics through brief surveys -- and earn cash
> >
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at lists.sourceforge.net
> >
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> >
>
>




More information about the Numpy-discussion mailing list