[Numpy-discussion] algorithm, optimization, or other problem?

Bruce Southey bsouthey at gmail.com
Wed Feb 22 06:25:05 CST 2006


Hi,
Actually, it makes it slightly worse. Given the responses on another
thread, that is probably because not enough of the work is being pushed into C code.
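
(For reference, applying the suggestion to the straight-Python branch of
Brian's script amounts to replacing the update line with something like

    w *= 1.0 - eta*y*y
    w += eta*y*transpose(xx)

where eta = params['eta'].)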

Obviously, using BLAS etc. will be faster, but that doesn't change the
fact that removing the inner loop entirely would be faster still.

Bruce

On 2/22/06, Nadav Horesh <nadavh at visionsense.com> wrote:
> You may get a significant boost by replacing the line:
>   w=w+ eta * (y*x - y**2*w)
> with
>   w *= 1.0 - eta*y*y
>   w += eta*y*x
>
> I ran a test on a similar expression and got a 5-fold speed increase.
> The dot() function runs faster if you compile with dotblas.
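>
> (The two forms compute the same update, since w + eta*(y*x - y*y*w) equals
> (1 - eta*y*y)*w + eta*y*x; the in-place *= and += just avoid allocating
> temporary arrays for the intermediates.)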
>
>   Nadav.
>
>
> -----Original Message-----
> From:   numpy-discussion-admin at lists.sourceforge.net on behalf of Bruce Southey
> Sent:   Tue 21-Feb-06 17:15
> To:     Brian Blais
> Cc:     python-list at python.org; numpy-discussion at lists.sourceforge.net; scipy-user at scipy.net
> Subject:        Re: [Numpy-discussion] algorithm, optimization, or other problem?
> Hi,
> In the current version, note that y is a scalar, so replace the squaring
> (y**2) with y*y as you do in the dohebb function.  On my system,
> without BLAS etc., removing the squaring saves a few seconds (16.28 down
> to 12.4).  Factoring out y did not seem to help.
>
> Also, eta and tau are constants, so define them only once as scalars
> outside the loops, and do the 1.0/tau division outside the loop as well.
> That only saves about 0.2 seconds, but these things add up.
>
> The inner loop can probably be vectorized, because it is just vector
> operations on a matrix; you are just computing over the ith dimension
> of X.  I think you should be able to find the matrix version on the net.
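>
> As a rough (untested) sketch, the straight-Python branch with the y*y change
> and the constants hoisted out of the loop would look something like:
>
>     eta = params['eta']           # define once, outside the loops
>     inv_tau = 1.0/params['tau']   # do the division once
>     for i in range(len(rnd)):
>         pat = rnd[i]
>         xx = reshape(x[:, pat], (1, -1))
>         y = matrixmultiply(xx, w)
>         w = w + eta*(y*transpose(xx) - y*y*w)   # y*y instead of y**2
>         th = th + inv_tau*(y*y - th)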
>
> Regards
> Bruce
>
>
>
> On 2/21/06, Brian Blais <bblais at bryant.edu> wrote:
> > Hello,
> >
> > I am trying to translate some Matlab/mex code to Python, for doing neural
> > simulations.  This application is definitely computing-time limited, and I need to
> > optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
> >    The procedure is very simple.  After initializing the variables:
> >
> > 1) select a random input vector, which I will call "x".  Right now I have it as an
> > array, and I choose columns from that array randomly.  In other cases, I may need to
> > take an image, select a patch, and then make that a column vector.
> >
> > 2) calculate an output value, which is the dot product of the "x" and a weight
> > vector, "w", so
> >
> >         y=dot(x,w)
> >
> > 3) modify the weight vector based on a matrix equation, like:
> >
> >         w=w+ eta * (y*x - y**2*w)
> >                ^
> >                |
> >                +---- learning rate constant
> >
> > 4) repeat steps 1-3 many times
> >
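> > In Numeric terms (a sketch only, assuming the arrays x and w and the
> > functions from the script at the end of this message), one pass through
> > steps 1-3 is roughly:
> >
> >         xx = reshape(x[:, pat], (1, -1))        # step 1: pat is a randomly chosen column index
> >         y = matrixmultiply(xx, w)               # step 2: the output, y = dot(x, w)
> >         w = w + eta*(y*transpose(xx) - y*y*w)   # step 3: update with learning rate eta
> >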
> > I've organized it like:
> >
> > for e in range(100):   # outer loop
> >      for i in range(1000):  # inner loop
> >          (steps 1-3)
> >
> >      display things.
> >
> > so that the bulk of the computation is in the inner loop, which is amenable to
> > conversion to a faster language.  This is my issue:
> >
> > Straight Python, in the example posted below with 250000 inner-loop steps, takes 20
> > seconds for each outer-loop step.  I tried Pyrex, which should work very well on such
> > a problem, and it takes about 8.5 seconds per outer-loop step.  The same code as a
> > C-mex file in Matlab takes 1.5 seconds per outer-loop step.
> >
> > Given the huge difference between the Pyrex and the mex versions, I feel there is
> > something I am doing wrong, because the C code for both should run comparably.
> > Perhaps the approach is wrong?  I'm willing to take any suggestions!  I don't mind
> > coding some in C, but the Python API seemed a bit challenging to me.
> >
> > One note: I am using the Numeric package, not numpy, only because I want to be able
> > to use the Enthought version for Windows.  I develop on Linux, and haven't had a
> > chance to see if I can compile numpy using the Enthought Python for Windows.
> >
> > If there is anything else anyone needs to know, I'll post it.  I have put the main
> > script and the dohebb.pyx code below.
> >
> >
> >                         thanks!
> >
> >                                         Brian Blais
> >
> > --
> > -----------------
> >
> >              bblais at bryant.edu
> >              http://web.bryant.edu/~bblais
> >
> >
> >
> >
> > # Main script:
> >
> > from dohebb import *
> > import pylab as p
> > from Numeric import *
> > from RandomArray import *
> > import time
> >
> > x=random((100,1000))    # 1000 input vectors
> >
> > numpats=x.shape[0]
> > w=random((numpats,1));
> >
> > th=random((1,1))
> >
> > params={}
> > params['eta']=0.001;
> > params['tau']=100.0;
> > old_mx=0;
> > for e in range(100):
> >
> >      rnd=randint(0,numpats,250000)
> >      t1=time.time()
> >      if 0:  # straight python
> >          for i in range(len(rnd)):
> >              pat=rnd[i]
> >              xx=reshape(x[:,pat],(1,-1))
> >              y=matrixmultiply(xx,w)
> >              w=w+params['eta']*(y*transpose(xx)-y**2*w);
> >              th=th+(1.0/params['tau'])*(y**2-th);
> >      else: # pyrex
> >          dohebb(params,w,th,x,rnd)
> >      print time.time()-t1
> >
> >
> > p.plot(w,'o-')
> > p.xlabel('weights')
> > p.show()
> >
> >
> > #=============================================
> >
> > #         dohebb.pyx
> >
> > cdef extern from "Numeric/arrayobject.h":
> >
> >    struct PyArray_Descr:
> >      int type_num, elsize
> >      char type
> >
> >    ctypedef class Numeric.ArrayType [object PyArrayObject]:
> >      cdef char *data
> >      cdef int nd
> >      cdef int *dimensions, *strides
> >      cdef object base
> >      cdef PyArray_Descr *descr
> >      cdef int flags
> >
> >
> > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):
> >
> >
> >      cdef int num_iterations
> >      cdef int num_inputs
> >      cdef int offset
> >      cdef double *wp,*xp,*thp
> >      cdef int *rndp
> >      cdef double eta,tau
> >
> >      eta=params['eta']  # learning rate
> >      tau=params['tau']  # used for variance estimate
> >
> >      cdef double y
> >      num_iterations=rnd.dimensions[0]
> >      num_inputs=w.dimensions[0]
> >
> >      # get the pointers
> >      wp=<double *>w.data
> >      xp=<double *>X.data
> >      rndp=<int *>rnd.data
> >      thp=<double *>th.data
> >
> >      for it from 0 <= it < num_iterations:
> >
> >          offset=rndp[it]*num_inputs
> >
> >          # calculate the output
> >          y=0.0
> >          for i from 0 <= i < num_inputs:
> >              y=y+wp[i]*xp[i+offset]
> >
> >          # change in the weights
> >          for i from 0 <= i < num_inputs:
> >              wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])
> >
> >          # estimate the variance
> >          thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
> >
> >
> >
> >
> >
> >
> >



