# [Numpy-discussion] algorithm, optimization, or other problem?

Bruce Southey bsouthey at gmail.com
Tue Feb 21 07:16:07 CST 2006

```Hi,
In the current version, note that Y is scalar so replace the squaring
(Y**2) with Y*Y as you do in the dohebb function.  On my system
without blas etc removing the squaring removes a few seconds (16.28 to
12.4). It did not seem to help factorizing Y.

Also, eta and tau are constants so define them only once as scalars
outside the loops and do the division outside the loop. It only saves
about 0.2 seconds but these add up.

The inner loop probably can be vectorized because it is just vector
operations on a matrix. You are just computing over the ith dimension
of X.  I think that you could be able to find the matrix version on
the net.

Regards
Bruce

On 2/21/06, Brian Blais <bblais at bryant.edu> wrote:
> Hello,
>
> I am trying to translate some Matlab/mex code to Python, for doing neural
> simulations.  This application is definitely computing-time limited, and I need to
> optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
>    The procedure is very simple, after initializing any variables:
>
> 1) select a random input vector, which I will call "x".  right now I have it as an
> array, and I choose columns from that array randomly.  in other cases, I may need to
> take an image, select a patch, and then make that a column vector.
>
> 2) calculate an output value, which is the dot product of the "x" and a weight
> vector, "w", so
>
>         y=dot(x,w)
>
> 3) modify the weight vector based on a matrix equation, like:
>
>         w=w+ eta * (y*x - y**2*w)
>                ^
>                |
>                +---- learning rate constant
>
> 4) repeat steps 1-3 many times
>
> I've organized it like:
>
> for e in 100:   # outer loop
>      for i in 1000:  # inner loop
>          (steps 1-3)
>
>      display things.
>
> so that the bulk of the computation is in the inner loop, and is amenable to
> converting to a faster language.  This is my issue:
>
> straight python, in the example posted below for 250000 inner-loop steps, takes 20
> seconds for each outer-loop step.  I tried Pyrex, which should work very fast on such
> a problem, takes about 8.5 seconds per outer-loop step.  The same code as a C-mex
> file in matlab takes 1.5 seconds per outer-loop step.
>
> Given the huge difference between the Pyrex and the Mex, I feel that there is
> something I am doing wrong, because the C-code for both should run comparably.
> Perhaps the approach is wrong?  I'm willing to take any suggestions!  I don't mind
> coding some in C, but the Python API seemed a bit challenging to me.
>
> One note: I am using the Numeric package, not numpy, only because I want to be able
> to use the Enthought version for Windows.  I develop on Linux, and haven't had a
> chance to see if I can compile numpy using the Enthought Python for Windows.
>
> If there is anything else anyone needs to know, I'll post it.  I put the main script,
> and a dohebb.pyx code below.
>
>
>                         thanks!
>
>                                         Brian Blais
>
> --
> -----------------
>
>              bblais at bryant.edu
>              http://web.bryant.edu/~bblais
>
>
>
>
> # Main script:
>
> from dohebb import *
> import pylab as p
> from Numeric import *
> from RandomArray import *
> import time
>
> x=random((100,1000))    # 1000 input vectors
>
> numpats=x.shape[0]
> w=random((numpats,1));
>
> th=random((1,1))
>
> params={}
> params['eta']=0.001;
> params['tau']=100.0;
> old_mx=0;
> for e in range(100):
>
>      rnd=randint(0,numpats,250000)
>      t1=time.time()
>      if 0:  # straight python
>          for i in range(len(rnd)):
>              pat=rnd[i]
>              xx=reshape(x[:,pat],(1,-1))
>              y=matrixmultiply(xx,w)
>              w=w+params['eta']*(y*transpose(xx)-y**2*w);
>              th=th+(1.0/params['tau'])*(y**2-th);
>      else: # pyrex
>          dohebb(params,w,th,x,rnd)
>      print time.time()-t1
>
>
> p.plot(w,'o-')
> p.xlabel('weights')
> p.show()
>
>
> #=============================================
>
> #         dohebb.pyx
>
> cdef extern from "Numeric/arrayobject.h":
>
>    struct PyArray_Descr:
>      int type_num, elsize
>      char type
>
>    ctypedef class Numeric.ArrayType [object PyArrayObject]:
>      cdef char *data
>      cdef int nd
>      cdef int *dimensions, *strides
>      cdef object base
>      cdef PyArray_Descr *descr
>      cdef int flags
>
>
> def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):
>
>
>      cdef int num_iterations
>      cdef int num_inputs
>      cdef int offset
>      cdef double *wp,*xp,*thp
>      cdef int *rndp
>      cdef double eta,tau
>
>      eta=params['eta']  # learning rate
>      tau=params['tau']  # used for variance estimate
>
>      cdef double y
>      num_iterations=rnd.dimensions[0]
>      num_inputs=w.dimensions[0]
>
>      # get the pointers
>      wp=<double *>w.data
>      xp=<double *>X.data
>      rndp=<int *>rnd.data
>      thp=<double *>th.data
>
>      for it from 0 <= it < num_iterations:
>
>          offset=rndp[it]*num_inputs
>
>          # calculate the output
>          y=0.0
>          for i from 0 <= i < num_inputs:
>              y=y+wp[i]*xp[i+offset]
>
>          # change in the weights
>          for i from 0 <= i < num_inputs:
>              wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])
>
>          # estimate the variance
>          thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
>
>
>
>
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
> for problems?  Stop!  Download the new AJAX search engine that makes
> searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

```