[SciPy-user] algorithm, optimization, or other problem?

Brian Blais bblais at bryant.edu
Tue Feb 21 06:24:01 CST 2006

```Hello,

I am trying to translate some Matlab/mex code to Python, for doing neural
simulations.  This application is definitely computing-time limited, and I need to
optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x".  right now I have it as an
array, and I choose columns from that array randomly.  in other cases, I may need to
take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so

y=dot(x,w)

3) modify the weight vector based on a matrix equation, like:

w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant

4) repeat steps 1-3 many times

I've organized it like:

for e in 100:   # outer loop
for i in 1000:  # inner loop
(steps 1-3)

display things.

so that the bulk of the computation is in the inner loop, and is amenable to
converting to a faster language.  This is my issue:

straight python, in the example posted below for 250000 inner-loop steps, takes 20
seconds for each outer-loop step.  I tried Pyrex, which should work very fast on such
a problem, takes about 8.5 seconds per outer-loop step.  The same code as a C-mex
file in matlab takes 1.5 seconds per outer-loop step.

Given the huge difference between the Pyrex and the Mex, I feel that there is
something I am doing wrong, because the C-code for both should run comparably.
Perhaps the approach is wrong?  I'm willing to take any suggestions!  I don't mind
coding some in C, but the Python API seemed a bit challenging to me.

One note: I am using the Numeric package, not numpy, only because I want to be able
to use the Enthought version for Windows.  I develop on Linux, and haven't had a
chance to see if I can compile numpy using the Enthought Python for Windows.

If there is anything else anyone needs to know, I'll post it.  I put the main script,
and a dohebb.pyx code below.

thanks!

Brian Blais

--
-----------------

bblais at bryant.edu
http://web.bryant.edu/~bblais

# Main script:

from dohebb import *
import pylab as p
from Numeric import *
from RandomArray import *
import time

x=random((100,1000))    # 1000 input vectors

numpats=x.shape[0]
w=random((numpats,1));

th=random((1,1))

params={}
params['eta']=0.001;
params['tau']=100.0;
old_mx=0;
for e in range(100):

rnd=randint(0,numpats,250000)
t1=time.time()
if 0:  # straight python
for i in range(len(rnd)):
pat=rnd[i]
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w=w+params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
else: # pyrex
dohebb(params,w,th,x,rnd)
print time.time()-t1

p.plot(w,'o-')
p.xlabel('weights')
p.show()

#=============================================

#         dohebb.pyx

cdef extern from "Numeric/arrayobject.h":

struct PyArray_Descr:
int type_num, elsize
char type

ctypedef class Numeric.ArrayType [object PyArrayObject]:
cdef char *data
cdef int nd
cdef int *dimensions, *strides
cdef object base
cdef PyArray_Descr *descr
cdef int flags

def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):

cdef int num_iterations
cdef int num_inputs
cdef int offset
cdef double *wp,*xp,*thp
cdef int *rndp
cdef double eta,tau

eta=params['eta']  # learning rate
tau=params['tau']  # used for variance estimate

cdef double y
num_iterations=rnd.dimensions[0]
num_inputs=w.dimensions[0]

# get the pointers
wp=<double *>w.data
xp=<double *>X.data
rndp=<int *>rnd.data
thp=<double *>th.data

for it from 0 <= it < num_iterations:

offset=rndp[it]*num_inputs

# calculate the output
y=0.0
for i from 0 <= i < num_inputs:
y=y+wp[i]*xp[i+offset]

# change in the weights
for i from 0 <= i < num_inputs:
wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])

# estimate the variance
thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])

```