[SciPy-User] convertin from octave/ matlab to scipy

josef.pktd@gmai... josef.pktd@gmai...
Thu Feb 18 22:29:06 CST 2010

On Thu, Feb 18, 2010 at 10:56 PM,  <josef.pktd@gmail.com> wrote:
> On Thu, Feb 18, 2010 at 4:58 PM, eat <e.antero.tammi@gmail.com> wrote:
>> Hi,
>>
>> I'll just have started to study and learn the python/ scipy.
>> I have written some inhouse (exploratory) data analysis
>> octave/ matlab toolboxes/ packages that I now like to convert
>> to python/ scipy.
>>
>> I do have googled and skimmed bunch of documents, but still
>> I'll like to have experts opinion how to convert this (typical)
>> octave/ matlab indicies idiom to scipy:
>>
>>
>> """
>> a simple minded conversion of idiom 1 from octave(or matlab)
>> <ocode>
>> function t1
>>        m= 12; n= 123456; D= randn(m, n);
>>        f= @(X) sqrt(sum(X.^2)); g= @(D) zm(f, D);
>>        norm(f(D)- g(D))
>>
>> function D= zm(f, D)
>>        m= mean(D, 2); D= f(D- m(:, ones(1, size(D, 2))));
>> </ocode>
>> to scipy:
>> """
>>
>> from scipy import *
>> from numpy.linalg import norm #???
>>
>> def zm(f, D):
>>    a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
>>    m= mean(D, 1); return f(D- m[a])
>>
>> m= 12; n= 123456; D= random.randn(m, n);
>> f= lambda X: sqrt(sum(X**2, 0)); g= lambda D: zm(f, D)
>> print norm(f(D)- g(D))
>>
>>
>> I think my main question is:
>> do we have in scipy some 'under the hood' functionality, which
>> allows us not to repmat the constant vector (m)?
>
> broadcasting is one of the best features of numpy compared to matlab.
> Usually there are no repmats necessary, but often we have to add an
> axis to match up the shape of the arrays
>
>>>> import numpy as np
>>>> a = np.arange(6).reshape(2,-1)
>>>> a
> array([[0, 1, 2],
>       [3, 4, 5]])
>>>> a.mean()
> 2.5
>>>> a.mean(0)
> array([ 1.5,  2.5,  3.5])
>>>> a.mean(1)
> array([ 1.,  4.])
>
>>>> a - a.mean(0)
> array([[-1.5, -1.5, -1.5],
>       [ 1.5,  1.5,  1.5]])
>>>> a - a.mean(1)[:,np.newaxis]     #add axis back in that we lost in reduce operation mean, equivalent with [:,None]
> array([[-1.,  0.,  1.],
>       [-1.,  0.,  1.]])
>
> here are some explanations:
>
> In your code the line with a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
> is not necessary, but adding the newaxis if the mean is taken with
> respect to an axis > 0
> This should also make it quite a bit faster.
>
> I don't understand why you use anonymous function in matlab and lambda
> in python, but maybe that's because this is out of context. It's not
> easy to see what the actual flow of execution is, or maybe it's
> because I'm not used to having several statements on a line separated
> by semicolon.

I take this back, it's a bit difficult to read but I have a similar
pattern of code in matlab for nested functions.

maybe something like this (typed in editor, so it might not work)

def t1():
m, n = 12, 123456
D = random.randn(m, n)

def f(X):
return sqrt(sum(X**2, 0))

print norm(f(D) - zm(f, D))
#or ?
#g = lambda D: zm(f, D)
#print norm(f(D)- g(D))

def zm(f, D):
m = mean(D, 1)[:, None]
return f(D- m)

>
> Josef
>
>>
>> (Code like above is handled quite well in matlab where
>> (e(matlab)/ e(octave)<~ .1), compared to (e(scipy)/ e(octave)<~ .5),
>> where e(x) is some measure of the execution time in x.)
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>