[SciPy-User] convertin from octave/ matlab to scipy
josef.pktd@gmai...
josef.pktd@gmai...
Thu Feb 18 22:29:06 CST 2010
On Thu, Feb 18, 2010 at 10:56 PM, <josef.pktd@gmail.com> wrote:
> On Thu, Feb 18, 2010 at 4:58 PM, eat <e.antero.tammi@gmail.com> wrote:
>> Hi,
>>
>> I'll just have started to study and learn the python/ scipy.
>> I have written some inhouse (exploratory) data analysis
>> octave/ matlab toolboxes/ packages that I now like to convert
>> to python/ scipy.
>>
>> I do have googled and skimmed bunch of documents, but still
>> I'll like to have experts opinion how to convert this (typical)
>> octave/ matlab indicies idiom to scipy:
>>
>>
>> """
>> a simple minded conversion of idiom 1 from octave(or matlab)
>> <ocode>
>> function t1
>> m= 12; n= 123456; D= randn(m, n);
>> f= @(X) sqrt(sum(X.^2)); g= @(D) zm(f, D);
>> norm(f(D)- g(D))
>>
>> function D= zm(f, D)
>> m= mean(D, 2); D= f(D- m(:, ones(1, size(D, 2))));
>> </ocode>
>> to scipy:
>> """
>>
>> from scipy import *
>> from numpy.linalg import norm #???
>>
>> def zm(f, D):
>> a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
>> m= mean(D, 1); return f(D- m[a])
>>
>> m= 12; n= 123456; D= random.randn(m, n);
>> f= lambda X: sqrt(sum(X**2, 0)); g= lambda D: zm(f, D)
>> print norm(f(D)- g(D))
>>
>>
>> I think my main question is:
>> do we have in scipy some 'under the hood' functionality, which
>> allows us not to repmat the constant vector (m)?
>
> broadcasting is one of the best features of numpy compared to matlab.
> Usually there are no repmats necessary, but often we have to add an
> axis to match up the shape of the arrays
>
>>>> import numpy as np
>>>> a = np.arange(6).reshape(2,-1)
>>>> a
> array([[0, 1, 2],
> [3, 4, 5]])
>>>> a.mean()
> 2.5
>>>> a.mean(0)
> array([ 1.5, 2.5, 3.5])
>>>> a.mean(1)
> array([ 1., 4.])
>
>>>> a - a.mean(0)
> array([[-1.5, -1.5, -1.5],
> [ 1.5, 1.5, 1.5]])
>>>> a - a.mean(1)[:,np.newaxis] #add axis back in that we lost in reduce operation mean, equivalent with [:,None]
> array([[-1., 0., 1.],
> [-1., 0., 1.]])
>
> here are some explanations:
> http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
> http://www.scipy.org/EricsBroadcastingDoc
>
> In your code the line with a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
> is not necessary, but adding the newaxis if the mean is taken with
> respect to an axis > 0
> This should also make it quite a bit faster.
>
> I don't understand why you use anonymous function in matlab and lambda
> in python, but maybe that's because this is out of context. It's not
> easy to see what the actual flow of execution is, or maybe it's
> because I'm not used to having several statements on a line separated
> by semicolon.
I take this back, it's a bit difficult to read but I have a similar
pattern of code in matlab for nested functions.
maybe something like this (typed in editor, so it might not work)
def t1():
m, n = 12, 123456
D = random.randn(m, n)
def f(X):
return sqrt(sum(X**2, 0))
print norm(f(D) - zm(f, D))
#or ?
#g = lambda D: zm(f, D)
#print norm(f(D)- g(D))
def zm(f, D):
m = mean(D, 1)[:, None]
return f(D- m)
>
> Josef
>
>>
>> (Code like above is handled quite well in matlab where
>> (e(matlab)/ e(octave)<~ .1), compared to (e(scipy)/ e(octave)<~ .5),
>> where e(x) is some measure of the execution time in x.)
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>
More information about the SciPy-User
mailing list