[Numpy-discussion] speeding up y[:, i] for y.shape = (big number, small number)
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Wed Oct 4 05:58:25 CDT 2006
Tim Hochberg wrote:
>>
>>
>> I guess the problem is coming from the fact that y being C order, y[:,
>> i] needs accessing data in a non 'linear' way. Is there a way to speed
>> this up ? I did something like this:
>>
>> y = N.zeros((K, n))
>> for i in range(K):
>> y[i] = gauss_den(data, mu[i, :], va[i, :])
>> return y.T
>>
>> which works, but I don't like it very much.
> Why not?
>
Mainly because using those transpose do not really reflect the
intention, and this does not seem natural.
>> Isn't there any other way
> That depends on the details of gauss_den.
>
> A general note: for this kind of microoptimization puzzle, it's much
> easier to help if you can post a self contained example, preferably
> something fairly simple that still illustrates the speed issue, that we
> can experiment with.
>
Here we are (the difference may not seem that much between the two
multiple_ga, but in reality, _diag_gauss_den is an internal function
which is done in C, and is much faster... By writing this example, I've
just realized that the function _diag_gauss_den may be slow for exactly
the same reasons):
#! /usr/bin/env python
# Last Change: Wed Oct 04 07:00 PM 2006 J
import numpy as N
from numpy.random import randn
def _diag_gauss_den(x, mu, va):
""" This function is the actual implementation
of gaussian pdf in scalar case. It assumes all args
are conformant, so it should not be used directly
Call gauss_den instead"""
# Diagonal matrix case
d = mu.size
inva = 1/va[0]
fac = (2*N.pi) ** (-d/2.0) * N.sqrt(inva)
y = (x[:,0] - mu[0]) ** 2 * inva * -0.5
for i in range(1, d):
inva = 1/va[i]
fac *= N.sqrt(inva)
y += (x[:,i] - mu[i]) ** 2 * inva * -0.5
y = fac * N.exp(y)
def multiple_gauss_den1(data, mu, va):
"""Helper function to generate several Gaussian
pdf (different parameters) from the same data: unoptimized version"""
K = mu.shape[0]
n = data.shape[0]
d = data.shape[1]
y = N.zeros((n, K))
for i in range(K):
y[:, i] = _diag_gauss_den(data, mu[i, :], va[i, :])
return y
def multiple_gauss_den2(data, mu, va):
"""Helper function to generate several Gaussian
pdf (different parameters) from the same data: optimized version"""
K = mu.shape[0]
n = data.shape[0]
d = data.shape[1]
y = N.zeros((K, n))
for i in range(K):
y[i] = _diag_gauss_den(data, mu[i, :], va[i, :])
return y.T
def bench():
#===========================================
# GMM of 30 comp, 15 dimension, 1e4 frames
#===========================================
d = 15
k = 30
nframes = 1e4
niter = 10
mode = 'diag'
mu = randn(k, d)
va = randn(k, d) ** 2
X = randn(nframes, d)
print "============================================================="
print "(%d dim, %d components) GMM with %d iterations, for %d frames" \
% (d, k, niter, nframes)
for i in range(niter):
y1 = multiple_gauss_den1(X, mu, va)
for i in range(niter):
y2 = multiple_gauss_den2(X, mu, va)
se = N.sum(y1-y2)
print se
if __name__ == '__main__':
import hotshot, hotshot.stats
profile_file = 'foo.prof'
prof = hotshot.Profile(profile_file, lineevents=1)
prof.runcall(bench)
p = hotshot.stats.load(profile_file)
print p.sort_stats('cumulative').print_stats(20)
prof.close()
I am a bit puzzled by all those C vs F storage, though. In Matlab, where
the storage was always F as far as I know, I have never encountered such
differences (eg between y(:, i) and y(:, i)); I don't know if this is
because I am doing it badly, or because matlab is much more clever than
numpy at handling those cases, or if that is the price to pay for the
added flexibility of numpy...
David
More information about the Numpy-discussion
mailing list