[SciPy-user] Numeric array functionality for cloning matlab's zscore(matrix)

David M. Cooke cookedm at physics.mcmaster.ca
Thu Aug 11 13:50:57 CDT 2005

Syd Diamond <syd.diamond at gmail.com> writes:

> Long time scipy fan, first time poster :)
> Anyways, I've always dealt with list objects in the past, but as
> strange as this might sound, I'm not up to par on manipulating array
> objects.  I need eigenvector functionality for a project I'm working
> on, and so I started to work with the arrays more directly.
> The limitations of my expertise became clear when my clone of the
> simple zscore matlab function got me the right results but in a very
> ugly manner.  For one, when I declared a matrix of zeros, when I
> defined elements as X[i,j] I could not change the type from int to
> float.
> In [10]: X = zeros((2,2))
> In [11]: X[0,0] = 1.5   
> In [12]: X[0,0]
> Out[12]: 1

zeros, like most (all?) of the array constructors, takes a 'typecode'
argument (or 'type' in numarray). You'll want

In [1]: import Numeric
In [2]: X = Numeric.zeros((2,2), typecode=Numeric.Float)
In [3]: X[0,0] = 1.5
In [4]: X[0,0]
Out[4]: 1.5

In your code below you construct Z, so you'd probably want to match
X's typecode:

Z = Numeric.zeros((2,2), typecode=X.typecode())
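
For readers on current tools: Numeric has since been superseded by
NumPy, where (as far as I know) the 'typecode' argument corresponds to
'dtype'. The following is a minimal sketch under that assumption, not
part of the original Numeric session. Note that NumPy's zeros defaults
to float, so the integer case has to be requested explicitly to
reproduce the truncation.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

# An integer array silently truncates floats on assignment.
X_int = np.zeros((2, 2), dtype=int)
X_int[0, 0] = 1.5
print(X_int[0, 0])   # 1

# Asking for a float dtype up front keeps the fractional part.
X = np.zeros((2, 2), dtype=float)
X[0, 0] = 1.5
print(X[0, 0])       # 1.5

# Build Z with the same shape and element type as X.
Z = np.zeros(X.shape, dtype=X.dtype)
```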

> Well, that's just one thing.  You'll see that my code below is ugly
> (although it gives me the right result).  I researched online, looked
> at the source code for various scripts using Numeric, and tried to
> find an irc channel, but I never got what I needed.
> ** Can someone please help me better understand arrays and perhaps
> suggest a better way to implement this simple zscore function?  Thank
> you.

The Numeric documentation is at http://numeric.scipy.org/ . You may
also want to look at the numarray documentation: it should for
the most part be similar to Numeric, and may explain some points better.

>>> help zscore
>  ZSCORE Standardized z score.
>     Z = ZSCORE(X) returns a centered, scaled version of X, known as the Z
>     scores of X.  For a vector input, Z = (X - MEAN(X)) ./ STD(X).  For a
>     matrix input, Z is a row vector containing the Z scores of each column
>     of X.  For N-D arrays, ZSCORE operates along the first non-singleton
>     dimension.
> MY CODE (which matches the matlab results, but is _ugly_)
> from Numeric import array, shape, transpose, zeros
> from scipy import cov, mean, std
> A = array([[1, 2.3, 3],[1.1, 2.2, 2.9],[0.9, 1.9, 3.3]])
> def zscore(X):
>   #Z = zeros(shape(X)) # can't get rid of the ints
>   Z = X.__deepcopy__(X)
>   X = transpose(X)
>   for i in xrange(shape(X)[0]):
>     for j in xrange(shape(X)[1]):
>       Z[j,i] = float((X[i,j] - mean(X[i])) / std(X[i]))
>   return Z 
> print zscore(A)

from Numeric import array
from scipy import mean, std

A = array([[1, 2.3, 3],[1.1, 2.2, 2.9],[0.9, 1.9, 3.3]])

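def zscore(X):
    # mean(X) and std(X) reduce over the first axis (one value per
    # column); broadcasting then applies them to every row of X.
    Z = (X - mean(X)) / std(X)
    return Z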

print zscore(A)

[[ 0.        , 0.80064077,-0.32025631,]
 [ 1.        , 0.32025631,-0.80064077,]
 [-1.        ,-1.12089708, 1.12089708,]]
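
The same computation can be sketched with NumPy, the successor to
Numeric. This is an assumption-laden translation rather than the code
from this thread: NumPy's mean and std reduce over all elements by
default, so axis=0 is passed explicitly, and ddof=1 (sample standard
deviation, dividing by N-1) is chosen because it reproduces the
numbers printed above.

```python
import numpy as np  # assumption: NumPy stands in for Numeric/scipy here

A = np.array([[1, 2.3, 3], [1.1, 2.2, 2.9], [0.9, 1.9, 3.3]])

def zscore(X):
    # axis=0 reduces over the rows, giving one mean and std per column.
    # ddof=1 divides by N-1, which matches the output posted above.
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

Z = zscore(A)
print(Z)
```

Each column of Z should then have mean 0 and sample standard
deviation 1, up to rounding.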

You'll have to understand array broadcasting to see how that works.
If X has shape (5,10), then mean(X) has shape (10,). When the two
are subtracted, mean(X) is expanded (not really, but it acts that way)
to shape (5,10), as if its single row were replicated 5 times. The two
can then be subtracted elementwise. The same applies when dividing by
std(X).
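
To make the broadcasting step concrete, here is a small sketch using
NumPy (an assumption; the thread itself uses Numeric). It builds an
array of shape (5,10), replicates the row of column means by hand with
np.tile, and checks that plain broadcasting gives the same result
without making the copy. The array contents are arbitrary
illustration data.

```python
import numpy as np  # assumption: NumPy stands in for Numeric here

X = np.arange(50, dtype=float).reshape(5, 10)  # arbitrary example data
m = X.mean(axis=0)                             # one mean per column
print(X.shape, m.shape)   # (5, 10) (10,)

# Explicitly replicate the row of means 5 times to get shape (5, 10).
m_tiled = np.tile(m, (5, 1))

# Broadcasting behaves as if that replication had happened.
same = np.array_equal(X - m, X - m_tiled)
print(same)               # True
```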

|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca
