[SciPy-User] zscore axis functionality is borked
Warren Weckesser
warren.weckesser@enthought....
Wed Nov 30 15:10:26 CST 2011
On Wed, Nov 30, 2011 at 3:05 PM, <josef.pktd@gmail.com> wrote:
> On Wed, Nov 30, 2011 at 4:02 PM, Warren Weckesser
> <warren.weckesser@enthought.com> wrote:
> >
> >
> > On Wed, Nov 30, 2011 at 2:54 PM, <josef.pktd@gmail.com> wrote:
> >>
> >> On Wed, Nov 30, 2011 at 3:45 PM, <josef.pktd@gmail.com> wrote:
> >> > On Wed, Nov 30, 2011 at 3:25 PM, Alacast <alacast@gmail.com> wrote:
> >> >> axis=0 (the default) works fine. axis=1, etc, is clearly wrong. Am I
> >> >> misunderstanding how to use this, or is this a bug?
> >> >>
> >> >> In [16]: i = rand(4,4)
> >> >>
> >> >> In [17]: i
> >> >> Out[17]:
> >> >> array([[ 0.85367762, 0.25348857, 0.23572615, 0.50403358],
> >> >> [ 0.70199066, 0.81872151, 0.47357357, 0.20425537],
> >> >> [ 0.31042673, 0.25837984, 0.73550134, 0.57970176],
> >> >> [ 0.42828877, 0.60988596, 0.04059321, 0.73944219]])
> >> >>
> >> >> In [18]: zscore(i, axis=0)
> >> >> Out[18]:
> >> >> array([[ 1.30128758, -0.96195723, -0.52119142, -0.01453907],
> >> >> [ 0.59653471, 1.38544585, 0.39284654, -1.55756529],
> >> >> [-1.22271057, -0.94164388, 1.39942427, 0.37494213],
> >> >> [-0.67511172, 0.51815526, -1.27107939, 1.19716222]])
> >> >>
> >> >> In [19]: zscore(i[:,0])
> >> >> Out[19]: array([ 1.30128758, 0.59653471, -1.22271057, -0.67511172])
> >> >>
> >> >> In [20]: zscore(i[:,0])==zscore(i,axis=0)[:,0]
> >> >> Out[20]: array([ True, True, True, True], dtype=bool)
> >> >>
> >> >> In [21]: zscore(i, axis=1)
> >> >> Out[21]:
> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906],
> >> >> [-1.6379836 , -1.52125275, -1.86640069, -2.13571889],
> >> >> [-2.09968257, -2.15172946, -1.67460796, -1.83040754],
> >> >> [-1.29796925, -1.11637205, -1.68566481, -0.98681582]])
> >> >> #The above is obviously wrong, as everything has a negative z score
> >> >>
> >> >> In [22]: zscore(i[0,:])
> >> >> Out[22]: array([ 1.56824016, -0.83321371, -0.90428403, 0.16925757])
> >> >>
> >> >> In [23]: zscore(i[0,:])==zscore(i,axis=1)[0,:]
> >> >> Out[23]: array([False, False, False, False], dtype=bool)
> >> >> #Using axis=1 produces different results from taking a row directly.
> >> >>
> >> >> In [24]: zscore(i, axis=-1)
> >> >> Out[24]:
> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906],
> >> >> [-1.6379836 , -1.52125275, -1.86640069, -2.13571889],
> >> >> [-2.09968257, -2.15172946, -1.67460796, -1.83040754],
> >> >> [-1.29796925, -1.11637205, -1.68566481, -0.98681582]])
> >> >> #Getting rows by using axis=-1 is no better (this is the same result as axis=1)
> >> >
> >> > This looks like a serious bug to me. I don't know what happened here (.
> >> >
> >> > The docstring example also has negative numbers only.
> >> >
> >> > ???
> >> >
> >> > I'm looking into it
> >> >
> >> > Thanks for reporting
> >>
> >> a misplaced axis: if axis>0
> >> then it calculates x - mean/std instead of (x - mean) / std
> >>
> >> now, how did this go through the testing ?
> >
> >
> >
> >
> > There is only one test for zscore, on a 1-d sample without the axis keyword.
>
> which just shows that we shouldn't trust changesets that say
>
> "stats: rewrite of zscore functions, ticket:1083 regression tests
> pass, still need tests for enhancements"
>
> http://projects.scipy.org/scipy/changeset/6169
>
> my mistake (maybe January 2nd wasn't a good day.)
>
> Josef
>
>
Thanks for the link. Looks like zmap has the same bug. :(
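
For anyone reading this in the archive, here is a minimal sketch of the precedence slip Josef describes. It is a reconstruction from his description, not a copy of the scipy source: the names zscore_buggy and zscore_fixed are made up for illustration, and the fixed version assumes a NumPy recent enough to support keepdims. The array is the one from the original report.

```python
import numpy as np

# Data copied from Out[17] in the report.
x = np.array([[0.85367762, 0.25348857, 0.23572615, 0.50403358],
              [0.70199066, 0.81872151, 0.47357357, 0.20425537],
              [0.31042673, 0.25837984, 0.73550134, 0.57970176],
              [0.42828877, 0.60988596, 0.04059321, 0.73944219]])


def zscore_buggy(a, axis=0):
    """Hypothetical reconstruction of the bug: for a nonzero axis,
    mean/std is evaluated first, so the result is a - (mean / std)."""
    a = np.asarray(a)
    mns = a.mean(axis=axis)
    sstd = a.std(axis=axis)
    if axis and mns.ndim < a.ndim:
        # Misplaced parenthesis: the division binds tighter than the subtraction.
        return a - np.expand_dims(mns, axis=axis) / np.expand_dims(sstd, axis=axis)
    return (a - mns) / sstd


def zscore_fixed(a, axis=0):
    """Illustrative fix: keepdims makes mean and std broadcast along
    any axis, and the subtraction is parenthesized before dividing."""
    a = np.asarray(a)
    mns = a.mean(axis=axis, keepdims=True)
    sstd = a.std(axis=axis, keepdims=True)
    return (a - mns) / sstd


# Compare with Out[21] (buggy, all negative) and Out[22] (row taken directly).
print(zscore_buggy(x, axis=1)[0])
print(zscore_fixed(x, axis=1)[0])
```

A test that compares zscore(x, axis=1)[0] against zscore(x[0]) would have caught this, and the same check should apply to zmap.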
Warren