[SciPy-User] bugs in scipy.stats

josef.pktd@gmai... josef.pktd@gmai...
Tue Jun 11 13:21:18 CDT 2013


On Tue, Jun 11, 2013 at 1:58 PM,  <josef.pktd@gmail.com> wrote:
> On Tue, Jun 11, 2013 at 1:24 PM, Oleksandr Huziy <guziy.sasha@gmail.com> wrote:
>> Hi Josef,
>>
>> could you please list the functions that need to be tested, and link to
>> the testing approach you'd prefer me to use (unittest, nose, doctest)?
>> I am not an experienced tester but really want to help.
>
> Hi Sasha,
>
> I haven't run the test coverage on scipy.stats in a long time.
> This was my old list (2009), which is very outdated:
> https://github.com/scipy/scipy/issues/1554
>
> All tests are run with nose; scipy doesn't have doctests. The testing
> guidelines are at
> https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt
>
> The pattern for the tests can be seen in the test suite
> https://github.com/scipy/scipy/tree/master/scipy/stats/tests
> especially test_stats.py and test_morestats.py, and those for mstats
>
> One check that would also be very helpful is to try out different
> kinds of arguments.
> For example, I think there might still be problems with 2d arrays in
> some functions. Some will raise ValueErrors if they cannot handle 2d
> arrays, but some might just return incorrect numbers.
>
> Example: I never looked closely at `mood`, which has unit tests, so a quick try:
> >>> stats.mood(np.random.randn(10,2), np.random.randn(15,2))
> (26.664783935766987, 1.2060935978310698e-156)
> >>> stats.mood(np.random.randn(10), np.random.randn(15))
> (-0.46553454010068451, 0.64154870791874163)
>
> The first result looks pretty weird.
>
> In these cases we should add a `raise ValueError` or try to enhance it to 2d.
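
A check along these lines can be automated. The sketch below is not part of
scipy: `classify_2d_behavior` is a hypothetical helper name, and it assumes the
function under test takes two samples and returns a (statistic, pvalue) pair,
as `mood` does above. It compares the 2d call against calling the function
separately on each pair of columns.

```python
import numpy as np


def classify_2d_behavior(func, x, y, rtol=1e-10):
    """Classify how a two-sample test `func` handles 2d input.

    Hypothetical helper, not part of scipy. Assumes func(x, y) returns a
    (statistic, pvalue) pair and that x and y have the same number of columns.
    """
    x, y = np.asarray(x), np.asarray(y)

    # Reference values: run the test separately on each pair of columns.
    ref_stat, ref_pval = [], []
    for j in range(x.shape[1]):
        stat_j, pval_j = func(x[:, j], y[:, j])
        ref_stat.append(float(stat_j))
        ref_pval.append(float(pval_j))

    try:
        stat, pval = func(x, y)
    except ValueError:
        return "raises"  # 2d input is rejected explicitly

    stat = np.atleast_1d(np.asarray(stat, dtype=float))
    pval = np.atleast_1d(np.asarray(pval, dtype=float))
    if (stat.shape == (x.shape[1],)
            and np.allclose(stat, ref_stat, rtol=rtol)
            and np.allclose(pval, ref_pval, rtol=rtol)):
        return "column-wise"  # one test per column
    return "other"  # runs, but not column by column: needs a closer look
```

With 2d arrays `x` and `y` as in the example above,
`classify_2d_behavior(stats.mood, x, y)` would report which of the three cases
applies for the installed scipy version; the answer is not assumed here. Only
"other" points to a function that needs a `raise ValueError` or a 2d fix.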

If you check a function and it works as advertised, then this would
also be good to know.
We still have an old milestone for the stats review, where we can also
note that everything is fine:
https://github.com/scipy/scipy/issues?milestone=4&state=open

Or open a new issue and note the functions that you checked there, so
we have a record.

Other hypothesis tests that I never looked at in detail, and tried out
only with "nice" numbers, are
fligner, ansari, bartlett, ...

>>> x, y = np.random.randn(10,2), np.random.randn(15,2)
>>> stats.bartlett(x, y)
(4.7695839486013287e-05, 0.99448967952524281)
>>> stats.bartlett(x.ravel(), y.ravel())
(0.00010231554134127248, 0.99192944393408666)

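This also looks wrong: the 2d call returns a single statistic, and it
differs from the result on the flattened data.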

(The t-tests are vectorized; the ks tests raise an exception somewhere in the
code with 2d input.)
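
For comparison, this is roughly what "vectorized" means for the t-tests. It is
a minimal sketch, assuming `stats.ttest_ind` with its default `axis=0`: the 2d
call gives one result per column, equal to the separate 1d calls. The seed and
array shapes are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(12345)
x, y = rng.randn(10, 2), rng.randn(15, 2)

# 2d call: ttest_ind works along axis 0 by default, one test per column.
t2d, p2d = stats.ttest_ind(x, y)

# Reference: the same test run separately on each pair of columns.
per_column = [stats.ttest_ind(x[:, j], y[:, j]) for j in range(x.shape[1])]
t1d = np.array([t for t, p in per_column])
p1d = np.array([p for t, p in per_column])

# Both checks pass silently if the 2d call is truly column-wise.
np.testing.assert_allclose(t2d, t1d)
np.testing.assert_allclose(p2d, p1d)
```

The same comparison applied to `bartlett` or `mood` is one way to tell whether
a 2d result is per column or something else.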

Josef


>
> Thank you,
>
> Josef
>
>>
>> Cheers
>> --
>> Oleksandr (Sasha) Huziy