Robert Kern robert.kern at gmail.com
Sat Sep 23 19:18:03 CDT 2006

Ryan Krauss wrote:
> How would I write good tests?

Whole books have been written on this subject, so I will only try to distill
the advice given in Greg Wilson's excellent Software Carpentry lectures,
available here:

http://swc.scipy.org/
http://swc.scipy.org/lec/unit.html

The first kind of tests you need to write are "Did I write my algorithm
correctly?" tests (or as I like to call them, "bozo tests"). Test for everything
that could possibly go wrong. For this wave, you can assume that setting an
attribute works, for example. Check that the obvious boundary cases work. Check
that simple cases that you can do in your head work. Check the general case
output for consistency if it is untenable to give the precise value in the test
case (e.g. if you were testing a sorting function, generate a large shuffled
list, sort it, and check that all of the values are increasing). Check that
input which is supposed to fail, fails in the way that you expect. If there are
analytical solutions to simple inputs that the numerical solution should be able
to match within a given accuracy, check that they do.
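
As a rough sketch of what such "bozo tests" might look like, here is a small
unittest case for the sorting example. The function my_sort() is hypothetical
and exists only so that there is something to test; the expected values are
all worked out by hand rather than copied from a run.

```python
import random
import unittest


def my_sort(values):
    """Hypothetical function under test: a plain insertion sort."""
    result = []
    for v in values:
        i = len(result)
        # Walk left until we find the slot where v belongs.
        while i > 0 and result[i - 1] > v:
            i -= 1
        result.insert(i, v)
    return result


class TestMySort(unittest.TestCase):
    def test_boundary_cases(self):
        # Obvious edge cases: nothing to sort, and one item.
        self.assertEqual(my_sort([]), [])
        self.assertEqual(my_sort([7]), [7])

    def test_simple_case_by_hand(self):
        # Small enough to do in your head.
        self.assertEqual(my_sort([3, 1, 2]), [1, 2, 3])

    def test_general_case_is_ordered(self):
        # Too big to write out, so check consistency instead.
        rng = random.Random(0)  # fixed seed keeps the test repeatable
        data = list(range(1000))
        rng.shuffle(data)
        out = my_sort(data)
        self.assertEqual(len(out), len(data))
        for a, b in zip(out, out[1:]):
            self.assertLessEqual(a, b)

    def test_bad_input_fails_as_expected(self):
        # Mixing ints and strings cannot be ordered in Python 3.
        with self.assertRaises(TypeError):
            my_sort([1, "a"])
```

Running the file's tests with "python -m unittest" should exercise all four
cases.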

One thing that you should never do at this point is run the code and copy its
output. You are trying to verify that the code you wrote is correct. If you
can't specify what the output for these tests should be without running the
code, then you don't understand the algorithm.

At this point, you should stop and take a look at your tests. Make sure they
cover everything that needs to be covered. At the very bare minimum, every line
of code under test should be exercised. There are several tools for helping you
do this; coverage.py is fairly canonical:

http://www.nedbatchelder.com/code/modules/coverage.html

but I like the HTML reports and command-line options of trace2html better:

http://cheeseshop.python.org/pypi/trace2html

The algorithms for determining what lines are "executable" and which lines
actually got executed are not perfect, so you might want to try both.

If you find that some paragraph of code should be tested separately,
but it's in the middle of a method, then now would be a good time to refactor it
out into a separate method. When writing new code, you should be asking
yourself, "How can I test this?" and writing your code such that it is easy to
test. This is called Designing for Testability.

If you find that you rely on other components, you might want to stub them out
with mock objects that return hard-coded values. For example, you might want to
replace poly1d() in order to test just the bode() function. Mock objects can
also be written to record operations that are performed on them in order to test
that your method actually did certain things in the middle of the method.
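
A minimal sketch of the stubbing idea, using the standard library's
unittest.mock. The gain() function and its poly_factory argument are
hypothetical; they are not the real bode() or poly1d(), just a small stand-in
that depends on a polynomial constructor in a similar way.

```python
from unittest import mock


def gain(num, den, s, poly_factory):
    """Hypothetical unit under test: evaluate num(s) / den(s).

    poly_factory stands in for a dependency such as poly1d. It is passed
    in explicitly so that a test can substitute a stub.
    """
    return poly_factory(num)(s) / poly_factory(den)(s)


def test_gain_with_stubbed_polynomials():
    # Each "polynomial" is a mock that returns a hard-coded value.
    num_poly = mock.Mock(return_value=6.0)
    den_poly = mock.Mock(return_value=3.0)
    # The factory hands out the two stubs in order.
    factory = mock.Mock(side_effect=[num_poly, den_poly])

    result = gain([1, 2], [1, 0, 4], 2.0, poly_factory=factory)

    # Only the division logic of gain() is being tested here.
    assert result == 2.0
    # The mocks also recorded what was done to them.
    assert factory.call_args_list == [mock.call([1, 2]), mock.call([1, 0, 4])]
    num_poly.assert_called_once_with(2.0)
    den_poly.assert_called_once_with(2.0)
```

Passing the dependency in as an argument is only one way to make it
replaceable; mock.patch can swap out a module-level name instead.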

Tests written with this kind of focus in mind are called "unit tests" because
each test case just tests the specific units involved. Tests which are written
in order to test the interaction of several units are called "integration
tests". Sadly, the term "unit test" often gets thrown around pretty freely to
mean any kind of test. The Python unittest module is really a framework for
doing all of these tests.

The next kind of tests that you should consider are "regression tests." These
tests are written to make sure that future changes to the code haven't affected
anything important. If your bozo tests were thorough, you probably won't have to
add much, if anything, here. Now is the time to fill in things which "couldn't
possibly go wrong" but are part of the interface. *Now* you can run your code
and copy the output because you've already verified that the implementation is
correct (right?). If you didn't get around to separating your tests into clean
unit tests, now's the time to write tests that give you some protection (well,
not protection, but perhaps diagnostics) against changes in other units.
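
A regression test in this sense can be as simple as pinning down output that
has already been verified. In the sketch below, summarize() is a hypothetical
function and KNOWN_SUMMARY is assumed to have been copied from a run of the
implementation after the earlier tests passed.

```python
def summarize(values):
    """Hypothetical function whose behavior was already verified by hand."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Output copied from a run of the already-verified implementation.
KNOWN_SUMMARY = {"n": 4, "min": 1, "max": 4, "mean": 2.5}


def test_summarize_regression():
    # Any later change that alters the keys or values will trip this.
    assert summarize([1, 2, 3, 4]) == KNOWN_SUMMARY
```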

*Make sure that everything that is guaranteed by the public interface is correct.*

*Any semantic changes to your code should cause one or more of your tests to fail.*

As for the specifics of writing tests in scipy, there are several utilities in
numpy.testing that you should use for making assertions:

assert_equal -- assert equality
assert_almost_equal -- assert equality with decimal tolerance
assert_approx_equal -- assert equality with significant digits tolerance
assert_array_compare -- assert arrays obey some comparison relationship
assert_array_equal -- assert arrays equality
assert_array_almost_equal -- assert arrays equality with decimal tolerance
assert_array_less -- assert arrays less-ordering

These functions will print out useful information when the assertions fail.

# Bad!
self.assert_((computed == known).all())

# Good!
assert_array_equal(computed, known)
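
To see the difference in practice, the following sketch (assuming numpy is
installed) triggers both kinds of failure on purpose and captures the
messages. The arrays are made-up values chosen so that exactly one element
differs.

```python
import numpy as np
from numpy.testing import assert_array_almost_equal, assert_array_equal

computed = np.array([1.0, 2.0, 3.5])
known = np.array([1.0, 2.0, 3.0])

# A bare boolean check says nothing about which element differed.
try:
    assert (computed == known).all()
except AssertionError as exc:
    bare_message = str(exc)

# assert_array_equal describes the mismatch in its error message.
try:
    assert_array_equal(computed, known)
except AssertionError as exc:
    detailed_message = str(exc)

# Floating-point results usually need a tolerance rather than equality.
# 0.1 + 0.2 is not exactly 0.3, but it agrees to 12 decimal places.
assert_array_almost_equal(np.array([0.1]) + 0.2, [0.3], decimal=12)
```

Printing detailed_message shows the report that a failing test would give you.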

You can look at the tests that I just added for interp1d for some examples. Of
course, your bode() method is simple enough that your tests won't be nearly as
extensive as those. I'd recommend moving the implementations of lsim() and
lsim2() into the methods and making the functions simply call the method on the
system object (or construct the system object and call the method). Whichever
way you go, the one that calls the other only needs to test that it produces the
same output as the "real" implementation. The interface guarantee that they are
exposing is really "I call the real thing" rather than "I simulate the output of
a continuous-time linear system".
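
As a sketch of that last point, here is one way the tests for a thin wrapper
might look. LinearSystem and simulate() are hypothetical stand-ins, not the
actual scipy classes or the real lsim(); the "simulation" is just a scaling so
that the example stays short.

```python
from unittest import mock


class LinearSystem:
    """Hypothetical stand-in for the system object."""

    def __init__(self, gain):
        self.gain = gain

    def simulate(self, inputs):
        # The "real" implementation lives on the method.
        return [self.gain * u for u in inputs]


def simulate(system, inputs):
    """Module-level convenience wrapper that delegates to the method."""
    return system.simulate(inputs)


def test_function_matches_method():
    # The wrapper only promises to give the same answer as the method.
    system = LinearSystem(gain=2.0)
    inputs = [0.0, 1.0, 2.5]
    assert simulate(system, inputs) == system.simulate(inputs)


def test_function_delegates_to_method():
    # With a mock system we can check "I call the real thing" directly.
    system = mock.Mock()
    out = simulate(system, [1.0])
    system.simulate.assert_called_once_with([1.0])
    assert out is system.simulate.return_value
```

The thorough numerical tests would then live with the method, and the wrapper
needs nothing more than these two checks.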

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma