[SciPy-User] peer review of scientific software

Matthew Brett matthew.brett@gmail....
Sun Jun 2 14:51:00 CDT 2013


Hi,

On Sun, Jun 2, 2013 at 11:38 AM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Sun, Jun 2, 2013 at 12:00 PM, zetah <otrov@hush.ai> wrote:
>>
>> Thomas Kluyver wrote:
>> >'type of users' might have been a more accurate phrase, but it has an
>> >unfortunate negative ring that I wanted to avoid. There are a lot of
>> > people
>> >doing important data analysis in quite risky and hard-to-maintain ways.
>> >Using spreadsheets where some simple code might be more reliable is one
>> >symptom of that, and there have been a couple of major examples from
>> >economics where spreadsheet errors led to serious mistakes.
>> >The discussion is revolving roughly around whether and how we can push
>> >those users towards better tools and methods, like coding, version
>> > control
>> >and testing.
>>
>> Thanks for the overview, Thomas. I read all the emails on the subject and
>> will comment briefly, for the sake of participating, although the topic is
>> huge.
>> I don't have experience with critical modeling, but I do data analysis, and
>> am still learning it, both with historical data and in general.
>>
>> If we are talking about errors, I think that most of them, as taught in a
>> numerical analysis course, are due to the human factor: not understanding
>> data types, and not accounting for the variety of data sources that
>> represent data differently. A trivial example is that SQL and netCDF
>> databases represent the same data in different formats. The same goes for
>> other data sources, which in turn can be just plain text dumps. If that is
>> handled correctly and the user is familiar with the tool used, there
>> shouldn't be any surprises.
>
>
> At least when no one checks ;) The errors that the gods of analysis gift to
> us are often hidden away and are easy to overlook. They also tend to creep
> in when one is overconfident. It's all part of the divine sense of humor.

Yes - when no-one checks!

I wish I still shared the feeling that mostly when I do stuff it's
correct, or mostly correct, or correct enough.  It was only when I
started checking that I started to worry.  I well remember the happier
times when I'd write a 100-line analysis script with no tests and be
"pretty sure" that it was correct.

Cheers,

Matthew

