[SciPy-User] peer review of scientific software

Thomas Kluyver takowl@gmail....
Sun Jun 2 06:00:25 CDT 2013


On 2 June 2013 06:47, Matthew Brett <matthew.brett@gmail.com> wrote:

> The person who is trying to do work in Excel, that should be done in a
> programming language, needed that training.  They will be doing slower
> work and make more errors for the lack of a small amount of training.
>

I agree with the argument, but let's not understate the amount of learning
involved. Here, all new PhD students are given a seven-day intensive R
course, by a lecturer who's good enough at teaching R that he makes money
from running the course elsewhere. That covers the basics, but it certainly
doesn't mean that they can do anything in R that they would otherwise do in
Excel. And it doesn't even touch on version control or writing tests. I
found one of my labmates editing the copy of a modelling script that she'd
named 'foobar_DONOTEDIT', but I still couldn't persuade her to use version
control.

I think there's a fascinating question as to why people find Excel so much
easier than a 'real' programming language, even if they create really
complex spreadsheets. I think it's a combination of:

- Familiarity: people are taught spreadsheets, and often Excel
specifically, at school, whereas 'programming' is seen as a kind of geek
sorcery.
- Mingling code and data: I think it's conceptually harder to have your
data in one place and your analysis in another, even though that's
ultimately good practice.
- Seeing what you're doing: In Excel, you calculate something by putting a
formula in a cell. You press enter, and there's the result. In code, you
store it in a variable, and you have to explicitly ask for it to be
displayed. If you're calculating 1000 variables in a loop, then it's not
obvious from the display which one corresponds to which input.
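
To illustrate the last point, here is a minimal sketch in Python of keeping each result next to the input it came from, so that a loop over 1000 values can be displayed row by row, much like adjacent spreadsheet columns. The names `square_plus_one` and `tabulate` are made up for the example and stand in for whatever calculation is being done.

```python
# Hypothetical calculation; stands in for any per-input formula.
def square_plus_one(x):
    return x * x + 1


def tabulate(func, inputs):
    """Return (input, result) pairs so each result stays next to its input."""
    return [(x, func(x)) for x in inputs]


rows = tabulate(square_plus_one, range(1000))

# Show the first few rows side by side, like two spreadsheet columns.
for x, y in rows[:3]:
    print(f"{x:>6} {y:>8}")
```

Keeping the pairs together is only one option; a dictionary keyed by input would work as well if the inputs are unique.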

Can we mix some of that comfort with the robustness we're used to in
conventional code? E.g. I can imagine a different kind of spreadsheet tool,
where instead of putting formulae in cells, you define new columns and
tables, and where you can save the steps you've done to apply to another
data file in the same format. Perhaps it could even naturally progress to
real code so that it acts as a kind of gateway drug for programming.
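
As a rough sketch of that idea, and not a description of any existing tool, the following Python records "define a new column" steps and replays them on a second file in the same format. The class name `Recipe`, the helper `read_table`, and the column names `width`, `height`, `area` and `is_large` are all assumptions invented for this example.

```python
import csv
import io


class Recipe:
    """Records column definitions so they can be replayed on another table."""

    def __init__(self):
        self.steps = []  # list of (new_column_name, function_of_row)

    def add_column(self, name, func):
        """Record a step that derives a new column from each row."""
        self.steps.append((name, func))
        return self

    def apply(self, rows):
        """Apply every recorded step, in order, to a list of dict rows."""
        out = []
        for row in rows:
            new_row = dict(row)  # leave the original data untouched
            for name, func in self.steps:
                new_row[name] = func(new_row)
            out.append(new_row)
        return out


def read_table(text):
    """Parse CSV text into a list of dict rows (values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


# Define the steps once...
recipe = Recipe()
recipe.add_column("area", lambda r: float(r["width"]) * float(r["height"]))
recipe.add_column("is_large", lambda r: r["area"] > 10)

# ...then apply them to two data files with the same columns.
first = recipe.apply(read_table("width,height\n2,3\n4,5\n"))
second = recipe.apply(read_table("width,height\n1,1\n"))
```

The data stays separate from the analysis, but the steps are defined per column rather than per cell, which is the part that makes them reusable on the next file.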

Thomas