[SciPy-User] peer review of scientific software

josef.pktd@gmai...
Sun Jun 2 06:49:47 CDT 2013


On Sun, Jun 2, 2013 at 7:00 AM, Thomas Kluyver <takowl@gmail.com> wrote:
> On 2 June 2013 06:47, Matthew Brett <matthew.brett@gmail.com> wrote:
>>
>> The person who is trying to do work in Excel that should be done in a
>> programming language needed that training. They will be doing slower
>> work, and making more errors, for the lack of a small amount of training.
>
>
> I agree with the argument, but let's not understate the amount of learning
> involved. Here, all new PhD students are given a seven day intensive R
> course, by a lecturer who's good enough at teaching R that he makes money
> from running the course elsewhere. That covers the basics, but it certainly
> doesn't mean that they can do anything in R that they would otherwise do in
> Excel. And it doesn't even touch on version control or writing tests. I
> found one of my labmates editing the copy of a modelling script that she'd
> named 'foobar_DONOTEDIT', but I still couldn't persuade her to use version
> control.
>
> I think there's a fascinating question as to why people find Excel so much
> easier than a 'real' programming language, even if they create really
> complex spreadsheets. I think it's a combination of:
>
> - Familiarity: people are taught spreadsheets, and often Excel specifically,
> at school, whereas 'programming' is seen as a kind of geek sorcery.
> - Mingling code and data: I think it's conceptually harder to have your data
> in one place and your analysis in another, even though that's ultimately
> good practice
> - Seeing what you're doing: In Excel, you calculate something by putting a
> formula in a cell. You press enter, and there's the result. In code, you
> store it in a variable, and you have to explicitly ask for it to be
> displayed. If you're calculating 1000 variables in a loop, then it's not
> obvious from the display which one corresponds to which input.

The last point is where I still use Excel or OpenOffice Calc: visual
inspection of a larger amount of heterogeneous data.

For another area where Excel use is still very heavy, see
http://robertkugel.ventanaresearch.com/2013/01/29/the-spreadsheet-and-the-whale/
via http://blog.enthought.com/?p=113067
(the advantages and perils of using Excel when you bet a few million
dollars on the outcome.)

>
> Can we mix some of that comfort with the robustness we're used to in
> conventional code? E.g. I can imagine a different kind of spreadsheet tool,
> where instead of putting formulae in cells, you define new columns and
> tables, and where you can save the steps you've done to apply to another
> data file in the same format. Perhaps it could even naturally progress to
> real code so that it acts as a kind of gateway drug for programming.

Stata has a very good combination: you point and click for the commands
that do the statistics or data handling, and the commands are then
printed to the console. The results can be seen in the console or the
dataframe viewer. (In MATLAB the plot wizard works similarly: point and
click, then save to script.) It's easy to build up a collection of
reproducible, reusable scripts this way.
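
A minimal sketch of that record-then-replay idea in plain Python, for
illustration only. It assumes each recorded step is just "define a new
column from the existing ones"; the names STEPS, apply_steps and load_csv,
and the column names, are hypothetical and not taken from Stata or any
other tool.

```python
import csv
import io

# Each recorded step is a (new_column_name, function_of_row) pair, roughly
# what a point-and-click tool might log. All names here are hypothetical.
STEPS = [
    ("area", lambda row: float(row["width"]) * float(row["height"])),
    ("is_large", lambda row: row["area"] > 10.0),
]


def apply_steps(rows, steps):
    """Return copies of the rows with each derived column added in order."""
    result = []
    for row in rows:
        new_row = dict(row)  # leave the input data untouched
        for name, func in steps:
            new_row[name] = func(new_row)
        result.append(new_row)
    return result


def load_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Two data files in the same format; the same saved steps apply to both.
first = load_csv("width,height\n2,3\n4,5\n")
second = load_csv("width,height\n1,1\n10,2\n")

for table in (apply_steps(first, STEPS), apply_steps(second, STEPS)):
    for row in table:
        print(row)
```

The steps live apart from the data, so the same list can be rerun on any
file with the same columns, and every derived value stays next to the
inputs it came from.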

This was great for me as a beginner, and still is when I use parts that I
don't know or remember (or whose syntax and options I've forgotten).
In contrast, a new plot in matplotlib takes a few hours of reading
documentation and googling for examples.

Josef

>
> Thomas
>