[IPython-dev] Project idea: Automatic lab notebook for iPython

Peter Macko pmacko@eecs.harvard....
Fri Apr 19 16:08:58 CDT 2013


Hi Nitin,

Thanks for the suggestion. Version control systems such as git would solve a good chunk of the problem we are trying to address, but not all of it. Git records files at user-selected points in time and stores them with user-supplied annotations, but it does not record how those files were produced (which programs produced them, what parameters were used, and so on) unless users manually write this into their commit messages. That said, git in iPython is a cool idea.
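
A minimal sketch of the kind of record that git alone would not capture. The helper name record_provenance, the ".prov.json" sidecar file, and the field names are all made up for illustration; they are not part of git, iPython, or the Core Provenance Library.

```python
import hashlib
import json
import sys
import time
from pathlib import Path


def record_provenance(output_path, inputs, params):
    """Write a sidecar JSON file describing how output_path was produced.

    Hypothetical helper for illustration only.
    """
    output_path = Path(output_path)
    record = {
        # What was produced, identified by name and content hash.
        "output": output_path.name,
        "sha256": hashlib.sha256(output_path.read_bytes()).hexdigest(),
        # How it was produced: the program, its inputs, and its parameters.
        "program": sys.argv[0],
        "inputs": [str(p) for p in inputs],
        "params": params,
        "recorded_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    # Store the record next to the output, e.g. plot.png -> plot.png.prov.json
    sidecar = output_path.with_name(output_path.name + ".prov.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return record
```

Calling record_provenance("plot.png", ["data.csv"], {"bins": 50}) right after writing plot.png would leave plot.png.prov.json beside it. This is still the manual-annotation approach; the goal of the proposal is for iPython to capture the same information without the explicit call.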

-Peter


On Apr 18, 2013, at 8:45 PM, Nitin Borwankar wrote:

> Hi Peter,
> 
> Is there something in your concept of provenance that git or a similar dvcs does not cover?
> It would seem that the kind of granular history tracking built into git could be taken advantage of with a small amount of semantic glue on top to give you what you want.  Plus you get a large global user community. 
> There are well-defined APIs wrapping a git client lib that would make this feasible.  If this makes sense please feel free to connect via nborwankar on gmail.
> I've spent a non-trivial amount of time exploring how git could be used as a database-with-history for heterogeneous data.
> 
> Nitin Borwankar
> 
> 
> ------------------------------------------------------------------
> Nitin Borwankar 
> nborwankar@gmail.com
> 
> 
> On Tue, Apr 16, 2013 at 4:38 PM, Peter Macko <pmacko@eecs.harvard.edu> wrote:
> Hi iPython developers,
> 
> Here is a new project idea: automatic lab notebook for iPython and
> iPython Notebook, which would keep track of how each of your output
> files was produced, linking this "history" (or a "lineage") of an object
> across different iPython sessions and different iPython notebooks, and
> storing it persistently. This is frequently referred to in the Computer
> Science literature as "provenance."
> 
> It will enable you to ask questions like "what did I do to produce this
> plot?" - and for example, it will tell you that you downloaded the input
> data set on Monday from such and such website, you ran all these
> commands to process the data on Tuesday, and then produced this plot on
> Thursday from a different iPython session. Note that this goes beyond
> (and is complementary in purpose to) iPython Notebook, since the history
> of a file is tracked across different sessions and Notebooks, and when
> you ask a question, you will get only the relevant information,
> suppressing any additional things that you did that are unrelated to the
> file in which you are interested.
> 
> We are in touch with computational scientists all the way from
> bioinformatics to physics who are very interested in this feature! We
> met their needs partially by developing a cross-platform, multi-lingual
> library (https://code.google.com/p/core-provenance-library/) that they
> can use to annotate their Python (and non-Python) scripts in order to
> track the lineage of their objects.
> 
> Our vision is that this will be all done fully automatically, without
> requiring the users to manually annotate their scripts. But
> unfortunately neither of us involved in this project has the
> resources or the knowledge of the iPython code-base to tackle this
> challenge. We need your help to make this happen! We have some ideas
> about how we might go about this, but we need someone who knows more
> about iPython to talk them over and to spearhead the actual development.
> Please let us know if you can help!
> 
> Thank you,
> 
> Peter Macko
> 
> Harvard School of Engineering and Applied Sciences
> 33 Oxford St.
> Cambridge, MA 02138
> 
> _______________________________________________
> IPython-dev mailing list
> IPython-dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/ipython-dev
> 
