[IPython-User] database magics, saving notebooks in the database and smallest quantum of shareability etc
Mon Feb 25 04:11:40 CST 2013
Thanks for the clarifications re security etc. and your roadmap.
While Mongo or SQL databases are space inefficient, whether to use them or
not depends on whether you want the other things they provide. One doesn't
generally use them for *efficient* space storage.
More for integration into MVC webapp frameworks, existing reporting and
content management tools etc.
But if storage efficiency dominates your trade-offs then the file system is
probably the best.
On Sun, Feb 24, 2013 at 10:31 PM, Brian Granger <firstname.lastname@example.org> wrote:
> On Sat, Feb 23, 2013 at 4:38 PM, Nitin Borwankar <email@example.com> wrote:
> > I would like to connect with the IPython team at PyData 2013 in
> > next month as I have interest in doing the following and would like to
> > co-ordinate :-
> > a) want to create a robust plugin framework for database magics (SQL and
> > NoSQL)
> > I have Postgres working (%%PGDB) right now and aim to do MySQL and Mongo
> > (and anything else based on community feedback).
> > Essentially (after some database config) you run
> > "%PGDB <sql query>" in IPyNB
> > this executes the query on a PG instance (specified in your config) via some
> > voodoo involving the psql client (which needs to be locally installed) and
> > returns the result set in the output, displayed as a text-based table.
> > I am hoping to put out the PGDB magic on github before PyData.
> > Note that I make no attempt to parse SQL or understand the string - if you
> > mess up you will hear from the database at the other end as if you were
> > sitting at a console in a terminal. I am just the middle man.
> > Or rather %%PGDB is.
> > Some work is needed to return the result set as a Py dict that is
> visible in
> > the name space so it can be used by other code.
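A minimal sketch of how such a middle-man helper could shell out to psql and hand the result back as Python dicts (the function names, flags, and parsing here are chosen for illustration; this is not the author's actual %%PGDB code):

```python
import subprocess

def parse_psql(text):
    """Parse `psql -A` output (pipe-separated values, header row first,
    '(N rows)' footer) into a list of dicts keyed by column name."""
    lines = [line for line in text.strip().splitlines() if line]
    if lines and lines[-1].startswith("("):  # drop the '(N rows)' footer
        lines = lines[:-1]
    header = lines[0].split("|")
    return [dict(zip(header, row.split("|"))) for row in lines[1:]]

def run_pg(query, dbname="postgres"):
    """Run a query through the locally installed psql client
    (the 'voodoo' is just a subprocess call) and parse the output."""
    out = subprocess.run(["psql", "-d", dbname, "-A", "-c", query],
                         capture_output=True, text=True, check=True)
    return parse_psql(out.stdout)
```

Wrapped in an IPython cell magic, the cell body would become `query`, and the returned list could then be pushed into the user namespace for other code to use.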
> > There are some issues to deal with re: keeping a persistent connection so
> > that multiple requests don't open more and more connections - this is now
> > running at a basic (optimistic scenario) level but has not been tested
> > against all kinds of bad scenarios.
> > I have used standard "magics" metaphors and just dropped the code in the
> > right place and did some config. The integration with the IPyNB system is
> > not trivial, but it is straightforward enough that it is not rocket science.
> > I was very pleased it worked after some elementary brain twisting :-)
> > Kudos to the IPy team for making extensibility straightforward - this is a
> > HUGE WIN.
> > b) want to save JSON .ipynb in Mongo, Postgres key-value store and other
> > JSON stores instead of the filesystem in current directory.
> Our notebook manager class is designed to make it possible to add
> other storage backends. Right now we have a file system based one and
> another based on Azure blob storage. It shouldn't be too difficult to
> create a mongodb based one. However, I think Mongodb is a horrible
> choice to use for entire notebooks. Notebooks can be really big and
> mongodb is not very space efficient.
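For a sense of what a pluggable storage backend involves, here is an illustrative in-memory store with assumed method names (IPython's actual notebook manager API is richer than this sketch):

```python
import json

class InMemoryNotebookStore:
    """Toy storage backend: maps a notebook id to its serialized JSON.
    A mongodb- or Postgres-backed manager would implement the same
    save/load surface against the database instead of this dict."""
    def __init__(self):
        self._store = {}

    def save(self, notebook_id, nb_dict):
        # store the whole notebook document as one JSON blob
        self._store[notebook_id] = json.dumps(nb_dict)

    def load(self, notebook_id):
        return json.loads(self._store[notebook_id])
```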
> > c) want to be able to compose (note *compose* not edit) Notebooks via a
> > UI where a user can assemble content chunks (JSON in Mongo) or .... and
> > "publish" to .ipynb compliant JSON.
> > This will allow massive reuse of working content chunks especially those
> > that involve code examples and diagrams needing reproducibility.
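The compose-then-publish step could be sketched as flattening reusable chunks of cells into one notebook-shaped JSON document (the field names below follow the modern nbformat layout and are simplified, not guaranteed-valid .ipynb):

```python
def publish(chunks):
    """Assemble content chunks (each a list of cell dicts) into a single
    notebook-shaped JSON document, ready to be dumped to an .ipynb file."""
    cells = [cell for chunk in chunks for cell in chunk]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
```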
> > My motivation for doing this is
> > 1) IMHO, the filesystem based storage is subject to OS-level security issues,
> > and the database storage *may* (huge big MAY) provide some mitigation.
> > Caveat being don't naively assume that database security is the whole
> > answer.
> The big security issues related to the notebook are not the file
> system, but the fact that users can run arbitrary code. Throwing
> notebooks in a db won't help that at all. In fact, because users can
> run arbitrary code, you risk them being able to hack your mongodb instance.
> > 2) while the quantum of shareability right now is the single notebook, which
> > is awesome in itself, this can be taken even further. So if one wishes, a
> > notebook can be published as a sequence of quasi-atomic chunks which can
> > then be separately mixed and mashed. "Quasi-atomic" means that we need
> > one further level of granularity inside a notebook - with the boundaries being
> > orthogonal to actual content boundaries, i.e. we should not say, e.g.,
> > "use a horizontal rule" as a chunk marker - this is brittle etc.
> We have long term plans to think about allowing notebooks to be saved
> and loaded on a cell-by-cell basis, but this is pretty far off still.
> At this point, I think you should try to implement everything you want
> on your own outside of IPython proper. In the long term, you may find
> that IPython moves in some of these directions, but we are focused on
> other things right now.
> > 3) Content management becomes easy *in some respects* when the unit of
> > content is smaller.
> > Thanks for reading this far.
> > ------------------------------------------------------------------
> > Nitin Borwankar
> > firstname.lastname@example.org
> > _______________________________________________
> > IPython-User mailing list
> > IPython-User@scipy.org
> > http://mail.scipy.org/mailman/listinfo/ipython-user
> Brian E. Granger
> Cal Poly State University, San Luis Obispo
> email@example.com and firstname.lastname@example.org