[IPython-dev] [sympy] Re: using reST for representing the notebook cells+text

Robert Kern robert.kern@gmail....
Wed Feb 24 12:57:12 CST 2010

On Wed, Feb 24, 2010 at 12:39, Brian Granger <ellisonbg@gmail.com> wrote:
>>> * No one wants to edit XML by hand.
>> I don't want to write Python code in that tree style by hand, either.
> Yes, for the most part, but the argument for reST is that people can
> edit it by hand.  Python is closer to that, but I agree that most people
> probably won't do that.

No, Python is *not* closer to reST than XML is. reST is very different
from either XML or the Python object tree. The latter two are much more
similar to each other than they are to reST.

>> Too many things to get wrong. I don't see anyone editing notebooks
>> outside of the notebook apps, honestly. Does anyone edit Mathematica
>> notebooks in a text editor? I don't think the success of Mathematica's
>> notebook is the file format; it's the notebook GUI itself.
>>> * The result is not importable in a regular Python shell.
>> <shrug> So you use a function call instead.
> True, that works for getting an in memory rep of the notebook
> (parsing).  But the importability feature of a pure python notebook
> format has a broader impact on the design.  This is related to
> validation and extensibility.
> validation: not all XML files are valid notebook files.  Thus, you
> have to validate the XML at some level.  This means either 1)
> developing a schema (which is very rigid) or 2) doing the validation
> in Python once you parse the XML.  In the case of a pure python
> notebook, validation is done by Python itself.  If the notebook can be
> exec'd or imported, it is a valid notebook.

And this is what I think is most wrong with using Python to do this.
It's too flexible and too easy to create things that are incompatible
with other representations. It's fine to build things that are
incompatible (and I wouldn't want to prevent you from creating a
Notebook object that was incompatible with some of its
representations), but the main file format is a bad place to make that

> extensibility: using XML requires specifying the XML schema in a
> central location.  That schema is fixed and thus hard to extend.

The X is there for a reason. :-)  It stands for eXtensible. It really
isn't hard to add tags or to write systems that do sensible things
when they see tags that they don't handle. This is exactly what XML
was designed for.
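
As a minimal sketch of that point, assuming a made-up <notebook>
document whose children are cells: the reader below knows only two tags
and sets aside anything else instead of failing. The tag names and the
read_cells() helper are hypothetical, not part of any agreed notebook
format.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names, chosen only for illustration.
KNOWN_TAGS = {"code", "text"}

def read_cells(xml_source):
    """Split the children of <notebook> into handled and skipped cells.

    Returns two lists of (tag, text) pairs. Unknown tags are kept in the
    second list so a caller can preserve or report them.
    """
    root = ET.fromstring(xml_source)
    handled, skipped = [], []
    for element in root:
        target = handled if element.tag in KNOWN_TAGS else skipped
        target.append((element.tag, element.text or ""))
    return handled, skipped

SAMPLE = """<notebook>
  <code>x = 1</code>
  <figure>added later by an extension</figure>
  <text>Some prose.</text>
</notebook>"""

handled, skipped = read_cells(SAMPLE)
```

Here the <figure> element ends up in skipped while the two known cells
are read normally, so an older reader keeps working on a newer file.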

> Using Python allows the notebook format to be extended by simple
> subclassing:
> class MatplotlibFigureCell(Cell):

<mpl:cell extends="cell"></mpl:cell>

> This subclass could contain all the logic for representing the
> matplotlib figure in different formats: html, jpeg, svg, native pyqt
> gui, etc and could be distributed with matplotlib.
> Sure, if your notebook uses the MatplotlibFigureCell, you will need
> matplotlib, but the validation of that aspect is handled by python
> itself, not an XML schema.

*You don't need XML schemas to validate XML.* They are often entirely
superfluous. Now, you will need a Python API that builds Notebook
instances with Cells and all that jazz. And you will need a way to
build that object tree from XML. *That* is your validation. Algorithm
succeeds == valid file. The same is actually true for any other

> Doing these types of things using XML would require all of this to be
> put into the centralized XML schema, making it difficult for third
> parties to extend.  Plus all of the additional logic associated with
> that Cell type has to be put somewhere as well.

Even if you do develop an XML schema to use, schemas are easily made
extensible. Providing a mapping from tag names to classes is a
straightforward way to extend the parser.
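
A rough sketch of such a mapping, under stated assumptions: the class
names, the CELL_CLASSES registry, register_cell() and load_notebook()
are all hypothetical and only illustrate the idea. Building the object
tree doubles as validation, since any file that cannot be built raises
an error.

```python
import xml.etree.ElementTree as ET

class Cell:
    def __init__(self, source):
        self.source = source

class CodeCell(Cell):
    pass

class TextCell(Cell):
    pass

class Notebook:
    def __init__(self, cells):
        self.cells = cells

# Hypothetical registry: tag name -> class that represents it.
CELL_CLASSES = {"code": CodeCell, "text": TextCell}

def register_cell(tag, cls):
    """Let a third party teach the parser about a new tag."""
    CELL_CLASSES[tag] = cls

def load_notebook(xml_source):
    """Build a Notebook from XML; raising means the file is not valid."""
    root = ET.fromstring(xml_source)  # raises ParseError on malformed XML
    if root.tag != "notebook":
        raise ValueError("expected <notebook>, got <%s>" % root.tag)
    cells = []
    for element in root:
        cls = CELL_CLASSES.get(element.tag)
        if cls is None:
            raise ValueError("no class registered for <%s>" % element.tag)
        cells.append(cls(element.text or ""))
    return Notebook(cells)

# An extension package could ship its own cell type and register it.
class FigureCell(Cell):
    pass

register_cell("figure", FigureCell)
```

With this in place, load_notebook() accepts <figure> cells once the
extension is registered and rejects tags nobody has registered, without
any central schema being edited.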

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
