[IPython-dev] migration issues, embedding latest IPython in another application
Wed Jul 11 14:58:35 CDT 2012
Hello IPython Developers,
We have an application which embeds an IPython shell for occasional use (the program's normal operation uses a standard library cmd.Cmd subclass, and ipython can be invoked as a do-command from that). We've been using IPython v0.10.2 for well over a year, and I'm currently evaluating upgrading to the newest release, IPython v0.13. I wanted to relate some issues I've encountered, and ask a question or two.
The first issue was the config system throwing a ConfigError (not an ImportError) when the top-level IPython package was imported (IPython/__init__.py). Another developer on my team had altered the __builtin__ module to include a symbol "pprint", and this collided with the name "pprint" used in the IPython configuration system (which refuses to set items if they are present in the __builtin__ module). I know it is not best practice to alter the __builtin__ module, but in the context of embedding a shell into another application, is it safe to assume that the __builtin__ module has not been altered? We solved the collision by moving the debugging symbols (the additions to the __builtin__ module were meant for convenience in debugging) into a more obscure namespace, but we find it frustrating that IPython imposes this requirement merely by being imported, let alone used. Is this behavior / requirement ("do not alter the __builtin__ module if embedding IPython") documented somewhere? If not, I would suggest that it be. Also, please consider catching this error in IPython/__init__.py and re-raising it as an ImportError if it happens at import time.
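To make the collision concrete, here is a minimal sketch of a config container that refuses keys which shadow a builtin name. GuardedConfig, the locally defined ConfigError, and try_set are hypothetical stand-ins for illustration, not IPython's actual code, and the sketch uses the Python 3 module name builtins rather than __builtin__.

```python
import builtins


class ConfigError(Exception):
    """Local stand-in for a config-system error (not IPython's class)."""


class GuardedConfig(dict):
    """Toy config dict that rejects keys matching a name in builtins."""

    def __setitem__(self, key, value):
        if hasattr(builtins, key):
            raise ConfigError("config key shadows a builtin: %s" % key)
        super().__setitem__(key, value)


def try_set(key):
    """Return 'set' if the key is accepted, 'ConfigError' otherwise."""
    cfg = GuardedConfig()
    try:
        cfg[key] = True
        return "set"
    except ConfigError:
        return "ConfigError"


# "pprint" is not a builtin by default, so the key is accepted.
before = try_set("pprint")

# Simulate an application adding a debugging helper to builtins.
builtins.pprint = lambda obj: None
try:
    after = try_set("pprint")
finally:
    del builtins.pprint  # restore the builtins module
```

The same key is accepted or rejected depending only on what the host application has put into builtins, which is why the failure surfaces at import time in an embedding application.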
Several bigger issues I encountered revolve around threading. Our application can run headless, and the cmd.Cmd shell (when active) does not run in the main thread. The current version of IPython installs at-exit handlers when a shell is created or starts up, including functionality to close/save the history session when the program exits. An exception gets thrown during the exit process though: "ProgrammingError: SQLite objects created in a thread can only be used in that same thread." The ipython shell created the sqlite db object in the cmd.Cmd's thread, and the at-exit handlers always (or at least in most cases / our case) run in the main thread.
I understand that there is a workaround to open a sqlite3 connection in a way that suppresses this behavior (passing check_same_thread=False to sqlite3.connect()), but IPython doesn't open the connection in that way by default, and doesn't seem to provide a hook to do this short of subclassing the TerminalInteractiveShell or the InteractiveShellEmbed class and writing one's own init_history() method, and further subclassing the HistoryManager and overriding its init_db() method.
So it seems that it is currently, more or less, a requirement that an embedded IPython shell run in the main thread (at least without a lot of effort to make it work otherwise). Is this in fact the case, or am I missing something? Is this documented somewhere?
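The cross-thread behavior can be reproduced with the standard library alone. The following is a minimal sketch, written for Python 3 and independent of IPython; open_in_worker and use_from_main are hypothetical helper names. It opens an in-memory connection in a worker thread and then uses it from the main thread, once with the default settings and once with check_same_thread=False.

```python
import sqlite3
import threading


def open_in_worker(**connect_kwargs):
    """Create a sqlite3 connection inside a short-lived worker thread."""
    holder = {}

    def worker():
        holder["conn"] = sqlite3.connect(":memory:", **connect_kwargs)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return holder["conn"]


def use_from_main(conn):
    """Use and close the connection from the calling (main) thread."""
    try:
        conn.execute("SELECT 1")
        conn.close()
        return "ok"
    except sqlite3.ProgrammingError:
        return "ProgrammingError"


# Default: the connection is bound to the thread that created it.
default_result = use_from_main(open_in_worker())

# With the check disabled, another thread may use the connection.
relaxed_result = use_from_main(open_in_worker(check_same_thread=False))
```

Disabling the check only removes the guard; the caller is then responsible for making sure two threads never use the connection at the same time.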
Also, I'm not certain whether the IPython.embed() function works as intended. It constructs the shell by calling the InteractiveShellEmbed class directly, not the InteractiveShellEmbed.instance() method. So an interpreter constructed in this way does not participate in the singleton system, and the first time one of many things happens in the interactive loop (such as an exception being thrown), *another* InteractiveShell object gets constructed that does not share the config that the InteractiveShellEmbed class was passed. I discovered this after configuring the HistoryManager.hist_file setting to be ":memory:", and then running a pdb.set_trace() when the IPython.core.history.HistorySavingThread was being constructed (which should not happen if the hist_file is ":memory:"). The singleton that does get constructed was trying to open a connection to the default history db file location, and so I assume it was constructed with no configuration parameters (at least not with the ones I passed in).
Is that the desired behavior? Or should IPython.frontend.terminal.embed.embed() be calling InteractiveShellEmbed.instance(**kwargs) instead of InteractiveShellEmbed(**kwargs)?
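The difference between the two construction paths can be shown with a simplified stand-in for the singleton pattern. This is a sketch only: the Shell class below is hypothetical and much simpler than IPython's real singleton machinery, but it shows why an object built by calling the class directly is invisible to later instance() calls.

```python
class Shell:
    """Toy singleton-capable class (not IPython's implementation)."""

    _instance = None

    def __init__(self, config=None):
        self.config = config or {}

    @classmethod
    def instance(cls, **kwargs):
        # Create the shared object on first use, then always return it.
        if cls._instance is None:
            cls._instance = cls(**kwargs)
        return cls._instance


# Direct construction: the object is never registered as the singleton.
embedded = Shell(config={"hist_file": ":memory:"})

# Later code asking for "the" shell gets a second, unconfigured object.
implicit = Shell.instance()
same_object = implicit is embedded
implicit_config = implicit.config

# Constructing through instance() from the start avoids the split.
Shell._instance = None  # reset the toy singleton for the second scenario
registered = Shell.instance(config={"hist_file": ":memory:"})
later = Shell.instance()
shared_config = later.config
```

In the first scenario the implicitly created object has an empty config, which matches the symptom of a second shell opening the default history file.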
I was debugging in that way because we want to have tighter control over what threads are running in our application, and I didn't want IPython to start one simply for the history saving feature (which we really do not require in our application).
Is there any non-monkey patching way to turn off the history database subsystem entirely? I'd be happy to have it operate in ":memory:" if it didn't cause the cross-thread database close error on shutdown, but currently it does. Also, our user home directories are on NFS, and we've definitely encountered the sqlite NFS locking bug mentioned in the SQLite FAQ (http://www.sqlite.org/faq.html). It is infrequent, but it can (and does) happen at scale. Because we've been bitten, I'm wary of putting a sqlite database on an NFS filesystem when there are multiple processes reading from and writing to that database.
I've settled on the following to more or less accomplish this:
IPython.core.history.sqlite3 = None
IPython.core.history.HistoryAccessor.db = None # necessary because of Instance descriptor
IPython.core.history.warn = do_not_warn
before creating an embedded IPython shell. But this is of course version / implementation specific, and therefore brittle (and kind of ugly). I'm hoping there's a better way.
And, because the embed() function works the way that it does, I'm currently doing the following (more or less) to embed the shell in our application:
ipshell = InteractiveShellEmbed.instance(config=cfg)
Please let me know if I'm approaching any of the problems I ran into in the wrong way, and missing some already existing solution / IPython usage idiom. Also, I hope that describing the problems I encountered while migrating might be useful to you all.
We're looking to make the switch to the newer IPython in the hopes we can leverage some of the distributed kernel/engine features. Our application actually runs in a distributed fashion (popping up worker applications via LSF on other run hosts), but we have only limited visibility into what they're doing while they run; far less than in the main application where we can open an IPython embedded shell and poke around. I'm looking forward to further exploring those, and all the other new feature enhancements since v0.10.2!