[IPython-user] Trouble importing my own modules?

Brian Granger ellisonbg.net@gmail....
Tue Jun 12 12:13:39 CDT 2007


> I'm having trouble importing my own modules into the ipython1 engines.
>
> First ipython wasn't finding them, in spite of the fact that they were
> in the current working directory of the engines.  That's strange, but
> not a problem--I set PYTHONPATH to include the modules in question and
> the modules were found.
>
> The other problem is that if I just do:
>   import ipython1.kernel.api as par
>   rc = par.RemoteController(('localhost', 10105))
>   rc.executeAll('import analytic')
>
> Then I get the traceback attached below.  However, if I do:
>
>   [rc.execute(i, 'import analytic') for i in rc.getIDs()]

That is very odd.  We have previously seen people having trouble
importing modules on the engines.  But I haven't seen this particular
thing.  I would update to svn and retry - the tracebacks are much
better in trunk and that may help us debug the problem.

Here is another trick that someone found was useful in importing things:

import os  # on the client, so os.getcwd() is the client's directory
rc.executeAll('import site')
rc.executeAll('site.addsitedir(%r)' % os.getcwd())
rc.executeAll('import themodule; reload(themodule)')
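
The trick above can be tried out locally before sending it to the
engines.  The sketch below is only an illustration under assumptions,
not ipython1 code: it is written for current Python 3 (so
importlib.reload stands in for the reload builtin), the module name
themodule_demo and the helper build_addsitedir_command are
hypothetical, and a plain dict plays the role of one engine's
namespace.  It builds the command with %r so the path arrives as a
correctly quoted string literal.

```python
import os
import tempfile


def build_addsitedir_command(path):
    """Return source text that adds `path` to sys.path when executed."""
    # %r inserts repr(path), i.e. a correctly quoted string literal.
    return 'site.addsitedir(%r)' % path


# Create a throwaway directory holding a tiny module to import.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'themodule_demo.py'), 'w') as f:
    f.write('ANSWER = 42\n')

# A plain dict stands in for the namespace of one engine (assumption).
engine_namespace = {}
for command in (
    'import site',
    build_addsitedir_command(module_dir),
    'import importlib, themodule_demo; importlib.reload(themodule_demo)',
):
    exec(command, engine_namespace)

print(engine_namespace['themodule_demo'].ANSWER)
```

Note that the path is computed where the command string is built, so
it only helps if the engines see the same directory layout as the
client.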

This is a tough one to debug because I haven't been able to reproduce
it on any of my systems.

> Then it seems to work.  So, I'm reasonably happy since I have an
> easy workaround, but it is strange.  This is using a CVS checkout from
> May 8.
>
> I also have several comments:
> 1) Thank you for putting this together.  My opinion is that high
> end/parallel computing sucks these days because the whole mindset
> differs from that of desktop computing.  On desktops, you have
> interactive GUI programs and flexible languages (like Python).  On
> high-end computers, you have non-interactive batch job systems,
> laborious/difficult visualization, and Fortran.  Things like ipython1
> are a breath of fresh air.

Thanks so much - it was this situation exactly that led us to begin
ipython1 in the first place.

> 2) Is there a way to get better tracebacks?  When my code generates
> exceptions, the exception is thrown in the bowels of ipython/twisted,
> rather than anything indicating what was my actual mistake.  I realize
> that this may be an impossible task since I pass code to be executed
> as a string and after that Python has no good way of figuring out
> where in the source file it came from.

Two things.  The current svn has much better tracebacks already - as
Fernando mentioned.  The second is that I am working on further
enhancements to the tracebacks in a different branch.  I am hoping to
merge back into trunk as soon as later this week.  I will post to the
list when that is done.
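
On the question of where string-based code "came from": one general
Python technique is to compile the string under a synthetic filename
and register its text with linecache, so that tracebacks can show the
offending source line.  The sketch below illustrates that technique
only; it is not a description of what ipython1 does, and the names
run_source and '<engine-input-...>' are made up for the example.

```python
import linecache
import traceback


def run_source(source, label):
    """Execute `source`; return a formatted traceback, or None on success."""
    filename = '<engine-input-%s>' % label  # hypothetical naming scheme
    # Register the text so the traceback module can look up source lines.
    # An mtime of None tells linecache.checkcache to keep the entry.
    linecache.cache[filename] = (
        len(source), None, source.splitlines(True), filename)
    code = compile(source, filename, 'exec')
    try:
        exec(code, {})
    except Exception:
        return traceback.format_exc()
    return None


user_code = (
    "def ratio(a, b):\n"
    "    return a / b\n"
    "\n"
    "ratio(1, 0)\n"
)
report = run_source(user_code, 'demo')
print(report)
```

The printed traceback names the synthetic file and quotes the failing
line of the user's code, which is usually enough to locate the actual
mistake.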

> 3) I understand from mailing list posts that the eventual goal is to
> have the engines running Ipython rather than plain python instances.
> That seems fine as a default, but I'd like to put in a vote for having
> it continue to be possible to run plain python instances on the
> engines.  The reason is a little esoteric...
> There's a program I sometimes need called GDL.  It's an interactive
> data analysis program so it uses readline extensively to try to make
> things easy for the user.  It can be accessed via a Python module,
> which is how I actually use it.  The problem is that IPython also uses
> readline extensively, and the combination causes a seg fault.  I can,
> however, use the GDL python module with plain Python.

Because the engines don't import readline, you shouldn't have a
problem with this - even when the engines become "ipythonized" they
won't import readline.  So you should be OK.
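
If you want to confirm this for a given process, a small check along
these lines should do.  It is a generic Python sketch, not part of
ipython1, and the helper name is_module_loaded is made up here; the
same expression could be sent to the engines as a string.

```python
import sys


def is_module_loaded(name):
    """Return True if `name` has already been imported in this process."""
    return name in sys.modules


print('readline loaded:', is_module_loaded('readline'))
```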

> True, this is not a common situation and it's a bug, so in a perfect
> world it would be fixed and I could use the GDL python module in
> IPython settings.  But it actually works out perfectly that the
> engines run plain python, and since they aren't meant to be directly
> used interactively, it seems to me that there's some virtue in keeping
> them simple.  The GDL people, for example, test their code with plain
> python, not with IPython.
>
> 4) Multiple users.  Do you have any ideas or a preferred model for
> allowing multiple people to connect to the same controller (and
> therefore have access to the same pool of engines?)  That would be
> truly killer.

Currently, multiple users can connect to a single controller.  As
Fernando mentioned, this is something we have had in mind all along.
The only thing that needs to be worked on is the security model.
Currently there is no authentication scheme used.  But that is on our
list of things to do.

> To be concrete, what excites me about ipython1 is the idea of
> interactive data analysis.  For the most part exploration of data is
> limited to what you can do on a single processor because you need to
> hook your application up to a GUI.  GUIs, being event driven, lead to
> the potential of a lot of idle time since the user might have to think
> about what he sees for a while before requesting more action.
> Typically (in my experience) the only way to run something on a large
> computer is to submit a batch job, and then it's supposed to crank
> away like mad, not wait for user input via some connection to a GUI.
> Therefore it's either practically or politically impossible to harness
> a large number of processors for interactive data exploration.

The company I am working at uses ipython1 to do data analysis.  In
some cases, we handle 100s of GB in parallel and use the local ipython
session (where the client is) for visualization.  Works like a charm.

> Hence my interest in multiple users.  Let's say you have a cluster
> with 100 machines and 4 users.  One way to handle this would be to
> give each user a separate controller and 25 engines.  This is nice
> because it insulates users from one another, and a user can be sure
> that when he tells his controller to do something, there will be
> engines available.  However, the downside is that if the four users
> are doing interactive data exploration, then there will be a lot of
> idle time and user A would benefit from being able to use user B's
> engines when they're idle.

Definitely.

> Another way to do it would be to have one controller with access to
> all 100 engines.  This would be truly killer since it would be as
> though you had 100 processors inside your desktop machine.  You'd
> click "View Some Complicated Plot" and 100 processors would crank away
> at generating it, returning the processed data to your desktop where
> it's dutifully plotted in a GUI window.  The guy in the next office
> would be doing the same thing, and unless you both happened to hit
> "Plot" at the same moment, you won't notice each other's presence.

Have you looked at the TaskController stuff yet?  It provides load
balanced/fault tolerant task farming capabilities alongside the
RemoteController stuff.  It would be very nice in a multi-user
setting.
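
For readers unfamiliar with the term, the idea of load balanced, fault
tolerant task farming can be sketched with the standard library alone.
This is not the TaskController API: the function farm, its parameters,
and the use of threads as stand-ins for engines are all assumptions
made for illustration.  Idle workers pull the next task, and a task
that raises is resubmitted a limited number of times.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def farm(func, inputs, workers=4, retries=1):
    """Run func over hashable inputs on a worker pool; retry failures."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(func, x): (x, 0) for x in inputs}
        while pending:
            for future in as_completed(list(pending)):
                x, attempts = pending.pop(future)
                try:
                    results[x] = future.result()
                except Exception:
                    if attempts >= retries:
                        raise
                    # Resubmit the failed task; any idle worker picks it up.
                    pending[pool.submit(func, x)] = (x, attempts + 1)
    return results


failed_once = set()


def flaky_square(x):
    # Fail the first time x == 3 is seen, to exercise the retry path.
    if x == 3 and x not in failed_once:
        failed_once.add(x)
        raise RuntimeError('simulated engine failure')
    return x * x


print(farm(flaky_square, range(6)))
```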

> That would be incredible, and would, I think, drag high end computing
> into the modern era. There's a world of difference in how you think
> about things if you're doing it interactively in real-time as opposed
> to waiting minutes or hours for the result.

> Last, an observation which is sure to categorize me as a lunatic: In
> poking around the ipython1 code, I came across several places where
> source code is laboriously manipulated as strings.  That's a heroic
> effort, but it makes me sad, because the Lisp people realized the
> usefulness of representing source code in one of the language's basic
> data structures since its inception in the 60's.  And they realized
> the usefulness of manipulating source code with the language (via
> macros) since the early 70's.  And you can add type declarations at
> will if you want the compiler to be able to do a better job optimizing
> your code.  And the compilers compile to native machine code. Those
> guys didn't have bad ideas--they just had the misfortune of being 40
> years ahead of their time.

Welcome to the lunatics club then.  This is the one thing that drives
me crazy about the whole thing: code in strings.  I would love to get
away from having to put python code in strings, but at some level, it
is a problem with Python itself.  We are working on a new version of
the core of ipython1 that uses a more sophisticated approach that
works with the AST directly.  But, the code is still entered by
the user and sent to the engines as a string.  As of yet I haven't
figured out a way around this problem - other than rewriting python
itself to support something like macros.
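
To make the AST idea concrete, here is a minimal sketch using the ast
module from the current Python 3 standard library.  It is not the
ipython1 design, and the module name analytic_demo is hypothetical.
The code still starts life as a string, but after one parse step it
can be inspected and rewritten as a tree rather than as text.

```python
import ast

source = "import analytic_demo\nresult = 6 * 7\n"

# Parse the string once; from here on the code is a tree, not text.
tree = ast.parse(source, filename='<user-input>', mode='exec')

# Inspect the tree: collect the names of all imported modules.
imported = [alias.name
            for node in ast.walk(tree) if isinstance(node, ast.Import)
            for alias in node.names]

# Transform the tree: drop the import statements before executing.
tree.body = [node for node in tree.body if not isinstance(node, ast.Import)]

namespace = {}
exec(compile(tree, '<user-input>', 'exec'), namespace)
print(imported, namespace['result'])
```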

> Oh well, the world can't be perfect.

Yep.  Let us know how it goes.

Brian

> Thanks, Greg
>
> ---------------------------------------------------------------------------
> <type 'exceptions.OSError'>               Traceback (most recent call last)
>
> /home1/novak/Projects/Thesis/<ipython console> in <module>()
>
> /home1/novak/bin/local/lib/python2.5/site-packages/ipython1-0.9alpha1-py2.5.egg/ipython1/kernel/multienginexmlrpc.py
> in executeAll(self, lines, block)
>     487         See the docstring for `execute` for full details.
>     488         """
> --> 489         return self.execute('all', lines, block)
>     490
>     491     def push(self, targets, **namespace):
>
> /home1/novak/bin/local/lib/python2.5/site-packages/ipython1-0.9alpha1-py2.5.egg/ipython1/kernel/multienginexmlrpc.py
> in execute(self, targets, lines, block)
>     474         self._checkClientID()
>     475         localBlock = self._reallyBlock(block)
> --> 476         result = self._executeRemoteMethod(self._server.execute, self._clientID, localBlock, targets, lines)
>     477         if not localBlock:
>     478             result = PendingResult(self, result)
>
> /home1/novak/bin/local/lib/python2.5/site-packages/ipython1-0.9alpha1-py2.5.egg/ipython1/kernel/multienginexmlrpc.py
> in _executeRemoteMethod(self, f, *args)
>     380         try:
>     381             rawResult = f(*args)
> --> 382             result = self._unpackageResult(rawResult)
>     383         except error.InvalidClientID:
>     384             self._getClientID()
>
> /home1/novak/bin/local/lib/python2.5/site-packages/ipython1-0.9alpha1-py2.5.egg/ipython1/kernel/multienginexmlrpc.py
> in _unpackageResult(self, result)
>     389     def _unpackageResult(self, result):
>     390         result = pickle.loads(result.data)
> --> 391         return self._returnOrRaise(result)
>     392
>     393     def _returnOrRaise(self, result):
>
> /home1/novak/bin/local/lib/python2.5/site-packages/ipython1-0.9alpha1-py2.5.egg/ipython1/kernel/multienginexmlrpc.py
> in _returnOrRaise(self, result)
>     393     def _returnOrRaise(self, result):
>     394         if isinstance(result, failure.Failure):
> --> 395             result.raiseException()
>     396         else:
>     397             return result
>
> /home1/novak/bin/local/lib/python2.5/site-packages/twisted/python/failure.py
> in raiseException(self)
>     257         information if available.
>     258         """
> --> 259         raise self.type, self.value, self.tb
>     260
>     261
>
> <type 'exceptions.OSError'>: