[IPython-user] Trouble importing my own modules?

Greg Novak novak@ucolick....
Tue Jun 12 00:00:21 CDT 2007

I'm having trouble importing my own modules into the ipython1 engines.

First ipython wasn't finding them, in spite of the fact that they were
in the current working directory of the engines.  That's strange, but
not a problem--I set PYTHONPATH to include the modules in question and
the modules were found.

The other problem is that if I just do:
  import ipython1.kernel.api as par
  rc = par.RemoteController(('localhost', 10105))
  rc.executeAll('import analytic')

Then I get the traceback attached below.  However, if I do:

  [rc.execute(i, 'import analytic') for i in rc.getIDs()]

Then it seems to work.  So, I'm reasonably happy since I have an
easy workaround, but it is strange.  This is using a CVS checkout from
May 8.
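For concreteness, the working pattern boils down to one execute() call
per engine ID.  The stub below only mimics the shape of the real
RemoteController interface used above (it records calls instead of
talking to engines), so treat it as an illustration, not the actual
ipython1 API:

```python
# Stub standing in for ipython1's RemoteController; it only records
# (target, lines) pairs so the call pattern can be inspected.
class StubController:
    def __init__(self, engine_ids):
        self._ids = list(engine_ids)
        self.log = []              # (target, lines) pairs, in call order

    def getIDs(self):
        return self._ids

    def execute(self, target, lines):
        self.log.append((target, lines))

rc = StubController([0, 1, 2, 3])
# The workaround from the post: one execute() per engine ID, which is
# what a working executeAll('import analytic') would amount to.
[rc.execute(i, 'import analytic') for i in rc.getIDs()]
```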

I also have several comments:
1) Thank you for putting this together.  My opinion is that high
end/parallel computing sucks these days because the whole mindset
differs from that of desktop computing.  On desktops, you have
interactive GUI programs and flexible languages (like Python).  On
high-end computers, you have non-interactive batch job systems,
laborious/difficult visualization, and Fortran.  Things like ipython1
are a breath of fresh air.

2) Is there a way to get better tracebacks?  When my code raises an
exception, it surfaces in the bowels of ipython/twisted rather than
pointing at my actual mistake.  I realize this may be an impossible
task, since I pass the code to be executed as a string, and after that
Python has no good way of figuring out where in the source file it
came from.
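One possible mitigation, sketched here in generic modern Python rather
than anything ipython1 actually does: if each submitted string were
compiled under a synthetic filename, the traceback would at least name
the submission and line within it (the label "&lt;engine-cmd-42&gt;" below
is made up for illustration):

```python
import traceback

def run_tagged(source, label):
    """Compile submitted source under a synthetic filename so any
    traceback cites that label instead of an opaque <string>."""
    code = compile(source, label, 'exec')
    try:
        exec(code, {})
        return None
    except Exception:
        # The formatted traceback now contains e.g.
        #   File "<engine-cmd-42>", line 2
        return traceback.format_exc()

tb = run_tagged("x = 1\nraise ValueError('boom')", "<engine-cmd-42>")
```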

3) I understand from mailing list posts that the eventual goal is to
have the engines running IPython rather than plain Python instances.
That seems fine as a default, but I'd like to put in a vote for having
it continue to be possible to run plain python instances on the
engines.  The reason is a little esoteric...
There's a program I sometimes need called GDL.  It's an interactive
data analysis program so it uses readline extensively to try to make
things easy for the user.  It can be accessed via a Python module,
which is how I actually use it.  The problem is that IPython also uses
readline extensively, and the combination causes a seg fault.  I can,
however, use the GDL python module with plain Python.

True, this is not a common situation and it's a bug, so in a perfect
world it would be fixed and I could use the GDL python module in
IPython settings.  But it actually works out perfectly that the
engines run plain python, and since they aren't meant to be directly
used interactively, it seems to me that there's some virtue in keeping
them simple.  The GDL people, for example, test their code with plain
python, not with IPython.

4) Multiple users.  Do you have any ideas or a preferred model for
allowing multiple people to connect to the same controller (and
therefore have access to the same pool of engines)?  That would be
truly killer.

To be concrete, what excites me about ipython1 is the idea of
interactive data analysis.  For the most part exploration of data is
limited to what you can do on a single processor because you need to
hook your application up to a GUI.  GUIs, being event driven, lead to
the potential of a lot of idle time since the user might have to think
about what he sees for a while before requesting more action.
Typically (in my experience) the only way to run something on a large
computer is to submit a batch job, and then it's supposed to crank
away like mad, not wait for user input via some connection to a GUI.
Therefore it's either practically or politically impossible to harness
a large number of processors for interactive data exploration.

Hence my interest in multiple users.  Let's say you have a cluster
with 100 machines and 4 users.  One way to handle this would be to
give each user a separate controller and 25 engines.  This is nice
because it insulates users from one another, and a user can be sure
that when he tells his controller to do something, there will be
engines available.  However, the downside is that if the four users
are doing interactive data exploration, then there will be a lot of
idle time and user A would benefit from being able to use user B's
engines when they're idle.

Another way to do it would be to have one controller with access to
all 100 engines.  This would be truly killer since it would be as
though you had 100 processors inside your desktop machine.  You'd
click "View Some Complicated Plot" and 100 processors would crank away
at generating it, returning the processed data to your desktop where
it's dutifully plotted in a GUI window.  The guy in the next office
would be doing the same thing, and unless you both happened to hit
"Plot" at the same moment, you wouldn't notice each other's presence.

That would be incredible, and would, I think, drag high end computing
into the modern era. There's a world of difference in how you think
about things if you're doing it interactively in real-time as opposed
to waiting minutes or hours for the result.

Last, an observation which is sure to categorize me as a lunatic: In
poking around the ipython1 code, I came across several places where
source code is laboriously manipulated as strings.  That's a heroic
effort, but it makes me sad, because the Lisp people have represented
source code in one of the language's basic data structures since the
language's inception in the 60's, and have manipulated source code
with the language itself (via macros) since the early 70's.  And you
can add type declarations at
will if you want the compiler to be able to do a better job optimizing
your code.  And the compilers compile to native machine code. Those
guys didn't have bad ideas--they just had the misfortune of being 40
years ahead of their time.
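To be fair, Python does offer a middle ground between raw string
manipulation and Lisp-style code-as-data: the standard ast module lets
you parse source once and transform it as a tree.  (The sketch below
uses ast.unparse, which needs Python 3.9+, so it's anachronistic for
this era, but the idea stands.)

```python
import ast

# Parse the source into a tree rather than editing it as text.
tree = ast.parse("result = a + b")

# Rename every occurrence of the name 'a' to 'x' by walking the tree.
for node in ast.walk(tree):
    if isinstance(node, ast.Name) and node.id == 'a':
        node.id = 'x'

# Turn the transformed tree back into source text.
new_source = ast.unparse(tree)
```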

Oh well, the world can't be perfect.

Thanks, Greg

<type 'exceptions.OSError'>               Traceback (most recent call last)

/home1/novak/Projects/Thesis/<ipython console> in <module>()

in executeAll(self, lines, block)
    487         See the docstring for `execute` for full details.
    488         """
--> 489         return self.execute('all', lines, block)
    491     def push(self, targets, **namespace):

in execute(self, targets, lines, block)
    474         self._checkClientID()
    475         localBlock = self._reallyBlock(block)
--> 476         result = self._executeRemoteMethod(self._server.execute, self._clientID, localBlock, targets, lines)
    477         if not localBlock:
    478             result = PendingResult(self, result)

in _executeRemoteMethod(self, f, *args)
    380         try:
    381             rawResult = f(*args)
--> 382             result = self._unpackageResult(rawResult)
    383         except error.InvalidClientID:
    384             self._getClientID()

in _unpackageResult(self, result)
    389     def _unpackageResult(self, result):
    390         result = pickle.loads(result.data)
--> 391         return self._returnOrRaise(result)
    393     def _returnOrRaise(self, result):

in _returnOrRaise(self, result)
    393     def _returnOrRaise(self, result):
    394         if isinstance(result, failure.Failure):
--> 395             result.raiseException()
    396         else:
    397             return result

in raiseException(self)
    257         information if available.
    258         """
--> 259         raise self.type, self.value, self.tb

<type 'exceptions.OSError'>:
