[IPython-User] some questions about parallel computing in 0.11

MinRK benjaminrk@gmail....
Sat Aug 6 18:32:11 CDT 2011


On Sat, Aug 6, 2011 at 15:41, Maarten Derickx
<m.derickx.student@gmail.com> wrote:

> I was just trying to turn some ideas I had for parallelizing code into
> reality using the new IPython parallel interface. During this I ran
> into some questions.
>
> 1.  I want to stop already running calculations (and I don't know at
> the time that I start the computation if I want to abort it or after
> how long I want to abort it so I cannot use timeouts). But as far as I
> see IPython can only abort calculations which are queued, but not yet
> running. See
> http://pastebin.com/VZTU9bR7
> for an example where I use .abort on an asynchronous result on an
> already running job.
> I wonder if it's possible to stop already running jobs. I know it's
> possible to shut down engines even when they have a job, but then you
> have to be able to start new ones, which brings me to
>
>
Not presently.  shutdown is handled as a regular Control message, so it will
wait for the current execution to finish before it is noticed.  We do hope
to add interrupt, etc. support, but that will require running another
process adjacent to the engine that intercepts control messages.  Far from
impossible, but made more difficult by MPI restrictions that require the
work process be the first one started, so the Engine has to start what is
essentially its parent as its child.  It also has to be optional, to allow
the Engines to be run in draconian cluster environments that don't allow
extra processes.

For now, if you are running engines locally, you can send interrupts from
the client.

Just do this, first thing:

import os
import signal

# rc is your connected Client; this maps engine id -> engine process id
pids = rc[:].apply_async(os.getpid).get_dict()
# {0: 48388, 1: 48386, 2: 48387, 3: 48385}

Then when you want to interrupt a running calculation, just do:

os.kill(pids[engine_id], signal.SIGINT)
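
The mechanism can be tried without a cluster.  What follows is only a
sketch under stated assumptions: a plain child Python process stands in for
an engine, the names `CHILD_CODE` and `interrupt_demo` are made up for this
illustration, and it assumes a POSIX system where `os.kill` delivers SIGINT
to the target process.

```python
import os
import signal
import subprocess
import sys

# Stand-in for an engine (an assumption for this demo): a child process
# that is busy with a long "calculation" and reports if it is interrupted.
CHILD_CODE = """
import signal, time
# Make sure SIGINT raises KeyboardInterrupt even if the parent ignored it.
signal.signal(signal.SIGINT, signal.default_int_handler)
try:
    print('started', flush=True)
    time.sleep(60)
except KeyboardInterrupt:
    print('interrupted', flush=True)
"""


def interrupt_demo():
    """Start the child, interrupt it by pid, and return what it printed."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    # Wait until the child is inside its try block before signalling.
    first = proc.stdout.readline().strip()
    # Same call as in the message above, with the pid taken from Popen
    # instead of from the pids dict.
    os.kill(proc.pid, signal.SIGINT)
    rest, _ = proc.communicate(timeout=30)
    return first, rest.strip(), proc.returncode


if __name__ == "__main__":
    print(interrupt_demo())
```

With real engines the pid would come from the `pids` dict rather than from
`Popen`, and the interrupted task should surface on the client as a remote
KeyboardInterrupt.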



> 2. The second thing I was wondering about and could not find in the
> documentation, is if there is an easy way to do something equivalent
> to "ipcluster start --n=4" and "ipengine" in python?
>

Well, there's always `os.system("ipengine")` :)

But in all seriousness, you can use the Launchers that ipcluster itself
uses:

from IPython.parallel.apps import launcher

el = launcher.LocalEngineLauncher()
el.start("/Users/you/.ipython/profile_default")

(start takes a profile_dir as an option, so pass whatever makes sense.  The
default behavior would be `IPYTHON_DIR/profile_default`.)

There are also EngineSetLaunchers for launching more than one engine, as
well as launchers for MPIExec, SGE, PBS, etc.

`ipcluster start` just starts one ControllerLauncher and one
EngineSetLauncher. `ipcluster engines` is the same, but skips the
Controller.

These are very simple wrappers for subprocess.Popen.
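
To give a feel for what "a simple wrapper for subprocess.Popen" means here,
below is a rough sketch.  It is not IPython's launcher code:
`SimpleSetLauncher` and its methods are hypothetical names, and the command
it runs is a placeholder sleeping process rather than a real engine.

```python
import subprocess
import sys


class SimpleSetLauncher:
    """Hypothetical stand-in for a launcher: starts and stops N processes."""

    def __init__(self, cmd):
        self.cmd = list(cmd)   # command line to run for each process
        self.procs = []        # Popen handles of the started processes

    def start(self, n):
        """Start n copies of the command and return their pids."""
        for _ in range(n):
            self.procs.append(subprocess.Popen(self.cmd))
        return [p.pid for p in self.procs]

    def stop(self):
        """Terminate any process still running, then reap all of them."""
        for p in self.procs:
            if p.poll() is None:
                p.terminate()
        for p in self.procs:
            p.wait()


if __name__ == "__main__":
    # Placeholder command; on a machine with IPython installed one could
    # try ["ipengine"] here instead (untested assumption).
    launcher = SimpleSetLauncher(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    print(launcher.start(4))
    launcher.stop()
```

Keeping the Popen handles around is what makes it possible to stop or
restart the processes later from the same Python session.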


> 3. I found the following session quite unexpected, see
> http://pastebin.com/EkxAArSw
> The cluster was started with "ipcluster start --n=4" then I shut down
> an engine while it was doing a computation and after that I cannot
> start new computations.
> Note that c.shutdown(0) was done more than 10 seconds after the
> previous command and that c.shutdown(1) was done less than 10 seconds
> after the previous command. The error in "In [34]" is ok, but the two
> errors in "In [35]" and "In [36]" should not happen I guess.
>

There's definitely a bug, but there is also an issue with your code.  A
DirectView's targets attribute is set at the time of construction.  So:

dv = rc[:]

gets a DirectView with all of the engines that are attached *at the time of
the call*.  It is not a lazily evaluated 'all targets', which follows
engines coming and going.  That means that executions from dv after one of
its targets is gone will still try to run in the dead location, and you need
to update the targets of dv, or create a new view.

If you do set dv.targets='all', then it will be lazily evaluated at each
execution, and kept up to date.  rc.direct_view('all') *should* do this, but
it doesn't*

* it does now, as of a minute ago.
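
The difference between fixed targets and 'all' can be shown with a toy
model.  This is not IPython code: `ToyClient` and `ToyView` are invented
for illustration and only mimic the behaviour described above, where an
explicit target list is frozen at construction and 'all' is resolved at
each execution.

```python
class ToyClient:
    """Invented stand-in for a client that tracks live engine ids."""

    def __init__(self, ids):
        self.ids = list(ids)

    def view(self, targets):
        return ToyView(self, targets)


class ToyView:
    """Invented stand-in for a view over some engines."""

    def __init__(self, client, targets):
        self.client = client
        # An explicit list is copied once; the string 'all' stays symbolic.
        self.targets = targets if targets == "all" else list(targets)

    def resolve(self):
        """Return the engine ids an execution would be sent to right now."""
        if self.targets == "all":
            return list(self.client.ids)
        return list(self.targets)


rc = ToyClient([0, 1, 2, 3])
fixed = rc.view(rc.ids)   # plays the role of dv = rc[:]
lazy = rc.view("all")     # plays the role of dv.targets = 'all'

rc.ids.remove(0)          # engine 0 goes away

print(fixed.resolve())    # [0, 1, 2, 3]  still includes the dead engine
print(lazy.resolve())     # [1, 2, 3]     follows the live engines
```

The fixed view is the case that produced the errors in your session: it
keeps targeting an engine that no longer exists until its targets are
updated or a new view is created.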

-MinRK

