[IPython-User] Question about schedulers
Darren Govoni
darren@ontrenet....
Wed Jun 6 20:38:05 CDT 2012
Gotcha. Makes sense.
Incidentally, I discovered that I can execute the ipengine code directly
in my python IDE and set break points in my user code modules and when I
execute functions from remote clients/views, it will hit the break
points and let me debug my code visually (in the running engine). Pretty
sweet. Thought I'd share.
On Wed, 2012-06-06 at 17:06 -0700, MinRK wrote:
>
>
> On Wed, Jun 6, 2012 at 4:52 PM, Darren Govoni <darren@ontrenet.com>
> wrote:
> Jon,
> Thanks for those details. Very informative.
>
> So it says multiple tasks can be assigned to an engine at a time, but
> how many execute at the same time? Just one right? Or is there a
> setting for that too?
>
>
> Correct, the engines themselves are not multithreaded, so it only runs
> one at a time. This is not configurable. The normal mode is starting
> one engine per core on each machine.
>
>
> Assigning multiple tasks to the engines helps hide the network latency
> behind computation, because the next task will be waiting in-memory on
> the Engine when it finishes the previous one, rather than having to
> fetch it from the scheduler.
>
>
> -MinRK
>
>
> thanks!
> Darren
>
> On Wed, 2012-06-06 at 21:38 +0000, Jon Olav Vik wrote:
> > Darren Govoni <darren <at> ontrenet.com> writes:
> >
> > > Assuming all engines are equal, will the first 10 objects be
> > > distributed to 1 engine each and the second 10 objects will wait
> > > for an engine to be free then go there? Or will all 20 messages be
> > > spread to the engines at the same time?
> >
> > I think two relevant options are:
> >
> >
> > The `chunksize` argument to IPython.parallel.ParallelFunction
> > determines how many list items are passed in each "task".
> >
> > from IPython.parallel import Client
> > c = Client()
> > lv = c.load_balanced_view()
> >
> > @lv.parallel(block=True)
> > def chunk1(x):
> >     return str(x)
> >
> > @lv.parallel(chunksize=2, block=True)
> > def chunk2(x):
> >     return str(x)
> >
> > L = range(5)
> > print chunk1(L)
> > print chunk2(L)
> > ## -- End pasted text --
> > ['[0]', '[1]', '[2]', '[3]', '[4]']
> > ['[0, 1]', '[2, 3]', '[4]']
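
To picture what `chunksize` does without a running cluster, here is a minimal pure-Python sketch. It splits a sequence into consecutive chunks and calls the function once per chunk. The helper name `apply_in_chunks` is made up for illustration and is not part of IPython.parallel; it only mimics the partitioning visible in the output above and does nothing in parallel.

```python
def apply_in_chunks(func, seq, chunksize=1):
    """Call func once per consecutive chunk of seq (hypothetical helper)."""
    items = list(seq)
    # Slice the list into pieces of at most `chunksize` items each.
    return [func(items[i:i + chunksize])
            for i in range(0, len(items), chunksize)]

result1 = apply_in_chunks(str, range(5))               # one item per "task"
result2 = apply_in_chunks(str, range(5), chunksize=2)  # two items per "task"
print(result1)
print(result2)
```

With five items and chunksize=2 the last chunk is simply shorter, which matches the trailing '[4]' in the pasted output.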
> >
> >
> > The `hwm` (high water mark) configurable determines the maximum
> > number of tasks that can be outstanding on an engine. On my system,
> > it is set in the file ipcontroller_config.py, inside the directory
> > profile_default inside the directory returned by
> > IPython.utils.path.get_ipython_dir().
> >
> > Quoting
> >
> > http://ipython.org/ipython-doc/dev/parallel/parallel_task.html#greedy-assignment
> >
> > """
> > Tasks are assigned greedily as they are submitted. If their
> > dependencies are met, they will be assigned to an engine right away,
> > and multiple tasks can be assigned to an engine at a given time. This
> > limit is set with the TaskScheduler.hwm (high water mark)
> > configurable:
> > # the most common choices are:
> > c.TaskScheduler.hwm = 0 # (minimal latency, default in IPython ≤ 0.12)
> > # or
> > c.TaskScheduler.hwm = 1 # (most-informed balancing, default in > 0.12)
> >
> > In IPython ≤ 0.12, the default is 0, or no-limit. That is, there is no
> > limit to the number of tasks that can be outstanding on a given
> > engine. This greatly benefits the latency of execution, because
> > network traffic can be hidden behind computation. However, this means
> > that workload is assigned without knowledge of how long each task
> > might take, and can result in poor load-balancing, particularly for
> > submitting a collection of heterogeneous tasks all at once. You can
> > limit this effect by setting hwm to a positive integer, 1 being
> > maximum load-balancing (a task will never be waiting if there is an
> > idle engine), and any larger number being a compromise between
> > load-balance and latency-hiding.
> >
> > In practice, some users have been confused by having this
> > optimization on by default, and the default value has been changed
> > to 1. This can be slower, but has more obvious behavior and won’t
> > result in assigning too many tasks to some engines in heterogeneous
> > cases.
> > """
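
The load-balancing side of that trade-off can be seen in a toy simulation. This is a rough sketch under stated assumptions, not the real scheduler: the function `simulate` is invented for illustration, all tasks are submitted at time 0, a task goes to the engine with the fewest outstanding tasks, each engine runs its tasks one at a time in order, and network latency is ignored (so the latency-hiding benefit of hwm=0 does not show up here).

```python
import heapq

def simulate(durations, n_engines, hwm):
    """Return the total run time of a toy scheduler (hypothetical model).

    hwm == 0 means no limit on outstanding tasks per engine.
    """
    limit = hwm if hwm > 0 else float("inf")
    pending = list(durations)      # tasks still held by the scheduler
    outstanding = [0] * n_engines  # tasks assigned but not yet finished
    free_at = [0.0] * n_engines    # time each engine finishes its queue
    events = []                    # min-heap of (finish_time, engine)
    now = 0.0

    def assign():
        # Hand out tasks until none are left or every engine is at its limit.
        while pending:
            engine = min(range(n_engines), key=lambda e: outstanding[e])
            if outstanding[engine] >= limit:
                return
            start = max(now, free_at[engine])
            free_at[engine] = start + pending.pop(0)
            outstanding[engine] += 1
            heapq.heappush(events, (free_at[engine], engine))

    assign()
    while events:
        # Advance to the next task completion and refill that engine.
        now, engine = heapq.heappop(events)
        outstanding[engine] -= 1
        assign()
    return now

tasks = [10, 1, 1, 1, 1, 1]  # one long task followed by five short ones
unlimited = simulate(tasks, n_engines=2, hwm=0)
balanced = simulate(tasks, n_engines=2, hwm=1)
print(unlimited, balanced)
```

In this model hwm=0 queues two short tasks behind the long one, while hwm=1 keeps them at the scheduler until an engine is idle, so the heterogeneous batch finishes sooner with hwm=1.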
> >
> > _______________________________________________
> > IPython-User mailing list
> > IPython-User@scipy.org
> > http://mail.scipy.org/mailman/listinfo/ipython-user