Hi Min, <div><br></div><div>Thanks for taking a look - I will have to try something else then.  </div><div><br></div><div>Apologies - I didn&#39;t mean to respond only to you. I am replying to the list so that this is archived for posterity (and Google). </div>

<div><br></div><div>Thanks again, </div><div><br></div><div>Ariel <br><br><div class="gmail_quote">On Fri, Jan 27, 2012 at 3:44 PM, MinRK <span dir="ltr">&lt;<a href="mailto:benjaminrk@gmail.com">benjaminrk@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Thanks for the report.<br>
<br>
It looks like your controller is being killed (hence exit status=-9),<br>
and ipcluster knows that when you don&#39;t have a controller anymore,<br>
your engines aren&#39;t useful so they get cleaned up.  I don&#39;t know what<br>
would be killing your controller, but if, as you say, memory is<br>
skyrocketing, something could be killing it to prevent gobbling up<br>
resources.  I don&#39;t know how you would find that out.  If you are<br>
using current master, you might use `--nodb` to disable the Hub&#39;s<br>
logging of tasks, depending on your usage.<br>
<span class="HOEnZb"><font color="#888888"><br>
-MinRK<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
On Fri, Jan 27, 2012 at 15:05, Ariel Rokem &lt;<a href="mailto:arokem@gmail.com">arokem@gmail.com</a>&gt; wrote:<br>
&gt; On Fri, Jan 27, 2012 at 12:34 PM, MinRK &lt;<a href="mailto:benjaminrk@gmail.com">benjaminrk@gmail.com</a>&gt; wrote:<br>
&gt;&gt;<br>
&gt;&gt;<br>
&gt;&gt; Can you post the entire message when the cluster is stopping? Or<br>
&gt;&gt; better yet the entire output of ipcluster adding `--debug`?<br>
&gt;&gt;<br>
&gt; celadon:~  $ipcluster start --debug<br>
&gt; [IPClusterStart] Config changed:<br>
&gt; [IPClusterStart] {&#39;Application&#39;: {&#39;log_level&#39;: 10}}<br>
&gt; [IPClusterStart] Using existing profile dir:<br>
&gt; u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; [IPClusterStart] Searching path [u&#39;/white/u6/arokem&#39;,<br>
&gt; u&#39;/home/arokem/.config/ipython/profile_default&#39;] for config files<br>
&gt; [IPClusterStart] Attempting to load config file: ipython_config.py<br>
&gt; [IPClusterStart] Loaded config file:<br>
&gt; /home/arokem/.config/ipython/profile_default/ipython_config.py<br>
&gt; [IPClusterStart] Attempting to load config file: ipcluster_config.py<br>
&gt; [IPClusterStart] Loaded config file:<br>
&gt; /home/arokem/.config/ipython/profile_default/ipcluster_config.py<br>
&gt; 2012-01-27 12:54:58.974 [IPClusterStart] Starting ipcluster with<br>
&gt; [daemon=False]<br>
&gt; 2012-01-27 12:54:58.975 [IPClusterStart] Creating pid file:<br>
&gt; /home/arokem/.config/ipython/profile_default/pid/ipcluster.pid<br>
&gt; 2012-01-27 12:54:58.976 [IPClusterStart] Starting Controller with<br>
&gt; LocalControllerLauncher<br>
&gt; 2012-01-27 12:54:58.976 [IPClusterStart] Starting LocalControllerLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipcontrollerapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:54:58.981 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24446<br>
&gt; 2012-01-27 12:54:59.837 [IPClusterStart] [IPControllerApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:54:59.975 [IPClusterStart] Starting 8 Engines with<br>
&gt; LocalEngineSetLauncher<br>
&gt; 2012-01-27 12:54:59.976 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:54:59.981 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24453<br>
&gt; 2012-01-27 12:55:00.082 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.087 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24454<br>
&gt; 2012-01-27 12:55:00.189 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.193 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24467<br>
&gt; 2012-01-27 12:55:00.294 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.298 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24468<br>
&gt; 2012-01-27 12:55:00.400 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.403 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24469<br>
&gt; 2012-01-27 12:55:00.505 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.509 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24470<br>
&gt; 2012-01-27 12:55:00.610 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.615 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24471<br>
&gt; 2012-01-27 12:55:00.716 [IPClusterStart] Starting LocalEngineLauncher:<br>
&gt; [&#39;/usr/bin/python&#39;,<br>
&gt; u&#39;/home/arokem/usr/local/lib/python2.7/site-packages/IPython/parallel/apps/ipengineapp.py&#39;,<br>
&gt; &#39;--profile-dir&#39;, u&#39;/home/arokem/.config/ipython/profile_default&#39;,<br>
&gt; &#39;--log-to-file&#39;, &#39;--log-level=20&#39;]<br>
&gt; 2012-01-27 12:55:00.721 [IPClusterStart] Process &#39;/usr/bin/python&#39; started:<br>
&gt; 24472<br>
&gt; 2012-01-27 12:55:00.721 [IPClusterStart] Process &#39;engine set&#39; started:<br>
&gt; [None, None, None, None, None, None, None, None]<br>
&gt; 2012-01-27 12:55:00.722 [IPClusterStart] 2012-01-27 12:55:00.122 [scheduler]<br>
&gt; Scheduler started [leastload]<br>
&gt; 2012-01-27 12:55:00.961 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.196 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.357 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.401 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.511 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.571 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.666 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:01.730 [IPClusterStart] [IPEngineApp] Using existing<br>
&gt; profile dir: u&#39;/home/arokem/.config/ipython/profile_default&#39;<br>
&gt; 2012-01-27 12:55:30.722 [IPClusterStart] Engines appear to have started<br>
&gt; successfully<br>
&gt; 2012-01-27 13:31:19.203 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24446, &#39;exit_code&#39;: -9}<br>
&gt; 2012-01-27 13:31:20.268 [IPClusterStart] IPython cluster: stopping<br>
&gt; 2012-01-27 13:31:20.282 [IPClusterStart] Stopping Engines...<br>
&gt; 2012-01-27 13:31:20.556 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24453, &#39;exit_code&#39;: -9}<br>
&gt; 2012-01-27 13:31:20.610 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24454, &#39;exit_code&#39;: -9}<br>
&gt; 2012-01-27 13:31:20.625 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24467, &#39;exit_code&#39;: -9}<br>
&gt; 2012-01-27 13:31:20.632 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24468, &#39;exit_code&#39;: -9}<br>
&gt; 2012-01-27 13:31:23.110 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24472, &#39;exit_code&#39;: -2}<br>
&gt; 2012-01-27 13:31:23.169 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24469, &#39;exit_code&#39;: -2}<br>
&gt; 2012-01-27 13:31:23.222 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24470, &#39;exit_code&#39;: -2}<br>
&gt; 2012-01-27 13:31:23.222 [IPClusterStart] Process &#39;/usr/bin/python&#39; stopped:<br>
&gt; {&#39;pid&#39;: 24471, &#39;exit_code&#39;: -2}<br>
&gt; 2012-01-27 13:31:23.222 [IPClusterStart] Process &#39;engine set&#39; stopped: {0:<br>
&gt; {&#39;pid&#39;: 24453, &#39;exit_code&#39;: -9}, 1: {&#39;pid&#39;: 24454, &#39;exit_code&#39;: -9}, 2:<br>
&gt; {&#39;pid&#39;: 24467, &#39;exit_code&#39;: -9}, 3: {&#39;pid&#39;: 24468, &#39;exit_code&#39;: -9}, 4:<br>
&gt; {&#39;pid&#39;: 24469, &#39;exit_code&#39;: -2}, 5: {&#39;pid&#39;: 24470, &#39;exit_code&#39;: -2}, 6:<br>
&gt; {&#39;pid&#39;: 24471, &#39;exit_code&#39;: -2}, 7: {&#39;pid&#39;: 24472, &#39;exit_code&#39;: -2}}<br>
&gt; 2012-01-27 13:31:23.702 [IPClusterStart] Removing pid file:<br>
&gt; /home/arokem/.config/ipython/profile_default/pid/ipcluster.pid<br>
&gt; celadon:~  $<br>
&gt;<br>
</div></div></blockquote></div><br></div>
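
For anyone finding this thread later: the negative exit_code values in the log can be read as signal numbers, assuming ipcluster reports them the way Python's subprocess module does (a return code of -N means the process was terminated by signal N). The sketch below is not part of IPython; the names describe_exit and summarize are made up for illustration, and it uses signal.Signals, which needs Python 3.5 or later on a POSIX system rather than the Python 2.7 shown in the log.

```python
import re
import signal

# Matches the "{'pid': N, 'exit_code': M}" fragments seen in the ipcluster log.
STOPPED = re.compile(r"'pid': (\d+), 'exit_code': (-?\d+)")


def describe_exit(code):
    """Describe a subprocess-style exit code (negative means killed by a signal)."""
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-code, name)
    return "exited with status %d" % code


def summarize(log_text):
    """Return (pid, description) pairs for every stopped process in log_text."""
    return [(int(pid), describe_exit(int(code)))
            for pid, code in STOPPED.findall(log_text)]


# Two entries copied from the log in this thread.
sample = """
{'pid': 24446, 'exit_code': -9}
{'pid': 24472, 'exit_code': -2}
"""

for pid, description in summarize(sample):
    print(pid, description)
```

Read this way, the controller (pid 24446) and four of the engines received SIGKILL, while the remaining engines received SIGINT, which is consistent with the controller being killed first and ipcluster then cleaning up the engines.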