<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div></div><div><br></div></div><blockquote class="webkit-indent-blockquote" style="margin: 0 0 0 40px; border: none; padding: 0px;"><div><div><div>USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND</div></div></div><div><div><div>petigura 56013 100.0 0.3 2634324 53768 s003 R+ 10:26PM 0:35.62 python val2134.py</div></div></div><div><div><div>petigura 55962 99.0 0.3 2652864 55496 s003 R+ 10:26PM 1:13.14 python val2140.py</div></div></div><div><div><div>petigura 56025 99.0 0.3 2635648 53692 s003 R+ 10:26PM 0:28.09 python val2139.py</div></div></div><div><div><div>petigura 55812 98.5 0.3 2653816 62736 s003 R+ 10:24PM 2:36.85 python val2135.py</div></div></div><div><div><div>petigura 38665 22.6 0.5 2699096 99376 s002 R+ 12:17PM 82:11.48 python ipython --pylab</div></div></div><div><div><div>petigura 44579 0.3 0.2 2559724 33472 s003 S+ 3:30PM 2:15.77 python ipcluster start --n=8</div></div></div><div><div><div>petigura 44584 0.1 0.3 2643632 61900 s003 S+ 3:30PM 1:07.71 python ipcontrollerapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 53491 0.0 0.0 2666688 432 s003 S+ 9:17PM 0:00.00 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44596 0.0 0.3 2666688 55640 s003 S+ 3:30PM 0:06.63 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44595 0.0 0.3 2665664 55680 s003 S+ 3:30PM 0:06.88 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44594 0.0 0.3 2666688 55636 s003 S+ 3:30PM 0:07.32 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file 
--log-level=20</div></div></div><div><div><div>petigura 44593 0.0 0.3 2666688 55676 s003 S+ 3:30PM 0:07.19 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44592 0.0 0.3 2664640 55668 s003 S+ 3:30PM 0:07.60 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44591 0.0 0.3 2665664 55776 s003 S+ 3:30PM 0:07.96 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44590 0.0 0.3 2665664 55680 s003 S+ 3:30PM 0:07.72 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44589 0.0 0.3 2664640 55676 s003 S+ 3:30PM 0:08.31 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44588 0.0 0.2 2635272 39724 s003 S+ 3:30PM 0:25.99 python ipcontrollerapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44587 0.0 0.0 2623100 2844 s003 S+ 3:30PM 0:00.01 python ipcontrollerapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44586 0.0 0.0 2623100 2708 s003 S+ 3:30PM 0:00.01 python ipcontrollerapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 44585 0.0 0.0 2614908 2752 s003 S+ 3:30PM 0:00.01 python ipcontrollerapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 56024 0.0 0.0 2435544 808 s003 S+ 10:26PM 0:00.01 /bin/sh -c python val2139.py > val2139.log</div></div></div><div><div><div>petigura 56012 0.0 0.0 
2435544 808 s003 S+ 10:26PM 0:00.01 /bin/sh -c python val2134.py > val2134.log</div></div></div><div><div><div>petigura 55961 0.0 0.0 2435544 808 s003 S+ 10:26PM 0:00.01 /bin/sh -c python val2140.py > val2140.log</div></div></div><div><div><div>petigura 55811 0.0 0.0 2435544 808 s003 S+ 10:24PM 0:00.01 /bin/sh -c python val2135.py > val2135.log</div></div></div><div><div><div>petigura 53728 0.0 0.0 2666688 428 s003 S+ 9:31PM 0:00.00 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 53673 0.0 0.0 2665664 420 s003 S+ 9:27PM 0:00.00 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div><div><div><div>petigura 53670 0.0 0.0 2665664 432 s003 S+ 9:27PM 0:00.00 python ipengineapp.py --profile-dir /Users/petigura/.ipython/profile_default --log-to-file --log-level=20</div></div></div></blockquote><div><div><div><br></div><div>Here are some observations:</div><div><br></div></div></div><blockquote class="webkit-indent-blockquote" style="margin: 0 0 0 40px; border: none; padding: 0px;"><div><div>1. 8 instances of ipengineapp.py were started when I started my jobs at 3:30pm. </div></div><div><div>2. Around 9:30pm, 4 of the cores stopped working and 4 new instances of ipengineapp.py were started.</div></div><div><div>3. Now only 4 cores are working. </div></div></blockquote><div><div><br></div><div>What exactly does the heartbeat do? Why would an engine work for many hours before dropping out?</div><div><br></div><div>Thanks,</div><div><br></div><div>Erik</div><div><br></div><div><br></div></div><div><br></div></body></html>