[IPython-dev] SciPy Sprint summary

Brian Granger ellisonbg@gmail....
Sun Jul 18 23:32:21 CDT 2010


On Sun, Jul 18, 2010 at 11:18 AM, Justin Riley <justin.t.riley@gmail.com> wrote:
> Hi Matthieu,
>
> At least for the modifications I made, no, not yet. This is exactly what
> I'm asking about in the second paragraph of my response. The new SGE/PBS
> support will work with multiple hosts, assuming the ~/.ipython/security
> folder is NFS-shared across the cluster.

Without mpi being required, as I understand it.

> If that's not the case, then AFAIK we have two options:
>
> 1. scp the furl file from ~/.ipython/security to each host's
> ~/.ipython/security folder.
>
> 2. put the contents of the furl file directly inside the job script
> used to start the engines

This is not that bad of an idea.  Remember that the furl file the
engine uses only secures the connection between the engines and the
controller, and that connection is not especially vulnerable.  My only
question is who can see the script?  I don't know PBS/SGE well enough
to know where the script ends up and with what permissions.
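
For concreteness, option 2 might produce a launch script along these
lines (just a sketch; the furl value and the SGE options are
placeholders):

#!/bin/sh
#$ -V
#$ -cwd
# write the furl where ipengine looks for it; note that anyone who
# can read this script can read the furl -- that is the security issue
mkdir -p ~/.ipython/security
cat > ~/.ipython/security/ipcontroller-engine.furl <<'EOF'
pb://<secret>@<controller-host>:<port>/<swissnum>
EOF
ipengine > ipengine.log 2>&1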

> The first option relies on the user having password-less ssh configured
> properly to each node on the cluster. ipcluster would first need to scp
> the furl and then launch the engines using PBS/SGE.
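
Per host, that would roughly amount to something like the following
(hostnames are illustrative; assumes password-less ssh is in place):

for host in node001 node002; do
    ssh $host "mkdir -p ~/.ipython/security"
    scp ~/.ipython/security/ipcontroller-engine.furl \
        $host:.ipython/security/
done
# ...then submit the engines with qsub as usual
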
>
> The second option is the easiest approach, given that it only requires
> SGE to be installed; however, it's probably not the best idea to put the
> furl file in the job script itself, for security reasons. I'm curious to
> get opinions on this. This would require slight code modifications.

Do you know anything about what SGE/PBS does with the script?  I
honestly think this might not be a bad idea.  But, again, maybe for
0.10.1 this is not worth the effort because things will change so
incredibly much with 0.11.

Brian

> ~Justin
>
> On 07/18/2010 01:13 PM, Matthieu Brucher wrote:
>> Hi,
>>
>> Does IPython now support sending engines to nodes that do not share the
>> same $HOME as the main instance? This is what kept me from properly
>> testing IPython with LSF some months ago :|
>>
>> Matthieu
>>
>> 2010/7/18 Justin Riley <justin.t.riley@gmail.com>:
>>> Hi Satra/Brian,
>>>
>>> I modified your code to use the job array feature of SGE. I've also made
>>> it so that users don't need to specify --sge-script if they don't need a
>>> custom SGE launch script. My guess is that most users will start without
>>> --sge-script and only resort to it when the generated launch script no
>>> longer meets their needs. More details in the git log here:
>>>
>>> http://github.com/jtriley/ipython/tree/0.10.1-sge
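
For reference, the job-array approach makes the generated launch script
look roughly like this sketch (engine count and options illustrative):

#!/bin/sh
#$ -cwd
#$ -V
# one array task per engine; 8 engines requested here
#$ -t 1-8
ipengine > ipengine.$SGE_TASK_ID.log 2>&1

A single qsub call then accounts for all the engines as one job in the
queue.
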
>>>
>>> Also, I need to test this, but I believe this code will fail if the
>>> folder containing the furl file is not NFS-mounted on the SGE cluster.
>>> Another option besides requiring NFS is to scp the furl file to each
>>> host, as is done in the ssh mode of ipcluster; however, this would
>>> require password-less ssh to be configured properly (maybe not so bad).
>>> Another option is to dump the generated furl file into the job script
>>> itself. This has the advantage of only needing SGE installed but
>>> certainly doesn't seem like the safest practice. Any thoughts on how to
>>> approach this?
>>>
>>> Let me know what you think.
>>>
>>> ~Justin
>>>
>>> On 07/18/2010 12:05 AM, Brian Granger wrote:
>>>> Is the array jobs feature what you want?
>>>>
>>>> http://wikis.sun.com/display/gridengine62u6/Submitting+Jobs
>>>>
>>>> Brian
>>>>
>>>> On Sat, Jul 17, 2010 at 9:00 PM, Brian Granger <ellisonbg@gmail.com> wrote:
>>>>> On Sat, Jul 17, 2010 at 6:23 AM, Satrajit Ghosh <satra@mit.edu> wrote:
>>>>>> hi,
>>>>>>
>>>>>> i've pushed my changes to:
>>>>>>
>>>>>> http://github.com/satra/ipython/tree/0.10.1-sge
>>>>>>
>>>>>> notes:
>>>>>>
>>>>>> 1. it starts cleanly. i can connect and execute things. when i kill it
>>>>>> using ctrl-c, the messages appear to indicate that everything shut down
>>>>>> well. however, the sge ipengine jobs are still running.
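
Until the shutdown is fixed, the stray engine jobs can at least be
cleaned up by hand with the usual SGE tools, e.g.:

qstat -u $USER   # list the leftover ipengine jobs
qdel <job_id>    # or 'qdel -u $USER' to drop all of your jobs
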
>>>>>
>>>>> What version of Python and Twisted are you running?
>>>>>
>>>>>> 2. the pbs option appears to require mpi to be present. i don't think one
>>>>>> can launch multiple engines using pbs without mpi or without the workaround
>>>>>> i've applied to the sge engine. basically it submits an sge job for each
>>>>>> engine that i want to run. i would love to know if a single job can launch
>>>>>> multiple engines on an sge/pbs cluster without mpi.
>>>>>
>>>>> I think you are right that pbs needs to use mpirun/mpiexec to start
>>>>> multiple engines using a single PBS job.  I am not that familiar with
>>>>> SGE; can you start multiple processes without mpi and with just a
>>>>> single SGE job?  If so, let's try to get that working.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Brian
>>>>>
>>>>>> cheers,
>>>>>>
>>>>>> satra
>>>>>>
>>>>>> On Thu, Jul 15, 2010 at 8:55 PM, Satrajit Ghosh <satra@mit.edu> wrote:
>>>>>>>
>>>>>>> hi justin,
>>>>>>>
>>>>>>> i hope to test it out tonight. from what fernando and i discussed, this
>>>>>>> should be relatively straightforward. once i'm done i'll push it to my fork
>>>>>>> of ipython and announce it here for others to test.
>>>>>>>
>>>>>>> cheers,
>>>>>>>
>>>>>>> satra
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 15, 2010 at 4:33 PM, Justin Riley <justin.t.riley@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> This is great news. Right now StarCluster just takes advantage of
>>>>>>>> password-less ssh already being set up and runs:
>>>>>>>>
>>>>>>>> $ ipcluster ssh --clusterfile /path/to/cluster_file.py
>>>>>>>>
>>>>>>>> This works fine for now; however, having SGE support would allow
>>>>>>>> ipcluster's load to be accounted for by the queue.
>>>>>>>>
>>>>>>>> Is Satra on the list? I have experience with SGE and could help with the
>>>>>>>> code if needed. I can also help test this functionality.
>>>>>>>>
>>>>>>>> ~Justin
>>>>>>>>
>>>>>>>> On 07/15/2010 03:34 PM, Fernando Perez wrote:
>>>>>>>>> On Thu, Jul 15, 2010 at 10:34 AM, Brian Granger <ellisonbg@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> Thanks for the post.  You should also know that it looks like someone
>>>>>>>>>> is going to add native SGE support to ipcluster for 0.10.1.
>>>>>>>>>
>>>>>>>>> Yes, Satra and I went over this last night in detail (thanks to Brian
>>>>>>>>> for the pointers), and he said he might actually already have some
>>>>>>>>> code for it.  I suspect we'll get this in soon.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> f
>>>>>>>>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu
ellisonbg@gmail.com

