[IPython-dev] SciPy Sprint summary

Brian Granger ellisonbg@gmail....
Sun Jul 18 23:25:01 CDT 2010


Justin,

On Sun, Jul 18, 2010 at 12:43 AM, Justin Riley <justin.t.riley@gmail.com> wrote:
> Hi Satra/Brian,
>
> I modified your code to use the job array feature of SGE. I've also made it
> so that users don't need to specify --sge-script if they don't need a custom
> SGE launch script. My guess is that most users will choose not to specify
> --sge-script first and resort to using --sge-script when the generated
> launch script no longer meets their needs. More details in the git log here:

Very nice.  I will do a code review in a few minutes.
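For anyone following along, the job-array idea Justin describes can be sketched like this. This is an illustration, not the actual code from his branch; the function name, directives, and the ipengine flag are all made up for the example:

```python
# Illustrative sketch: build a single SGE submission script that starts
# N engines via a job array.  The "#$ -t 1-N" directive is SGE's
# job-array mechanism; each array task runs one ipengine, so a single
# qsub call replaces N separate job submissions.

def sge_array_script(n_engines, furl_dir="~/.ipython/security"):
    """Return the text of a hypothetical SGE job-array launch script."""
    return f"""#!/bin/sh
#$ -V
#$ -cwd
#$ -t 1-{n_engines}
ipengine --furl-file={furl_dir}/ipcontroller-engine.furl
"""

if __name__ == "__main__":
    print(sge_array_script(4))
```

The point of the job array is that SGE itself fans out the tasks, so the launcher only has to submit (and later delete) one job.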

> http://github.com/jtriley/ipython/tree/0.10.1-sge
>
> Also, I need to test this, but I believe this code will fail if the folder
> containing the furl file is not NFS-mounted on the SGE cluster. Another
> option besides requiring NFS is to scp the furl file to each host as is done
> in the ssh mode of ipcluster, however, this would require password-less ssh
> to be configured properly (maybe not so bad). Another option is to dump the
> generated furl file into the job script itself. This has the advantage of
> only needing SGE installed but certainly doesn't seem like the safest
> practice. Any thoughts on how to approach this?

Currently we assume that the user has a shared $HOME directory, which
is used to propagate the furl files.  There are obviously many ways of
setting up a cluster, but this is a common approach, so I think it
should be the default.  Using scp to copy the furl files is another
good option, and if we can make it work alongside the existing
approach, that would be great.  One warning, though: the version of
ipcluster in 0.10.1 has been completely rewritten for 0.11 to support
multiple cluster profiles and the new configuration system.  For now
the new ipcluster is based on Twisted, but I think we will drop
Twisted before we release.  All that to say, I don't think it is worth
putting too much time into ipcluster for 0.10.1; just enough to get it
working well with PBS and SGE is a good target.
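The scp option could look roughly like the following. This is a minimal sketch, not ipcluster code; the helper name, paths, and remote directory are made up, and it assumes password-less ssh is already configured:

```python
# Hedged sketch of scp-based furl propagation: copy the controller's
# furl file to each engine host before the engines start.  The commands
# are returned so they can be inspected (or tested) without a cluster.
import subprocess

def copy_furl_to_hosts(furl_path, hosts, remote_dir="~/.ipython/security"):
    """Build one scp command per host; return the command lists."""
    cmds = []
    for host in hosts:
        cmd = ["scp", furl_path, f"{host}:{remote_dir}/"]
        cmds.append(cmd)
        # subprocess.run(cmd, check=True)  # uncomment on a real cluster
    return cmds
```

The failure mode to handle is a host where ssh works but the remote directory does not exist yet; an `ssh host mkdir -p` before the copy would cover that.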

Brian

> Let me know what you think.
>
> ~Justin
>
> On 07/18/2010 12:05 AM, Brian Granger wrote:
>>
>> Is the array jobs feature what you want?
>>
>> http://wikis.sun.com/display/gridengine62u6/Submitting+Jobs
>>
>> Brian
>>
>> On Sat, Jul 17, 2010 at 9:00 PM, Brian Granger<ellisonbg@gmail.com>
>>  wrote:
>>>
>>> On Sat, Jul 17, 2010 at 6:23 AM, Satrajit Ghosh<satra@mit.edu>  wrote:
>>>>
>>>> hi,
>>>>
>>>> i've pushed my changes to:
>>>>
>>>> http://github.com/satra/ipython/tree/0.10.1-sge
>>>>
>>>> notes:
>>>>
>>>> 1. it starts cleanly. i can connect and execute things. when i kill
>>>> using
>>>> ctrl-c, the messages appear to indicate that everything shut down well.
>>>> however, the sge ipengine jobs are still running.
>>>
>>> What version of Python and Twisted are you running?
>>>
>>>> 2. the pbs option appears to require mpi to be present. i don't think
>>>> one
>>>> can launch multiple engines using pbs without mpi or without the
>>>> workaround
>>>> i've applied to the sge engine. basically it submits an sge job for each
>>>> engine that i want to run. i would love to know if a single job can
>>>> launch
>>>> multiple engines on a sge/pbs cluster without mpi.
>>>
>>> I think you are right that pbs needs to use mpirun/mpiexec to start
>>> multiple engines using a single PBS job.  I am not that familiar with
>>> SGE, can you start multiple processes without mpi and with just a
>>> single SGE job?  If so, let's try to get that working.
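[The mpirun/mpiexec approach for PBS could be sketched as below. Directive and flag names are illustrative, not taken from ipcluster; the idea is simply that one PBS job fans out N engines via MPI:]

```python
# Rough sketch: one PBS job whose script uses mpiexec to launch N
# ipengine processes, instead of submitting N separate jobs.
def pbs_mpi_script(n_engines):
    """Return the text of a hypothetical PBS launch script using MPI."""
    return f"""#!/bin/sh
#PBS -V
#PBS -l nodes={n_engines}
mpiexec -n {n_engines} ipengine
"""
```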
>>>
>>> Cheers,
>>>
>>> Brian
>>>
>>>> cheers,
>>>>
>>>> satra
>>>>
>>>> On Thu, Jul 15, 2010 at 8:55 PM, Satrajit Ghosh<satra@mit.edu>  wrote:
>>>>>
>>>>> hi justin,
>>>>>
>>>>> i hope to test it out tonight. from what fernando and i discussed, this
>>>>> should be relatively straightforward. once i'm done i'll push it to my
>>>>> fork
>>>>> of ipython and announce it here for others to test.
>>>>>
>>>>> cheers,
>>>>>
>>>>> satra
>>>>>
>>>>>
>>>>> On Thu, Jul 15, 2010 at 4:33 PM, Justin Riley<justin.t.riley@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> This is great news. Right now StarCluster just takes advantage of
>>>>>> password-less ssh already being installed and runs:
>>>>>>
>>>>>> $ ipcluster ssh --clusterfile /path/to/cluster_file.py
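[For context, a cluster_file.py for ipcluster's ssh mode looks roughly like this, going from memory of the 0.10 documentation; the host names and engine counts are illustrative:]

```python
# cluster_file.py -- sketch of the config consumed by "ipcluster ssh".
send_furl = True                 # copy the furl file to each host over scp
engines = {
    'node001.example.com': 2,    # hostname -> number of engines to start
    'node002.example.com': 2,
}
```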
>>>>>>
>>>>>> This works fine for now, however, having SGE support would allow
>>>>>> ipcluster's load to be accounted for by the queue.
>>>>>>
>>>>>> Is Satra on the list? I have experience with SGE and could help with
>>>>>> the
>>>>>> code if needed. I can also help test this functionality.
>>>>>>
>>>>>> ~Justin
>>>>>>
>>>>>> On 07/15/2010 03:34 PM, Fernando Perez wrote:
>>>>>>>
>>>>>>> On Thu, Jul 15, 2010 at 10:34 AM, Brian Granger<ellisonbg@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thanks for the post.  You should also know that it looks like
>>>>>>>> someone
>>>>>>>> is going to add native SGE support to ipcluster for 0.10.1.
>>>>>>>
>>>>>>> Yes, Satra and I went over this last night in detail (thanks to Brian
>>>>>>> for the pointers), and he said he might actually already have some
>>>>>>> code for it.  I suspect we'll get this in soon.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> f
>>>>>>
>>>>>> _______________________________________________
>>>>>> IPython-dev mailing list
>>>>>> IPython-dev@scipy.org
>>>>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Brian E. Granger, Ph.D.
>>> Assistant Professor of Physics
>>> Cal Poly State University, San Luis Obispo
>>> bgranger@calpoly.edu
>>> ellisonbg@gmail.com
>>>
>>
>>
>>
>
>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu
ellisonbg@gmail.com

