[IPython-dev] Before a patch for LSF support
Sun Aug 9 14:34:40 CDT 2009
2009/8/9 Brian Granger <email@example.com>:
>> I ran into another issue: on a cluster, the home folder may be
>> different than on the access box. In that case, the .ipython/security
>> directory does not exist and the engine will not start (I've just tested this).
> Currently our model for ipcluster is that:
> .ipython/security is shared by all hosts and in the same location. If you
> don't have this situation, you will have to manually move the .furl files
> around and tell ipengine where the .furl files are located. You will also
> need to use persistent furl files. Docs on all this are here:
With ssh-based ipcluster, I didn't need to copy the furls, as I
launched it from the host where I launched ipython as well.
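
For the case where the home folder on the compute node differs, a check along these lines could be run on the node before starting the engine. This is only a sketch: the helper name ensure_security_dir is hypothetical and not part of IPython, and creating the directory does not by itself provide the .furl files, which would still have to be copied over as described above.

```python
import os


def ensure_security_dir(home=None):
    """Create <home>/.ipython/security if it is missing and return its path.

    Hypothetical helper for illustration; not part of IPython.
    """
    if home is None:
        # Resolve the home folder as seen by the current host.
        home = os.path.expanduser("~")
    path = os.path.join(home, ".ipython", "security")
    if not os.path.isdir(path):
        # Restrict permissions, since the directory holds connection files.
        os.makedirs(path, 0o700)
    return path
```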
> Let us know if you have other questions - this side of things can be very
> subtle. Another thing to watch out for: some batch systems *require* the
> processes on compute nodes to call MPI_Init upon starting. This can be
> accomplished by using mpi4py. See how we do this in the mpiexec/mpirun
> versions of ipcluster. But on some systems (depending on the MPI) that is
> not enough. Some systems require that the *VERY FIRST* thing a process
> does is call MPI_Init. On these systems you will need to build a custom
> version of the python binary that handles this correctly. Again, mpi4py
> provides such a binary. Hopefully you won't have to deal with these things!
I hope so! I don't think LSF requires calling MPI_Init, but first I
have to get access to the log files (I don't understand why they were
not copied, even though the job was submitted).
>> Also, I've tried to extract the job id (it seems it is needed), but
>> the BatchEngineSet.parse_job_id extracts everything that is matched by
>> the regexp describing a job (it uses group()). I had to put "Job
>> <(\d+)>" as a regexp, so group() returns, for instance, "Job <1234>"
>> instead of "1234". I may submit a patch to get group(1) and modify the
>> PBS regexp accordingly.
> Yes, you will *very* likely have to modify the various regexps.
Is the exact job ID really needed? Perhaps to kill the job?
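
To make the group() versus group(1) difference concrete, here is a minimal sketch. The sample submission line and the function name extract_job_id are assumptions for illustration; this is not the actual BatchEngineSet.parse_job_id code.

```python
import re

# Assumed sample of what the submit command prints; real output may differ.
sample_output = "Job <1234> is submitted to queue <normal>."

# The regexp quoted above, with one capturing group around the digits.
job_id_regexp = re.compile(r"Job <(\d+)>")


def extract_job_id(output, regexp=job_id_regexp):
    """Return only the captured job id, or None if nothing matches."""
    match = regexp.search(output)
    if match is None:
        return None
    # group(1) is the first parenthesised group, i.e. just the digits.
    return match.group(1)


match = job_id_regexp.search(sample_output)
whole_match = match.group()  # the entire matched text, "Job <1234>"
job_id = extract_job_id(sample_output)  # only the digits, "1234"
print(whole_match)
print(job_id)
```

With group(1), a regexp that has no capturing group raises an error, so the PBS regexp would need parentheses around the id as well.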
Information System Engineer, Ph.D.
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92