[IPython-User] ipython parallel
Fri Jun 22 11:03:18 CDT 2012
When I first saw 0.12 I only explored the new notebook feature and that was a stunning experience. I have now started to look into parallel processing with ipython and I am blown away. This is really great work!!! Thanks for sharing this!!
There are a couple of things that I haven't quite figured out yet (I have read through most of the manual):
Is there an easy way to start a cluster on a server and then easily connect to it from a client? (I think this would make an important example for the tutorial; I didn't easily find one.)
Running ipcluster start on the server starts an ipcontroller that doesn't seem to listen on any outside interface. So what I have done so far is run ipcontroller --ip=<myip> and then ipcluster engines, copy the client JSON file over to the client machine, and then connect. Is there a better way to do this?
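For the record, these are roughly the steps I used (the IP, hostnames, and the location of the security file are from my setup and may well differ on other installs):

```shell
# on the server: listen on an external interface, not just localhost
ipcontroller --ip=192.168.1.10

# in a second terminal on the server: start engines against the running controller
ipcluster engines

# copy the connection file to the client machine
scp ~/.ipython/profile_default/security/ipcontroller-client.json me@laptop:~/
```

On the client I then pass the copied JSON file to Client() when connecting.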
My question is how to push data when using a LoadBalancedView (the documentation says this doesn't make sense, but I think for my problem it does). I am running many Monte Carlo simulations (thousands, each one an independent job) that each need several hundred MB of exactly the same data. My idea is to preload this data on all of the engines once, so that each job can just access it as a global variable. How do I do this?
Another question is what happens if a view is closed. Do the engines clear out all of the pushed data and go into a pristine state again (that would be desirable, imho).
This is a great tool for building a small cloud for a research team. I suggest adding a simple config file, to go along with ipcluster, that lists IP addresses, the number of engines to start on each host, and the connection method. ipcluster would then start engines on the specified hosts and build the cluster. This is just a really minor suggestion.
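For example, something along these lines (a completely made-up format, just to illustrate the idea):

```
# hypothetical ipcluster hosts file
[controller]
ip = 192.168.1.10

[engines]
# host = number of engines to start there
node1.example.com = 8
node2.example.com = 4

[connection]
method = ssh
```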