[SciPy-dev] Re: cow: 'Connection reset by peer' timeout problem?
Simon Saubern
simon.saubern at molsci.csiro.au
Sun Feb 2 18:17:33 CST 2003
Thanks for replying Eric, but I thought that the ssh connection
didn't work under W2K? At least that was my impression from reading
the cow code.
I've actually added an extra loop to my code that packages the data
into smaller chunks and keeps the reply time under 4min. This seems
to work for calculations that take up to 4h. I can get to 10h if I
make the packages small enough to take about 1.5min. The smaller the
packages the better, but the network traffic increases at the same
time, and the proportion of time that the slaves spend actually doing
calculations decreases.
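For anyone wanting to try the same workaround, here is a rough sketch of the chunking loop in plain Python. It is not taken from cow: process_chunk is a hypothetical stand-in for whatever call ships one package to the slaves and waits for the reply, and chunk_size_for_budget assumes every item costs roughly the same time, which may not hold for real jobs.

```python
def split_into_chunks(items, chunk_size):
    """Yield successive lists holding at most chunk_size items each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def chunk_size_for_budget(seconds_per_item, budget_seconds):
    """Estimate how many items fit in one reply-time budget.

    Assumes a roughly constant cost per item (an assumption, not a
    measured property of any particular calculation).
    """
    return max(1, int(budget_seconds // seconds_per_item))


def run_in_chunks(items, chunk_size, process_chunk):
    """Hand each chunk to process_chunk and collect the results in order.

    process_chunk is a hypothetical callable that takes a list of items
    and returns a list of results; it stands in for the remote call.
    """
    results = []
    for chunk in split_into_chunks(items, chunk_size):
        results.extend(process_chunk(chunk))
    return results
```

With a 90 second budget (the 1.5min figure above) and an estimated 2 seconds per item, chunk_size_for_budget(2.0, 90) gives 45 items per package. Smaller packages mean more round trips, so the trade-off between reply time and network overhead is still there.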
Some of the calculations that I've run take 18-27h, which requires me
to log in from home in the wee hours to restart the calculations.
Are there any other distributed computing environments out there for
Python that run under W2K? PyMPI would require me to convince all my
colleagues to install Linux on their desktop machines - which just
isn't going to happen (yet).
Cheers,
Simon
>From: "eric jones" <eric at enthought.com>
>To: <scipy-dev at scipy.net>
>Subject: RE: [SciPy-dev] cow: 'Connection reset by peer' timeout problem?
>Date: Sat, 1 Feb 2003 04:32:39 -0600
>Reply-To: scipy-dev at scipy.net
>
>Hey Simon,
>
>I don't remember ever seeing this, but it has been about a year since I
>used cow heavily. At the time, the jobs I ran lasted about 1 minute
>each, so I didn't run into the 4 minute time out you are seeing.
>
>I can't think of a technical reason why 4 minutes is a magic number from
>the Python code standpoint. There is a timeout value I believe, but it
>wouldn't cause the error you are seeing.
>
>Could it have something to do with ssh timing out and disconnecting?
>
>eric
>
>----------------------------------------------
>eric jones            515 Congress Ave
>www.enthought.com     Suite 1614
>512 536-1057          Austin, Tx 78701
>