[Numpy-discussion] Hardware for Monte Carlo simulation
rossini at blindglobe.net
Tue Nov 27 09:44:02 CST 2001
>>>>> "HJL" == Hung Jung Lu <hungjunglu at yahoo.com> writes:
HJL> Again, I have a tangential question. I am hitting the
HJL> physical limit of the CPU (meaning things have been optimized
HJL> down to assembly level), in order to achieve even higher
HJL> performance, the only way to go is hardware.
HJL> Is there any recommendation for fast machines at the price
HJL> range of a few thousand dollars? (I cannot afford
HJL> supercomputers or connection machines.) My purpose is to run
HJL> Monte Carlo simulation. This means that a lot of scenarios
HJL> can be run in parallel fashion. Of course I can just use
HJL> regular cheap Pentium boxes... but they are kind of bulky,
HJL> and I don't need any of the video, audio, USB features (I
HJL> think 10 machines at 1GHz each would be the size of
HJL> calculation power I need, or equivalently, a single machine
HJL> at an equivalent 10GHz. Heck, if there are some specialized
HJL> racks/boxes, I can wire the motherboards myself.) I am
HJL> wondering what you people do for heavy number crunching? Are
HJL> there any cheap yet specialized machines? What about machines
HJL> with dual processor? I would imagine a lot of people in the
HJL> number crunching world run into my situation, and since the
HJL> number crunching machines don't require much beyond a
HJL> motherboard and a small hard-drive, maybe there are already
HJL> some cheap solutions out there.
The usual way is to build some "blackboxes", i.e. mobo/cpu/memory/NIC,
diskless or nearly diskless (you don't want to maintain machines :-).
Connect them using 100bT or faster networks (though 100bT should be
Do such things exist? Sort of -- they tend to be more expensive than
building them yourself, but if you've got a reliable local supplier,
they can build them fairly cheaply for you. I'd go with single or
dual Athlons, myself :-). If power and maintenance are an issue, go
with duals; if not, maybe singles.
We use MOSIX (www.mosix.org) for transparent load balancing between
Linux machines, and it could be used on the machines I described
(using a floppy or CD to boot).
The next question is whether some form of parallel RNG will help. The
answer is "maybe". I worked with a student who evaluated coupled
chains, and we couldn't do too much better.
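
To make the parallel RNG question concrete, here is a minimal sketch of the simplest approach: give every worker its own generator with a distinct seed. The names make_streams and estimate_pi are hypothetical, and the pi estimate is only a stand-in workload. Distinct seeds keep workers from replaying the same stream, but that is a convention, not a proof that the streams are statistically independent.

```python
import random


def make_streams(base_seed, n_workers):
    """Return one generator per worker, each seeded differently.

    Assumption: a string seed "base:worker" is enough to separate the
    streams. This avoids identical streams; it does not prove that the
    streams are statistically independent of each other.
    """
    return [random.Random(f"{base_seed}:{i}") for i in range(n_workers)]


def estimate_pi(rng, n_samples):
    """Stand-in Monte Carlo workload: estimate pi from one stream."""
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples
```

Each machine would call estimate_pi with its own entry from make_streams, so rerunning with the same base seed reproduces every worker's results.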
After that, the question is how you want to post-process the
results. If you want to automate the whole thing
(and it isn't clear that it would be worth it, but...), you could use
PyPVM to front-end the sub-processes distributed on the network,
load-balanced at the system level by MOSIX.
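
As a rough illustration of the front-end plus post-processing idea, the sketch below farms scenarios out to worker processes and then pools the per-scenario estimates. It is an assumption-laden stand-in: it uses a local process pool from the Python standard library rather than PyPVM or MOSIX, and run_scenario, run_all and summarize are hypothetical names with a toy workload (estimating E[Z^2] = 1 for a standard normal Z).

```python
import math
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def run_scenario(seed):
    """One independent scenario with its own seeded generator."""
    rng = random.Random(seed)
    n = 10_000
    total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return total / n  # toy estimate of E[Z^2], which is 1


def run_all(seeds, max_workers=None):
    """Front end: hand one scenario per seed to a pool of processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_scenario, seeds))


def summarize(estimates):
    """Post-process: pooled mean and its standard error across scenarios."""
    mean = statistics.mean(estimates)
    stderr = statistics.stdev(estimates) / math.sqrt(len(estimates))
    return mean, stderr
```

In a script, call summarize(run_all(range(10))) under an `if __name__ == "__main__":` guard so the worker processes can import the module cleanly.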
Now for the problems -- MOSIX seems to have difficulties with Python.
Severe difficulties. I don't know if it still holds true for recent
(note that I use R (www.r-project.org) for most of my simulation work
these days, but am looking at Python for stat analyses, of which MCMC
tools are of interest).
A.J. Rossini Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics rossini at u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net rossini at scharp.org
-------------- http://software.biostat.washington.edu/ --------------
FHCRC: M-W: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW: T-Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email.