Beowulf Cluster project

kr0sys

I can't seem to find the search function; it might still be disabled. Anyway, I've decided to put together a Beowulf cluster for a class presentation. The topic is supercomputers in general, but I want to demonstrate a small cluster for the class, somewhere around 8+ nodes, or as much hardware as I can get my hands on. I have about 3 months to put this project together. I think some might agree that reading out of a book feels better than reading from a computer screen, so I'm in search of a good Beowulf clustering book. Any recommendations?

I understand Linux, but I'm new to clustering. :D I'm leaning towards a gigabit interconnect, but I don't think I can come up with the loot for all the gigabit NICs. Is there a threshold of hardware specs for maxing out a 100 Mbit interconnect? Or would a gigabit interconnect do wonders even for a cluster running common 500 MHz CPUs?


I've read about a few popular benchmarks such as climate prediction and protein modeling. I'm feeling something more graphical, maybe something that would take advantage of the master node's graphics card to output something fancy. Raw GFLOPS in numerical form looks great too.

thx
 
As for what mobo/CPU you should get, I'd say:

Get a bunch of Little Valley boards, 75 bucks a pop, plus RAM and an HDD and you are set.

I would think it would be extremely worth it for the gigabit interconnects, as the speed you will really achieve will be nowhere near that. As for benchmarking, find a SuperPi-like program and run that. Nothing shows off its power better than pure number crunching.
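Something along these lines would do for the number-crunching part; just a rough sketch assuming you go the usual MPI route on the cluster (the step count and the exact commands are only examples):

```c
/* pi_mpi.c -- toy number-crunching demo: estimate pi by numerical
 * integration of 4/(1+x^2) over [0,1], with the work split across
 * every process in the cluster.
 *
 * Build:  mpicc -O2 -o pi_mpi pi_mpi.c
 * Run:    mpirun -np 8 ./pi_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i, n = 100000000L;          /* number of integration steps */
    double h, x, sum = 0.0, local_pi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)n;
    /* each rank handles every size-th step, starting at its own rank */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    local_pi = h * sum;

    /* add up the partial results on rank 0 */
    MPI_Reduce(&local_pi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}
```

Run it with -np 1 and then with all the nodes and compare the wall-clock time; that's your pure number crunching demo right there.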
 
What kind of jobs is it going to do? I mean, jobs which require a lot of storage, or jobs with no need for big storage?

What OS are you going to use?
Fedora / CentOS / Red Hat?
 
If I have 4 computers here, what kinds of stuff could I do with a Beowulf cluster? And is it easy to set up?
 
It's not very hard; there are a lot of guides for clusters on the net.
Also, Knoppix has a ready-made live image for parallel computing, try that...
 
Well, that's the thing, I don't really need to simulate a hurricane or anything like that. So I take it that unless I am a scientist in need of some hardcore calculations, I would just cluster some of my computers together and call myself badass?
 
Well, that's the thing, I don't really need to simulate a hurricane or anything like that. So I take it that unless I am a scientist in need of some hardcore calculations, I would just cluster some of my computers together and call myself badass?

start folding
 
One of the CS professors on campus has offered to let me use a bunch of his PCs, but they are pretty old, most of them around 500 MHz. I'll use this copy of CentOS 5.1.

I'm not looking to make use of any storage for the benchmarks. How does that Little Valley board look?
 
Are there any benchmarks you could do for fun and compare your cluster to other people's?
 
It would provide multiple CPUs crunching a single protein, which would make it faster vs. one PC. The other argument is for multiple PCs working on multiple units at a time, sort of a parallel approach rather than a series approach.
 
Does every node have to be identical in hardware specs?

For example, proc speed, RAM count, motherboard, etc. I think that a majority of these roughly 500 MHz PCs are very different from one another. I figure that cloning a system image for each new addition wouldn't work out too well; my guess is that I would have to install each different system from scratch. But the process of handling data from the master node wouldn't change, or would it?
 
It sounds to me like this project is a proof of concept and the system will just be used as a demo during the presentation.

If that's the case then you don't need anything powerful and should just use the machines the professor offered you.
Assuming the systems are roughly the same spec, have one of them run the calculation alongside the cluster, to show how much faster the cluster is.
 
I'm sure there are clusterable password crackers.

Zip a file with a "weak" password (4 or 5 characters, all numeric). You can show the cluster running vs. a standalone machine. I would imagine you could see the number of tries/sec on both and the estimated time to completion.
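A rough sketch of how that keyspace split might look in MPI; try_password() is a made-up placeholder for whatever zip-cracking routine you'd actually call, the real point is just that each node takes every Nth candidate:

```c
/* crack_mpi.c -- sketch of splitting a small numeric keyspace across
 * the cluster.  try_password() is a stand-in for a real check (e.g.
 * calling into a zip cracker); only the partitioning is the point.
 */
#include <mpi.h>
#include <stdio.h>

/* hypothetical check -- replace with the real thing */
static int try_password(long candidate)
{
    return candidate == 73194;    /* pretend this is the "weak" password */
}

int main(int argc, char **argv)
{
    int rank, size;
    long i, found = -1, winner = -1;
    const long keyspace = 100000; /* all 5-digit numeric passwords */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* rank r tries candidates r, r+size, r+2*size, ... */
    for (i = rank; i < keyspace; i += size) {
        if (try_password(i)) {
            found = i;
            break;
        }
    }

    /* MPI_MAX picks out the hit, since every other rank still has -1 */
    MPI_Reduce(&found, &winner, 1, MPI_LONG, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        if (winner >= 0)
            printf("password found: %05ld\n", winner);
        else
            printf("not found\n");
    }

    MPI_Finalize();
    return 0;
}
```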

Riley
 
Do all the systems in the cluster have to have the same specs, or can they vary greatly?
 
Do all the systems in the cluster have to have the same specs, or can they vary greatly?


No.

I have different architectures at work and all of them work as one cluster, but in the Beowulf model they should be as similar as possible.
 
So what you are saying is that I cannot have a cluster composed of 3x 700 MHz P3s with 128 MB RAM, 5x 1100 MHz AMD Durons with 256 MB RAM, and 4x 1.8 GHz Athlon XP 2200+ based systems with 512 MB of RAM, with the master node being an Intel Core 2 Duo at 2-and-some-change GHz with 2 GB of RAM?

I just made that up for the sake of argument; I think a similar scenario is one that I will be faced with. If I am forced to have common hardware across the whole cluster, please let me know.

Just out of curiosity, what would happen if a Beowulf cluster were built and run on the hardware indicated above? :confused:
 
You'd need/want to have the machines identical in specs...
Check on eBay for cheap wholesale lots. Get a few cheap machines.
 
The machines don't have to be identical, but it does make life easier if you're actually going to use the thing as a computational cluster.

Most of the questions/statements in this thread make no sense at all considering you haven't said exactly what you want to run on the thing, kr0sys.
 
I guess I should make some suggestions for your demo. I would skip CentOS 5.1, install CentOS 4.6, and install the LAM-MPI binary RPM for RHEL4. Read the intro to LAM-MPI so you know how to use the compiler (mpicc) and how to run programs on the cluster (mpirun).

To get going quickly, you could try the root-mean-square image demo from the LAM tutorial: http://www.lam-mpi.org/tutorials/nd/part1/imageproc.cc

Some other simple ideas are:
- Matrix multiplication
- Image filtering (sharpening, softening, etc)

One nice thing about mpirun is you can specify how many and which nodes you'd like to use for computation. That makes it easy to show the difference in processing power between one CPU and the entire cluster.
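If you want something concrete to start from, here's a rough sketch of the matrix multiplication idea in plain MPI C; the matrix size, hostfile name, and node counts in the comments are just examples, and it assumes N divides evenly by the number of processes:

```c
/* matmul_mpi.c -- matrix multiplication split across the cluster:
 * rank 0 holds A and B, broadcasts B, scatters the rows of A, and
 * gathers the rows of C back.
 *
 * Build:              mpicc -O2 -o matmul_mpi matmul_mpi.c
 * Boot LAM:           lamboot hostfile     (hostfile lists your nodes)
 * Run on 1 CPU:       mpirun -np 1 ./matmul_mpi
 * Run on the cluster: mpirun -np 8 ./matmul_mpi
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 512   /* matrix dimension; must be divisible by process count */

int main(int argc, char **argv)
{
    int rank, size, i, j, k, rows;
    double *A = NULL, *B, *C = NULL, *a, *c, s, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    rows = N / size;                       /* rows of A per process */
    B = malloc(N * N * sizeof(double));
    a = malloc(rows * N * sizeof(double));
    c = malloc(rows * N * sizeof(double));

    if (rank == 0) {                       /* master fills in the inputs */
        A = malloc(N * N * sizeof(double));
        C = malloc(N * N * sizeof(double));
        for (i = 0; i < N * N; i++) { A[i] = 1.0; B[i] = 2.0; }
    }

    t0 = MPI_Wtime();
    MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Scatter(A, rows * N, MPI_DOUBLE, a, rows * N, MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    for (i = 0; i < rows; i++)             /* each node does its rows */
        for (j = 0; j < N; j++) {
            s = 0.0;
            for (k = 0; k < N; k++)
                s += a[i * N + k] * B[k * N + j];
            c[i * N + j] = s;
        }

    MPI_Gather(c, rows * N, MPI_DOUBLE, C, rows * N, MPI_DOUBLE,
               0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("%dx%d multiply on %d process(es): %.3f seconds\n",
               N, N, size, t1 - t0);

    MPI_Finalize();
    return 0;
}
```

Running it with -np 1 and then with the full node count gives you exactly the one-CPU-vs-cluster comparison for the demo.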
 
[Tripod]MajorPayne said:
It would provide multiple CPUs crunching a single protein, which would make it faster vs. one PC.
start folding
Does being in a cluster help?

AFAIK, F@H does not support any kind of clustering. If it did there would be mention of it in folding forums across the 'net. Unless there's a workaround I'm not aware of.
 
AFAIK, F@H does not support any kind of clustering. If it did there would be mention of it in folding forums across the 'net. Unless there's a workaround I'm not aware of.

You're probably right if you have any knowledge at all of F@H; I have almost zero. I was just talking in general terms of multiprocessing, where something could either be worked on in series, by all the CPUs together, or in parallel, by multiple units, one per CPU.
 
[Tripod]MajorPayne said:
You're probably right if you have any knowledge at all of F@H; I have almost zero. I was just talking in general terms of multiprocessing, where something could either be worked on in series, by all the CPUs together, or in parallel, by multiple units, one per CPU.
There's already an SMP client that runs 4 simultaneous threads for dual- and quad-core machines. However, that is not the same as clustering, which brings together multiple machines on a very fast network for aggregate parallel processing.

IIRC, a while ago there was mention of clustering in the DC forum and I believe some respondents mentioned that Stanford attempted it, but found it didn't work well with their clients. I'm not fully certain of this, so don't quote me on it. It's best to ask the question in that forum.
 
What advantage do people think they'd get running a special version of F@H on a cluster over just running separate instances on each machine? The collection of machines running F@H across the world could already be considered a massive heterogeneous cluster.

Maybe there is a reason, since it is mentioned on the F@H page:
http://folding.stanford.edu/English/FAQ-highperformance said:
Prof. Peck has been working on ways to easily run GROMACS on clusters and SMP machines in conjunction with Folding@Home.

Back on topic... kr0sys, how is the project going? I'd really like to help if I can.
 
A link to another post doesn't answer my question. What advantage do you think a cluster-ized F@H client would have over running separate instances on each machine?

I believe the cluster version would be slower. Tigerbiten provides some reasonable arguments and data to back that up in the thread you linked to.
 
A link to another post doesn't answer my question. What advantage do you think a cluster-ized F@H client would have over running separate instances on each machine?
It wasn't me who advocated clustering for F@H; I only provided an additional source of information available on this forum.

I believe the cluster version would be slower. Tigerbiten provides some reasonable arguments and data to back that up in the thread you linked to.
I totally agree with him and you. :)

 