Web Hosting Talk

scyld beowulf


ClusterMania
11-27-2001, 02:03 AM
I did a search on Scyld Beowulf and was surprised to find that there aren't many mentions of it on this board. I want to build a web cluster for webhosting, but not many people seem to have much experience with it. Has anyone in here ever built a cluster for webhosting? Using parallel processing for Apache processes might be cool: every time server load gets high, you could just add a webserver to bring the load back down.

Foo-Dawg
11-27-2001, 06:39 AM
If you are using a Beowulf for something like simple Apache load balancing, it's often much easier to just set up two computers as standalone Apache servers, and then configure BIND so that users are randomly directed to one box or the other.
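
To illustrate the idea, here is a rough Python sketch of round-robin name resolution. It is not BIND configuration and does not claim to reproduce BIND's behaviour; the `RoundRobinZone` class and the two addresses are hypothetical, made up for the example. It only shows how rotating the order of the records spreads clients across the two boxes.

```python
from collections import deque


class RoundRobinZone:
    """Toy stand-in for a DNS server holding several A records for one name."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        # Return the full record set, then rotate it so the next query
        # sees a different address first.
        answer = list(self._addresses)
        self._addresses.rotate(-1)
        return answer


# Hypothetical addresses for the two standalone Apache boxes.
zone = RoundRobinZone(["192.0.2.10", "192.0.2.11"])

# Assume each client simply connects to the first address it is given.
first_choices = [zone.resolve()[0] for _ in range(4)]
print(first_choices)
```

Note that nothing here checks whether a box is actually up, which is one limitation of plain DNS-based distribution.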

ClusterMania
11-27-2001, 07:42 AM
Originally posted by Foo-Dawg
If you are using a Beowulf for something like simple Apache load balancing, it's often much easier to just set up two computers as standalone Apache servers, and then configure BIND so that users are randomly directed to one box or the other.

Will everything uploaded to server one have to be cloned onto server two? It's quite expensive to set it up like that.

Tarin
11-28-2001, 12:02 AM
FYI, I think you need to do a bit more research :) Beowulf-style clusters are totally unsuitable for webhosting operations.

I've done several types of clusters; beowulf-style (batch), failover, and load balancing. Here's the difference:

I tend to call beowulf-style clusters 'batch' processors. Typically, you take a large dataset, divide it into smaller parts, dispatch those smaller parts across the cluster, process them, then reassemble the processed parts when done to produce the whole 'answer'. Example problems would be oil exploration and major graphics processing. For example, some of the special effects in 'Titanic' were done on a beowulf-style cluster of Alpha processors running Linux; if I recall correctly, 'Shrek' was rendered on a similar type of cluster. This is what you might use Scyld Beowulf for. This is a kind of load balancing, but it's very much controlled and not real-time. Typically, the hardest part of this type of cluster is breaking down the problem into the smaller parts and delivering them efficiently to the nodes :)
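
As a minimal sketch of that divide / dispatch / reassemble pattern, here is a small Python example. It is only an illustration: threads stand in for cluster nodes, and the function names (`split`, `process`, `run_batch`) and the sum-of-squares workload are assumptions made up for the example, not anything Scyld Beowulf provides.

```python
from concurrent.futures import ThreadPoolExecutor


def split(dataset, parts):
    """Divide the dataset into at most `parts` contiguous chunks."""
    size = max(1, -(-len(dataset) // parts))  # ceiling division
    return [dataset[i:i + size] for i in range(0, len(dataset), size)]


def process(chunk):
    """Placeholder for the real per-node work; here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_batch(dataset, nodes=4):
    # Step 1: break the problem into smaller parts.
    chunks = split(dataset, nodes)
    # Step 2: dispatch the parts; threads stand in for cluster nodes.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(process, chunks))
    # Step 3: reassemble the partial results into the whole answer.
    return sum(partials)


total = run_batch(list(range(1000)))
print(total)
```

In a real cluster the dispatch step would ship each chunk over the network to a node, which is where the delivery cost mentioned above comes in.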

Failover clusters typically use a shared medium to manage failovers, and are best used when you can't afford to have downtime on particular problems, especially databases. Put the database files on a shared storage array, hook two computers into it, then have software arbitrate which computer is accessing the storage at any given time. Typically, one computer is in control until it fails -- then the secondary takes over. This is why it's called a 'fail-over'. You might want to look into Linux FailSafe or SteelEye for this kind of cluster, if you're thinking Linux. Most major OSes have something along these lines... Typically, the hardest part of this type of cluster is pricing -- the specialized storage arrays tend to be fairly pricey.
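
The arbitration logic can be sketched roughly as follows in Python. This is a toy model under stated assumptions, not how Linux FailSafe or SteelEye actually work: the `Node` and `FailoverPair` classes are hypothetical, and node health is just a flag set by hand rather than a real heartbeat.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.healthy = True  # stand-in for a real heartbeat check


class FailoverPair:
    """Active/passive pair: only one node owns the shared storage at a time."""

    def __init__(self, primary, secondary):
        self.active = primary
        self.standby = secondary

    def check(self):
        # Arbitration step: if the active node has failed and the standby
        # is healthy, hand ownership of the storage to the standby.
        if not self.active.healthy and self.standby.healthy:
            self.active, self.standby = self.standby, self.active
        return self.active.name


db1, db2 = Node("db1"), Node("db2")
pair = FailoverPair(db1, db2)

owner_before = pair.check()
db1.healthy = False  # simulate the primary failing
owner_after = pair.check()
print(owner_before, owner_after)
```

The sketch leaves out the hard part, which is making sure the failed node has really let go of the storage before the other one takes over.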

Load-balancing clusters are really what you want for web hosting. Examples of these would be the 'Linux Virtual Server' project, or 'web switches' such as Foundry ServerIrons, Cisco LocalDirectors, etc. In the case of web-hosting-centric load-balancing clusters, the cluster takes incoming connections and distributes them amongst the back-end servers. The hardest part of these clusters is typically data synchronization amongst the back-end servers.
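
For a feel of what "distributes them amongst the back-end servers" means, here is a small Python sketch of a dispatcher. The least-connections policy, the `LoadBalancer` class, and the server names are all assumptions chosen for illustration; this is not a description of how Linux Virtual Server or any particular web switch is implemented.

```python
class LoadBalancer:
    def __init__(self, backends):
        # Track the number of open connections per back-end server.
        self.open_connections = {name: 0 for name in backends}

    def accept(self):
        # Send the new connection to the backend with the fewest open
        # connections; ties go to the backend listed first.
        backend = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[backend] += 1
        return backend

    def close(self, backend):
        # A connection finished, so that backend has spare capacity again.
        self.open_connections[backend] -= 1


lb = LoadBalancer(["web1", "web2", "web3"])
assigned = [lb.accept() for _ in range(4)]
lb.close("web2")
next_backend = lb.accept()
print(assigned, next_backend)
```

Adding capacity then amounts to adding another name to the backend list, provided every backend serves the same content.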

Keep in mind that while the jobs are separate, the clusters can be totally complementary: a load-balanced web farm, using a fail-over database cluster, with web/traffic logs being crunched in a beowulf cluster :)

bitserve
11-28-2001, 03:56 PM
Tarin, I didn't start this thread, but wanted to thank you for taking the time to explain that to everyone.