Web Hosting Talk







Is Fibre SAN the ONLY solution for clustering application servers?


Circa3000
09-07-2001, 04:45 AM
Hi guys,

Do application servers require an SAN to cluster/load-balance?

[I seem to have stumped my peers on the ColdFusion forums last month with this one. Perhaps a broader audience can help.]

To clarify (and generalize - I won't limit this discussion to ColdFusion, PHP or ASP servers), we're looking to cluster/load-balance our shared server farm. However, the majority of our customers' web sites are not static. They feature dynamic and database-driven content. Sure, we can offload the databases to clustered database servers, but that doesn't help with all of those customer-built applications that write files, for example. If we are to cluster these servers, then those file writes must be mirrored across each cluster member in real time. Otherwise, server content will quickly go out of sync.

I understand that two servers can share a single SCSI RAID enclosure, but that's insufficient. We'd like to add any number of servers to our cluster, with each of them sharing the same virtual disk. Is a fibre-channel (or copper, for that matter) SAN the only solution?

Any advice is greatly appreciated.

huck
09-07-2001, 08:02 AM
I am not exactly clear about your implementation or what OS you are running. However, I know some people who have used dual 100 Mbps Ethernet connections to a central, beefed-up database. The config looks like the one below:


web server farm (5 machines) --100 Mbps lines--> app servers (3 machines) --100 Mbps--> database servers (2 machines) --> intranet


This is all on Linux (except the databases) with appropriate firewalls etc. Basically, the front five boxes are set up to load-balance web traffic (one is a proxy server with 2GB of RAM, and one serves up images on a lightweight thttpd server). The app servers run mod_perl code, soon to be JSP. The databases are Oracle boxes on Sun hardware. Each machine connects through dual 100 Mbps Ethernet links, for a total of 200 Mbps full duplex per machine.

I don't know why you could not connect your farm just using Ethernet. If 100 Mbps pipes are not big enough, then use gigabit ones. In the config above we have a total of 1 Gbps going between the web servers and the app servers (5 machines x 200 Mbps), and a 600 Mbps total pipe between the app servers and the databases (3 machines x 200 Mbps).
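For anyone checking the totals, they are just machines times links times link speed. A rough sketch, assuming two 100 Mbit/s links per machine as described above and counting nominal capacity only (the function and constant names are made up for illustration):

```python
LINK_MBPS = 100          # one Fast Ethernet link, in megabits per second
LINKS_PER_MACHINE = 2    # "dual" connections per machine (from the post above)


def tier_bandwidth_mbps(machines, links=LINKS_PER_MACHINE, link_mbps=LINK_MBPS):
    """Nominal aggregate bandwidth of one tier, in Mbit/s."""
    return machines * links * link_mbps


web_tier = tier_bandwidth_mbps(5)   # five web servers
app_tier = tier_bandwidth_mbps(3)   # three app servers

print(f"web tier: {web_tier} Mbit/s")
print(f"app tier: {app_tier} Mbit/s")
```

These are ceilings, not measured throughput; real numbers depend on the switches and on how traffic is spread across the links.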


I did not set this up, so I don't know the details. But it is a nice configuration because the machines running the databases are completely separate from the app and web server farms. This is what allows us to switch from the mod_perl code to JSP with minimal changes on the database backend.

Circa3000
09-07-2001, 03:32 PM
Thank you, Huck.

Actually, we're implementing Windows 2000 servers, though it shouldn't matter.

If I understand your solution, the front-end web servers are clustered, but are the three supporting application servers clustered as well? If so, where do they store their data? File server? NAS? SAN? Or does some sort of software mirror content across the clustered application servers in real time?

This is where we're having our difficulty. Clustering static web sites is a no-brainer, as is deploying a beefed-up database server, but in order for those three application servers to mirror content changes, the data must be centralized (or synced in real time).

Here's a good example: Imagine that one of your hosting customers runs an MP3 music-swapping library (all perfectly legal, of course). If application server #1 receives a music file upload, renames it, watermarks it, and performs various other file operations, then the other application servers must mirror those actions immediately. Otherwise, load-balancing will send the next visitor to server #2, which doesn't recognize the new music file at all. Sure, the file name may be cataloged in a database, but when server #2 or server #3 attempts to retrieve the music file from its local drive, it is not there.
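To make the failure mode concrete, here is a minimal sketch of the two cases. Everything in it is hypothetical: the AppServer class stands in for one load-balanced application server, and temporary directories stand in for each server's local drive and for a shared volume (file server, NAS or SAN). It says nothing about performance, only about visibility of the uploaded file.

```python
import tempfile
from pathlib import Path


class AppServer:
    """Hypothetical stand-in for one load-balanced application server."""

    def __init__(self, name, storage_dir):
        self.name = name
        self.storage = Path(storage_dir)

    def upload(self, filename, data):
        # Write the uploaded file to whatever storage this server uses.
        (self.storage / filename).write_bytes(data)

    def has_file(self, filename):
        return (self.storage / filename).exists()


def demo():
    with tempfile.TemporaryDirectory() as disk1, \
         tempfile.TemporaryDirectory() as disk2, \
         tempfile.TemporaryDirectory() as shared:
        # Case 1: each server writes to its own local drive.
        s1, s2 = AppServer("app1", disk1), AppServer("app2", disk2)
        s1.upload("song.mp3", b"fake audio")
        local_visible = s2.has_file("song.mp3")

        # Case 2: both servers point at the same shared volume.
        s1, s2 = AppServer("app1", shared), AppServer("app2", shared)
        s1.upload("song.mp3", b"fake audio")
        shared_visible = s2.has_file("song.mp3")
    return local_visible, shared_visible


result = demo()
print(result)
```

With local drives, the second server never sees the upload; with a shared volume it sees it immediately, which is the behavior we need from whatever storage backs the cluster.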

I understand that using a file server to drive a cluster can quickly overwhelm the file server with requests from the web/application servers. Apparently, SANs overcome this limitation by bypassing the operating system, allowing the remote storage to perform virtually as a local drive. (Of course, 1Gb+ network speeds don't hurt either.)

So, I'm very curious to know how those three application servers are configured.

Again, your help is greatly appreciated.