Web Hosting Talk







Load-Balancing on the Cheap?


iguanalove
01-22-2003, 03:09 PM
Newbie question here, forgive me oh wise ones:

Is it possible to do load-balancing on the cheap?

Background: We have two dedicated servers (one Web, one DB) at a commercial hosting place, which ain't cheap. But high traffic is maxing out the Web server. If we get another Web server at the hosting place and load-balance there, the already-high hosting price will double.

Is it possible to load-balance to an external Web server hosted at a cheaper location, like in-house? Would it be possible to install load-balancing hardware at the dedicated host location to load-balance to an external server? (All this assumes their cooperation. Also, we host the DNS for the site already.) Or is our only real option to install another Web server at the host and load-balance there?

In short, are there any cheap ways to load-balance a professionally hosted site, or are you stuck with just paying more to the hosting provider for a much more expensive and more robust system there?

Thanks in advance...

eddy2099
01-22-2003, 04:17 PM
It might not be load-balancing in the true sense, but you could consider splitting your site up and hosting it across several servers if needed.

If you have two servers, you could set up a subdomain on the second machine and move some pages or resources over.

For instance, if you run a web site with a forum, you could move the forum over to the second machine and reference it through a sub-domain name such as forum.mydomain.com or something.

Or, like some people do, name the second and subsequent machines www2.mydomain.com, www3.mydomain.com, and so on.

If you are using an Apache Server, you could probably set up some redirects in there too.

So if you have a product description on the second server (www2.mydomain.com) for a page at, say, /someproduct/index.html, you could either link to it directly as http://www2.mydomain.com/someproduct/index.html

or keep the link as

http://www.mydomain.com/someproduct/index.html and set up a URL redirect from it to http://www2.mydomain.com/someproduct/index.html .

This way you could move your resources around and it would be transparent to your clients.

We host shareware and at times have a substantial load from downloads, so we use the redirect to divert some traffic to a remote site and remove the redirect when the load dies down.
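The redirect idea above can be sketched in a few lines of Python. This is only an illustration of the decision being made, not the Apache setup described in this post: `OFFLOADED_PREFIXES`, `redirect_target`, and the `divert` flag are hypothetical names, the `/downloads/` entry is made up, and the hosts are the placeholder names used in this thread.

```python
from typing import Optional

# Hypothetical table: path prefix on the main site -> host that now serves it.
OFFLOADED_PREFIXES = {
    "/someproduct/": "www2.mydomain.com",
    "/downloads/": "www3.mydomain.com",
}


def redirect_target(path: str, divert: bool = True) -> Optional[str]:
    """Return the URL to redirect to, or None to serve the request locally.

    Setting divert=False models removing the redirect once the load dies down.
    """
    if not divert:
        return None
    for prefix, host in OFFLOADED_PREFIXES.items():
        if path.startswith(prefix):
            # Same path, different host, so existing links keep working.
            return f"http://{host}{path}"
    return None
```

With the table above, `redirect_target("/someproduct/index.html")` yields the www2 URL, while a path that matches no prefix is served by the main server as usual.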

It is not actual load-balancing in the true sense but it helps to balance the load.

Hope that gives you some clues.

iguanalove
01-23-2003, 12:16 PM
Don't know why your sub-title on the forum is 'totally clueless', Eddy. Your reply did indeed give me some clues as to a new approach we hadn't even considered--offloading the many Product images on the site to an external server and calling them from there!

It's not a particularly elegant solution, but it might help mitigate some of the traffic load on the 'main' Web server and reduce bandwidth use, which are the two main concerns.

Thanks!! (BTW, additional replies still welcome...)

eddy2099
01-23-2003, 08:00 PM
Haa haa. Well, compared to the rest of the members here, I rank at the very bottom in terms of knowledge of the web hosting business and server administration. A lot of things are second nature to them, but quite a huge chunk of it I still do not understand. If I were as good as any of them, I would have launched my own web hosting business, but no, I am not confident enough to do that yet. I am giving myself the next 10 to 15 years to familiarize myself with the playing field before I decide.

Being clueless about all the techno-mumbo-jumbo helps me look at things on a simple level. Besides, I've done temporary off-loading quite a number of times: I am entitled to 300 GB of bandwidth a month through Nocster, and each time Bandmin shows I have gone over the 10 GB/day limit, I redirect some traffic out and bring it back when things fall to a more acceptable level.

In any case, I found the redirect and subdomain method used by companies such as IBM and thought that if they use it, it must be acceptable.

Hope you do find a better solution.

vguile
01-23-2003, 08:11 PM
Since you have control over the DNS, you may want to consider round-robin DNS. For example, do an nslookup or dig on cnn.com and you'll see multiple IPs come up.
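For anyone who would rather check this from a script than with nslookup or dig, here is a minimal sketch using Python's standard `socket.getaddrinfo`. The helper names `unique_ipv4` and `lookup` are made up for illustration, and what comes back depends on the resolver you query.

```python
import socket


def unique_ipv4(addrinfo):
    """Collect distinct IPv4 addresses from getaddrinfo() results, keeping order."""
    seen = []
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET and sockaddr[0] not in seen:
            seen.append(sockaddr[0])
    return seen


def lookup(hostname):
    """Resolve hostname and return every IPv4 address the resolver hands back."""
    results = socket.getaddrinfo(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    return unique_ipv4(results)
```

On a machine with network access, calling `lookup` on a name that uses round-robin DNS should return more than one address; a name with a single A record returns a one-element list.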

I'm not an expert at it, nor have I implemented it, but I thought I'd throw it out there.

dandanfirema
01-23-2003, 08:30 PM
Round-robin DNS is a good option, but we need to understand a little more about your site. Basically, you could just make a copy of your site files and place it on a second web server, which would still connect to your database server for its data. So as long as you keep the files uploaded to both servers and they don't change, you just modify your DNS to point to both servers.

Netbridge
01-23-2003, 08:47 PM
How are you going to run round robin on a server holding a forum? Will you not get 2 different versions as what I post here will not be on the other server so only half the people see it?

vguile
01-23-2003, 09:00 PM
Definitely true. Round robin has its advantages, and, as you pointed out, its disadvantages.

But as dandanfirema pointed out, we need to know more about the site. If it's mostly brochure-ware then it's really a no-brainer.

BurstNET
01-24-2003, 03:12 AM
We do round-robin/clustering DNS for a lot of clients.
It works very well.
As long as your web server has static content, you'll be fine.
Multiple static-content web servers with round-robin/clustering DNS, connecting to a single SQL server (or two, if configured for it with exact duplicate real-time data), is a VERY viable solution.

Sean R.
BurstNET
System Administration

dynamicnet
01-24-2003, 10:38 AM
Greetings:

On DNS Round Robin:

I found http://www.tek-tips.com/gfaqs.cfm/lev2/4/lev3/31/pid/333/fid/1754 to be of interest specifically at the "Round-robin DNS" section.

It did point out the following:

=== START CLIP ===

"First, load balancing is only performed once for each client at the beginning of a session.

Second, it's possible for the load balancing scheme to get a little skewed.

For instance, all the users that have been sent to MyServer1 may go to lunch while all the users who have been sent to MyServer2 continue to send requests. In this case, one server could become overloaded while another server is sitting by idly.

Third, A more significant problem with session-based load balancing is that it exposes the IP addresses of the servers in the farm to the client-side browser.

What happens when a server crashes or is taken offline? Your balancing algorithm needs to account for this as soon as possible, but doing so can be problematic. If you're passing out bad IP addresses, your users will start to receive "server not available" errors.

In a round-robin DNS system, it still can take as long as 48 hours to fix the problem once you've discovered that one of your servers has crashed.

This is due to the fact that the changes to your IP address mappings need to be propagated to DNS servers throughout the Internet."

=== END CLIP ===

http://www.vergenet.net/linux/has/html/node10.html also does a good job at reporting the problems of DNS-based methods of load balancing:

=== START CLIP ===

1. The time to live (TTL) on the zone files needs to be turned down severely to reduce the time for which results are cached.

The longer the TTL, the less control there is over which IP addresses that end-users are accessing. The shorter that TTL, the greater the potential for congestion on the DNS server.

2. Users may access servers using an IP address rather than a host name.

3. Users may use non-DNS methods such as an /etc/hosts file to map server host names to IP addresses.

4. An additional problem with round-robin DNS is that the DNS daemon cannot differentiate between a request for a one-off hit, and a request that will result in many hits. That is, it is hard to control the granularity of scheduling.

5. When using round-robin DNS there is no way to assign weights to servers, all servers will theoretically receive the same number of requests, regardless of their resources and current load.

=== END CLIP ===
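The caching problem both clips describe can be shown with a toy model. This is a deliberately simplified sketch, not how a real DNS server works: `RoundRobinZone` and `CachingResolver` are hypothetical classes, the zone hands out one address per query instead of a rotated record set, and time is passed in by hand as `now` (in seconds).

```python
class RoundRobinZone:
    """Toy authoritative server: hands out the next address on every query."""

    def __init__(self, addresses, ttl):
        self.addresses = list(addresses)
        self.ttl = ttl  # seconds a resolver may cache an answer
        self._next = 0

    def query(self):
        answer = self.addresses[self._next % len(self.addresses)]
        self._next += 1
        return answer

    def remove(self, address):
        """Take a crashed server out of rotation."""
        self.addresses.remove(address)


class CachingResolver:
    """Toy resolver: reuses its cached answer until the TTL expires."""

    def __init__(self, zone):
        self.zone = zone
        self._cached = None
        self._expires = 0.0

    def resolve(self, now):
        if self._cached is None or now >= self._expires:
            self._cached = self.zone.query()
            self._expires = now + self.zone.ttl
        return self._cached
```

If two resolvers query a two-address zone with a 300-second TTL, each gets a different server. If the first server is then removed from the zone, the resolver that cached it keeps returning the dead address until its TTL runs out, which is the trade-off behind point 1: a shorter TTL shrinks that window but sends more queries to the DNS server.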

BTW, http://www.mydomain.com/help/mydomainplus/tut-robin is an ok article on this subject.

dandanfirema
01-24-2003, 10:50 AM
Round-robin is far from perfect and I have seen some very different results when a site was balanced across 3 or 4 servers. But it is cheap...

New Exposure
01-24-2003, 11:58 AM
Quoting Netbridge:
--------------------------------------------------
How are you going to run round robin on a server holding a forum? Will you not get 2 different versions as what I post here will not be on the other server so only half the people see it?
--------------------------------------------------

Well... he could always set up a cron job and rsync the files every hour or so via ssh. The server overhead shouldn't be too bad, as it will only transfer the differences between the files being rsync'd.
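A rough sketch of what such a cron-driven sync script could look like, assuming key-based ssh login to the second server is already set up. The function names, hosts, and paths are hypothetical; the rsync flags used (`-a`, `-z`, `--delete`, `-e ssh`) are standard ones.

```python
import subprocess


def build_rsync_command(source_dir, remote_host, remote_dir):
    """Build an rsync-over-ssh command that mirrors source_dir to the remote server."""
    return [
        "rsync",
        "-az",        # archive mode, compress data in transit
        "--delete",   # remove files on the mirror that were deleted locally
        "-e", "ssh",  # use ssh as the transport
        source_dir.rstrip("/") + "/",  # trailing slash: copy the directory's contents
        f"{remote_host}:{remote_dir}",
    ]


def sync(source_dir, remote_host, remote_dir):
    """Run the sync; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(source_dir, remote_host, remote_dir), check=True)
```

A script that calls `sync(...)` could then be run from an hourly crontab entry. Note that this only keeps files in step; anything stored in the shared database server is already visible to both web servers.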

rigor
01-25-2003, 01:31 PM
You could simply place a small Squid reverse proxy server inline and round-robin it. Squid does a great job at reverse proxying, and you can set the TTL very low. We used to do this to abusive customers to offload server farms: just redirect the traffic to the proxy server, passing non-static content through using its great rulesets.

It takes about 30 minutes to set up a Linux box with Squid as a reverse proxy from start to finish. Then you can use various methods to alter its hit rate, from a 100% proxy head end to round-robin DNS, content-type caching, etc. The proxy could be located in a totally different datacenter as well, perhaps one that gives more favorable bandwidth rates (or unlimited, lol).
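To make the static versus non-static split concrete, here is a small Python sketch of the kind of rule a caching proxy applies. It is not Squid configuration: `STATIC_EXTENSIONS` and `is_cacheable` are hypothetical names, and the extension list is only an assumption about what a site might treat as static.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list of extensions treated as static (safe to cache at the proxy).
STATIC_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js"}


def is_cacheable(url):
    """True if the URL looks like static content with no query string."""
    parts = urlsplit(url)
    if parts.query:
        # Query strings usually mean generated pages, so pass them through.
        return False
    extension = posixpath.splitext(parts.path)[1].lower()
    return extension in STATIC_EXTENSIONS
```

Requests for which `is_cacheable` is true could be answered from the proxy's cache, while everything else (for example `.asp` pages) would be passed through to the origin web server.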

iguanalove
01-28-2003, 01:00 PM
Thanks all for some very useful responses. I now have three viable options for load-balancing, two for doing it 'on the cheap' (external server to host specific files, such as images; and round-robin DNS), in addition to the 'standard' and expensive option of 'true' load-balancing (LB hw/sw and additional Web server at expensive host site).

BTW, the site is in fact an ASP-based site using MS Site Server Commerce Edition 3.0 on the Web server and a SQL 2000 DB server (both dedicated). Site changes frequently, little of the content is actually 'static' per se.

Thanks especially to the WeManageServers.com staff for the links to other info sources on the subject, in addition to the people from AlphaOmegaHosting, Vulnerabilities.org, and New Exposure.