Web Hosting Talk

View Full Version : Looking for geo distributed host - cost is no issue


rockstar82
01-25-2010, 07:13 PM
Hi there, I have a website that gets visitors from the US, Asia and Europe. It's a fairly simple website on Drupal, with no video etc.

I do do a lot of marketing in these different countries, and seeing as load time has a large influence on performance I would like to look at a geographically distributed hosting provider.

The site gets something like 50K visitors a month, so it's very small. However, cost is not an issue. I really want the network with the best distribution, lowest load times and fastest servers, with a reputation suitable for a Fortune 500 company. Secondly, all data needs to be completely secure (frequent backups, firewalls against attacks or whatever it's called) and there needs to be a 100% uptime guarantee.

What would you recommend?

DedicatedXL
01-25-2010, 09:49 PM
I would get a dedicated server or VPS in Europe, Asia and the US yourself.

There are several hosting companies which can at least set you up within two of the three regions, however it would be hard I guess to find one who services all three areas. We only do Europe and the US for example.

rockstar82
01-26-2010, 07:39 AM
So US and Asia would be fine, Europe has pretty good connections anyway.

I am no IT person, so perhaps you can explain if having two servers could still mean only one version of the site (one place to upload and change it).

Do you know a company that is as described, and is full service (ie I want to talk to a real person when we have problems)?

cristibighea
01-26-2010, 07:47 AM
You're looking at paying quite a bit of cash for this sort of setup. You can keep uploading to only one server and replicate all the changes to the other servers, keeping files in sync with rsync and your MySQL database in sync with replication. You could then redirect users based on their IP address using geodns. This is not 100% accurate, but it will get the job done without a huge cost.
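As a rough illustration of the rsync half of that idea, here is a minimal Python sketch that builds one rsync command per replica. The host names, the `deploy` user and the paths are made-up placeholders, not anything from this thread. The function only builds the command lists, so nothing is copied until you run them yourself.

```python
# Hypothetical replica hosts; replace with your own servers.
REPLICAS = ["eu.example.com", "asia.example.com"]


def rsync_command(source_dir, host, dest_dir, user="deploy"):
    """Build an rsync command that mirrors source_dir to dest_dir on host."""
    # -a preserves permissions and timestamps, -z compresses data in transit,
    # --delete removes files on the replica that no longer exist on the source.
    # The trailing slash makes rsync copy the directory contents, not the directory itself.
    src = source_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", src, f"{user}@{host}:{dest_dir}"]


def sync_plan(source_dir, dest_dir):
    """Return one rsync command per replica, in the order listed in REPLICAS."""
    return [rsync_command(source_dir, host, dest_dir) for host in REPLICAS]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` from a cron job or a deploy script. The database side (MySQL replication) is configured on the database servers themselves and is not covered by this sketch.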

Based on your description, I don't think you actually need geographical distribution, since the speed will be decent from all over the world considering you have no heavy media to serve. You could accelerate some simple things (your CSS and JS files, and maybe images) through a content delivery network and just have a single server somewhere in the US, in a redundant setup. The complexity of keeping two servers in sync is gone in this case, as is the need for geodns.

rockstar82
01-26-2010, 08:12 AM
Ok, thanks for the replies.

After doing a little research on CDN, I think that might just be what we need (but deliver the whole site through CDN, not just JS etc). Is there a company that provides both the seed server and the CDN and manages the whole thing?

Like I said, cost is not really an issue.

eming
01-26-2010, 08:18 AM
"Ok, thanks for the replies.

"After doing a little research on CDN, I think that might just be what we need (but deliver the whole site through CDN, not just JS etc). Is there a company that provides both the seed server and the CDN and manages the whole thing?

"Like I said, cost is not really an issue."

Something like this: http://ditlev.dk/snitch/Dock-20100126-121704.jpg should give you a healthy mix of redundancy, scalability and flexibility.
Most cloud suppliers can do this, and some are able to add a CDN into the mix, typically via reseller arrangements with Highwinds/Edgecast/Limelight/Akamai.

:)
D

phil_virtualbright
01-26-2010, 08:45 AM
Hi RockStar,

It is brave to say that cost is not an issue! There are people who will consult on this for you and get a solution in place. Would that work for you? It is not easy to do, but it is possible. Another technology to look at is BGP anycast, among others.

rockstar82
01-26-2010, 08:54 AM
Working with an intermediary that already has blue chip clients is fine, as long as that company then also bills us every month/year and is responsible for its implementation. I don't want to pay just for advice on what companies to work with, that's what I have you guys for ;).

cristibighea
01-26-2010, 09:07 AM
CDNs are mostly used for static content, as it is extremely difficult to replicate your database across hundreds of nodes worldwide in order to achieve the low-latency environment that a CDN usually offers for streaming video.

Another pointer, since I failed to mention it earlier: firewalls won't stop a denial of service attack against your website, so you will have to make sure that your host of choice is able to block these attacks before they get to your server. We're using a TopLayer solution for this, but there are solutions from Cisco as well as RioRey.

A guarantee of uptime usually means nothing to people that need 99.999% uptime or better. Getting a small amount of money back for an hour of downtime will not even scratch the surface of the revenue you could lose during that hour, so I'd advise going with someone who has a proven track record instead of just a guarantee.
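To put those uptime percentages in perspective, here is a small Python sketch of the arithmetic. It assumes a 365-day year and nothing else. At 99.999%, the allowance works out to only a little over five minutes of downtime per year, which is why a single hour-long outage breaks that target many times over.

```python
def allowed_downtime_minutes(uptime_percent, days=365):
    """Minutes of downtime per period that still meet the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def describe(uptime_percent):
    """Format the yearly downtime allowance as a short sentence."""
    minutes = allowed_downtime_minutes(uptime_percent)
    return f"{uptime_percent}% uptime allows about {minutes:.1f} minutes of downtime per year"
```

Calling `describe(99.9)`, `describe(99.99)` and `describe(99.999)` side by side shows how each extra nine cuts the allowance by a factor of ten.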

rockstar82
01-26-2010, 12:55 PM
"CDNs are mostly used for static content, as it is extremely difficult to replicate your database across hundreds of nodes worldwide in order to achieve the low-latency environment that a CDN usually offers for streaming video."

Ok, then back to the one server in US, one in EU and one in Asia idea. Is there a provider that could offer such a service that is automatically synched?

phil_virtualbright
01-26-2010, 01:00 PM
That is only one side of the coin, the other side is how do you make sure US visitors go to the US server, EU to the EU, Asian to the Asia one. Do you want different URLs like usa.domain.com eu.domain.com asia.domain.com or do you need more sophistication? Geo detection is hard to do 100% accurately, how much trouble will it cause if a small percentage of visitors go to the wrong physical server?

claptic
01-26-2010, 02:28 PM
CDNs (content delivery networks) usually deliver audio and video media. But it sounded like you were looking for just web delivery.

rockstar82
01-26-2010, 05:03 PM
"That is only one side of the coin, the other side is how do you make sure US visitors go to the US server, EU to the EU, Asian to the Asia one. Do you want different URLs like usa.domain.com eu.domain.com asia.domain.com or do you need more sophistication? Geo detection is hard to do 100% accurately, how much trouble will it cause if a small percentage of visitors go to the wrong physical server?"

Actually, we have domains like domain.com/region, would that also work?

Any ideas on who to talk to? Would a company like rackspace be able to help?

phil_virtualbright
01-27-2010, 05:13 AM
That would work, yes. It makes it much easier in fact. I don't think RackSpace will be able to help much, they would probably want to charge far too much.

rockstar82
01-27-2010, 06:05 AM
OK, so if it's much easier, any practical advice? I would like to work with a company *like* rackspace, they seem to have lots of blue chip customers... Thanks much for all the advice everyone!

phil_virtualbright
01-27-2010, 06:52 AM
Practical advice would be to get your requirements down and make contact with a few hosting companies. We are not allowed to self-promote on here so you will actually need to email or talk to them directly.

lockbull
01-27-2010, 11:30 PM
"It is brave to say that cost is not an issue! There are people who will consult on this for you and get a solution in place. Would that work for you? It is not easy to do, but it is possible. Another technology to look at is BGP anycast, among others."

"That would work, yes. It makes it much easier in fact. I don't think RackSpace will be able to help much, they would probably want to charge far too much."

Any company doing BGP based GSLB is going to be way more expensive than Rackspace--there are very few companies of the size, with the requisite technical knowledge and a single integrated ASN for all desired locations, and they're not going to be cheap.

lockbull
01-27-2010, 11:47 PM
How often are updates being done to your database, and how critical is it that viewers see the absolute latest content? You can certainly cache some dynamically generated pages with a CDN (who generally call this "whole site delivery", as opposed to video or just static elements like CSS or image files), especially those that don't involve any personalization on a per user basis, using cache control directives (max-age, last-modified, etc.) in the HTTP headers. Say for example your homepage doesn't need to reflect updates within a second or whatever, and you can set your max-age value to 5-15 minutes. That can be easily cached by the CDN at the edge, and aside from the speed advantage, whole site delivery also makes your site much more resilient to high traffic loads and origin server outages. Even highly dynamic pages generally have some cacheable elements, which can be cached by some CDNs using what's called ESI (Edge Side Includes).
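To make the cache control idea concrete, here is a minimal Python sketch of a policy that picks a `Cache-Control` header per response. The policy itself (which file extensions count as static, a one-day lifetime for assets, ten minutes for anonymous pages) is an assumption made up for illustration. `max-age`, `public`, `private` and `no-store` are standard HTTP directives, but check with your CDN how it treats each of them.

```python
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif")


def cache_headers(path, personalized=False, max_age_seconds=600):
    """Pick Cache-Control headers for a response (hypothetical policy)."""
    if personalized:
        # Per-user pages must not be stored by a shared cache such as a CDN edge.
        return {"Cache-Control": "private, no-store"}
    if path.endswith(STATIC_EXTENSIONS):
        # Static assets change rarely, so let caches keep them for a day.
        return {"Cache-Control": "public, max-age=86400"}
    # Anonymous dynamic pages: let the edge serve a cached copy for a few minutes.
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}
```

With the default of 600 seconds, the homepage would be refetched from the origin roughly once every ten minutes per edge location, no matter how many visitors request it in between.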

The easiest CDN implementation is going to be a simple CNAME/reverse-proxy setup, which is what many CDNs (though not all) use. Generally with most CDNs, you're on your own with regards to maintaining your origin servers, though many hosting companies now resell CDN services. I know Internap does managed servers at a few of their locations, primarily for their CDN customers, so you might check them out. Rackspace resells the Limelight CDN, so they are someone you should talk to. I would probably look at a redundant origin server setup in a single datacenter first, and then look into doing replication to two different datacenters at some point in the future if your needs warrant it.

phil_virtualbright
01-28-2010, 04:43 AM
"Any company doing BGP based GSLB is going to be way more expensive than Rackspace--there are very few companies of the size, with the requisite technical knowledge and a single integrated ASN for all desired locations, and they're not going to be cheap."

I wasn't saying that it had to be done by anycast, just a suggestion of one of the technologies to look at. For most situations it isn't the right technology to use, there are more effective and cost-efficient alternatives.

network82
01-28-2010, 01:58 PM
I'm not aware of any provider that does this, but I have just finished a project that is fairly similar: a SaaS-based business automation system that distributes content (any files/video/docs) from a CDN nearest to the end user's geo location.

The CDN replication isn't that complex, but I did have to develop my own DNS forwarder application that forwarded DNS based on IP geolocation, so everything could be done under a generic URL regardless of where the content/server was coming from.

rockstar82
01-28-2010, 07:36 PM
Haha, guys you lost me at 'BGP based GSLB'.

Can you explain what you think would be the best solution like I'm a three year old? ;)

itpnet
03-06-2010, 04:11 PM
Get 2 servers in the following areas: USA West, USA East, Europe and possibly Asia/Australia depending on where your clients come from.

Then you need to be able to send visitors to the right server depending on their location. For this, I would install a PowerDNS server with IP geolocation on each server.

Some more info on a setup like this: http://wiki.blitzed.org/DNS_balancing
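The core of the geo DNS idea can be sketched in a few lines of Python. This is not how PowerDNS is configured. It only illustrates the lookup a geo-aware DNS server performs when it decides which address to hand out. The network ranges (reserved documentation ranges), the region names and the server addresses are all made-up placeholders. A real setup would use a GeoIP database instead of a hand-written table.

```python
import ipaddress

# Hypothetical mapping from client networks to regions (documentation ranges only).
REGION_NETWORKS = {
    "us": [ipaddress.ip_network("192.0.2.0/24")],
    "eu": [ipaddress.ip_network("198.51.100.0/24")],
    "asia": [ipaddress.ip_network("203.0.113.0/24")],
}

# Hypothetical server address per region.
REGION_SERVERS = {"us": "10.0.0.1", "eu": "10.0.1.1", "asia": "10.0.2.1"}

# Region used when the client address matches no known network.
DEFAULT_REGION = "us"


def region_for(client_ip):
    """Return the region whose networks contain client_ip, or the default region."""
    addr = ipaddress.ip_address(client_ip)
    for region, networks in REGION_NETWORKS.items():
        if any(addr in net for net in networks):
            return region
    return DEFAULT_REGION


def answer_for(client_ip):
    """Return the server address a geo-aware DNS server would answer with."""
    return REGION_SERVERS[region_for(client_ip)]
```

The fallback to a default region is where the "not 100% accurate" caveat shows up: any visitor whose address is unknown or misclassified still gets a working server, just not necessarily the closest one.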

plumsauce
03-06-2010, 04:41 PM
"You're looking at paying quite a bit of cash for this sort of setup. You can keep uploading to only one server and replicate all the changes to the other servers, keeping files in sync with rsync and your MySQL database in sync with replication. You could then redirect users based on their IP address using geodns. This is not 100% accurate, but it will get the job done without a huge cost."


Some people would differ with you on this one with respect to costs.

I know of a similar operation who have a couple of budget vps's at two data centers. They use geodns+failover to flip back and forth between them.

Even after a suggestion that he use his budget to consolidate onto a single higher end vps, he still went with the multiple vps + geodns setup.

He's happy as a clam.
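The geodns+failover behaviour described above boils down to a simple rule: answer with the visitor's preferred server if it is up, otherwise with any other server that is. A minimal Python sketch of that rule follows. The server names are hypothetical, and how health is determined (ping, HTTP check, and so on) is left out entirely.

```python
def pick_server(preferred, servers, healthy):
    """Return the preferred server if healthy, else the first healthy one, else None."""
    if preferred in healthy:
        return preferred
    for server in servers:
        if server in healthy:
            return server
    # Nothing is up; the caller has to decide what to answer in that case.
    return None
```

For example, `pick_server("vps-a", ["vps-a", "vps-b"], {"vps-b"})` falls back to `"vps-b"` while `vps-a` is down, and switches back once `vps-a` reappears in the healthy set.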

wheimeng
04-14-2010, 02:04 AM
I think Cristi has made it pretty clear that a CDN can speed up images/CSS/HTML/JS, but the database is a tough one, and syncing data across continents is going to be tough.

Your best bet right now would probably be to get a CDN and a redundant pair for your databases.

network82
04-14-2010, 05:43 AM
I use MSSQL2008 and have been playing around with SQL P2P replication.
It seems to work well, but will only replicate in one direction: you still have to write data to the primary SQL Server and wait a couple of seconds (depending on how large the dataset is) for it to replicate to the SQL Server in your required location.

These types of technologies really put you to the test on how good of a developer you are. But any decent developer used to web2.0 technologies shouldn't have a problem.

For something like geo-distribution, you can easily get away with the time-to-live data lag between replications. There are other ways to replicate your database without using complex technologies like that, such as simply repeating data writes to x number of database servers, but you're basically shifting the processing load to the front end rather than the backend infrastructure.
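The "repeat the write to every database server" approach can be sketched as follows. This Python example uses in-memory SQLite databases purely as stand-ins for the remote servers, and the `notes` table is made up for illustration. It is not MSSQL code. It shows the shape of the idea and its main weakness: the application now has to deal with servers where the write failed.

```python
import sqlite3


def open_replicas(count):
    """Open `count` in-memory SQLite databases standing in for remote servers."""
    conns = []
    for _ in range(count):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        conns.append(conn)
    return conns


def write_everywhere(conns, body):
    """Run the same INSERT on every connection and return the indexes that failed."""
    failed = []
    for index, conn in enumerate(conns):
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        except sqlite3.Error:
            failed.append(index)
    return failed
```

Any index returned in the failure list is a server that is now out of step with the others, so a real implementation would need a retry queue or a repair job on top of this.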

The primary issue for us is file syncing. We had initially played around with Samba (smb://) replication, which seemed to work but is a little slow. We have since started testing Windows 2008 R2 DFS, which is designed for multi-site file access, and run our own code on top to initiate file synchronisation every time a specific file structure changes, rather than every time the file is requested. I'll see what happens with that and post my findings.
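One simple way to detect that "a specific file structure changed" is to compare snapshots of file hashes. The Python sketch below is an assumption about how such a trigger could work, not the code described above, and it has nothing to do with DFS itself. It hashes every file under a directory and reports which paths differ between two snapshots.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_paths(before, after):
    """Return the sorted paths that were added, removed, or modified."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A sync job would take a snapshot after each run and only start a transfer when `changed_paths` returns a non-empty list. Hashing every file is slow on large trees, so this suits sites with few, small files.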

Depending on the type of content though, with minimal file changes, you're better off using something like Amazon S3 for file storage.

mattdahack
04-26-2010, 02:49 PM
I don't think you need CDN either. Rsync is good for most ppl.

plumsauce
05-25-2010, 09:38 PM
"I use MSSQL2008 and have been playing around with SQL P2P replication.
It seems to work well, but will only replicate in one direction: you still have to write data to the primary SQL Server and wait a couple of seconds (depending on how large the dataset is) for it to replicate to the SQL Server in your required location."


I suggest you look at the replication docs again, because it is definitely possible to use transaction-based replication that is not uni-directional.


"These types of technologies really put you to the test on how good of a developer you are. But any decent developer used to web2.0 technologies shouldn't have a problem."


No, it is the members of the brotherhood of ancient coders of the realm who have a complete command of moving data around in sync. They work at places like telcos, banks, brokerage houses, post offices, casinos ... any place where the data has a real dollars and cents value.

ibelledthecat
05-27-2010, 08:32 AM
I think what you are looking for is something similar to this : http://www.nedproductions.biz/wiki/setting-up-a-content-caching-server

The link above is a tutorial but the guy who wrote it also offers the service. If you want it on a larger scale, you should contact him as he should be able to set it up for you.

tronkle
05-28-2010, 06:32 AM
"I think what you are looking for is something similar to this : http://www.nedproductions.biz/wiki/setting-up-a-content-caching-server

"The link above is a tutorial but the guy who wrote it also offers the service. If you want it on a larger scale, you should contact him as he should be able to set it up for you."

Varnish is a "high-performance HTTP accelerator" or in other words a "reverse proxy". It can be used to implement a pseudo-geo-distributed website by running multiple instances in several locations.

While this is useful in some situations it's not really geo-distributed because there are many situations in which the data is requested each time from the remote backend (eg: authenticated sessions). In these situations a reverse proxy is actually a supplemental layer which actually slows things down. In these situations it would be faster to request the data directly from the remote server which eliminates the "geo-distributedness".

Another case when you might have problems is when the page is generated using data from a db. In this situation you either have the risk of serving stale data or you check freshness for each request which eliminates the latency advantage that a real geo-distributed service would have.
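The stale-data trade-off can be seen in a tiny time-based cache. This Python sketch is not Varnish and does not use its configuration language. It is a generic illustration with a hypothetical `fetch` function standing in for the request to the remote backend. Within the time-to-live the cache answers locally and may return stale data. Once the entry expires, it has to go back to the backend and pay the latency.

```python
import time


class TTLCache:
    """Cache that serves a stored value until its time-to-live expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # hypothetical call to the remote backend
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            # Fast local answer, but possibly older than the backend's current data.
            return entry[1]
        # Expired or missing: one slow round trip to the backend.
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

A longer time-to-live means fewer trips to the backend and a larger window in which visitors can see outdated pages. A time-to-live of zero removes the staleness and, with it, the latency benefit.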

On the other hand the services that we offer use a complete replica of the website and database on each server. Data changes are pushed to the end servers. This way you don't ever talk to a remote server when the client requests data and the data is also assured to be fresh. As a good side effect this approach also spreads the load across different servers and one of the good things about this is that you're less vulnerable to single-server attacks/overload/failure.

The bottom line is that software like Varnish has usage patterns that it's good at. But if you want something really geo-distributed, Varnish and the like are not enough.