
View Full Version : Suggest server specs for a large web site?
tennisguy 06-05-2001, 05:59 PM Hi there,
What are the optimal specs for a high-traffic web site? I know there are numerous configurations and I have boiled it down to several key questions.
Our site gets 50,000 visitors a day (web search engine and directory service) and is expected to reach more than 1 million visitors a month in the next quarter. So scalability is extremely important.
Each pageview involves heavy MySQL database selects and updates. Monthly pageviews are higher than 10 million.
Here are the key questions:
(1) Is it better performance-wise to have a HW RAID 1 setup with 2 SCSI hard drives, or a HW RAID 5 setup with 3 SCSI hard drives?
I have heard that RAID 1 gives you data redundancy, so that if one hard drive fails, the other hard drive protects the data - is this also true for RAID 5?
If this is true, then the key is performance. I know that RAID 5 gives fast database reads, but writing to the hard disks is slower. Does the increased read speed usually offset the slower write speed for RAID 5?
Also, what are the pros and cons of having more than 3 SCSI disks (i.e. 5) for RAID 5?
(2) I have the choice of getting a dual P3 800 MHz or a dual P3 1 GHz for $85 more per month. How much increased performance will I see with the dual 1 GHz? Is it in your opinion worth the $85 more per month?
(3) Is it better to initially have only one server with both the database and the scripts on it or is it better to have a dedicated database server and a pure web server with scripts? Performance-wise, what are the differences? Pricing-wise, the dedicated database server costs about $400 more.
Your advice is greatly appreciated!
Thanks!
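On question (1), a common rule of thumb (not a benchmark) is to count back-end disk operations per write: a mirror writes each block twice, while a small RAID 5 write reads the old data and old parity and then writes both back, so it costs about four. The sketch below applies that rule. The per-disk IOPS figure and the 30% write mix are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Rule-of-thumb write penalties: back-end disk I/Os per front-end write.
# RAID 1 writes to both mirrors; a small RAID 5 write reads old data and
# old parity, then writes new data and new parity.
WRITE_PENALTY = {"raid1": 2, "raid5": 4}


def effective_iops(level, disks, iops_per_disk, write_fraction):
    """Estimate the front-end IOPS an array can sustain for a read/write mix.

    Assumes reads can be spread across every disk in the array.
    """
    raw = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[level])


if __name__ == "__main__":
    # 100 IOPS per disk and a 30% write mix are assumed, not measured.
    for level, disks in (("raid1", 2), ("raid5", 3), ("raid5", 5)):
        estimate = effective_iops(level, disks, 100, 0.3)
        print(f"{level} with {disks} disks: about {estimate:.0f} IOPS")
```

Under these assumptions a 3-disk RAID 5 and a 2-disk RAID 1 land close together, and the extra spindles in a 5-disk RAID 5 are what pull it ahead. A more write-heavy mix shifts the balance toward RAID 1.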
Fremont Servers 06-05-2001, 07:03 PM RAID 1 is a simple mirror. So with two 40 GB drives you get 40 GB of space. If one drive fails, the other keeps working while a replacement is installed; then all the data needs to be copied back onto the new drive.
With RAID 1 you have to add 2 drives at a time and they will be separate RAIDs. I.e. you have your original RAID of two 40 GB drives. You can't add to that RAID; it is fixed. You can add 2 more 40 GB drives and have two 40 GB RAIDs, but they are separate. In Linux terms they will be mounted in different places.
With RAID 5 you have to start with 3 drives. You can add more later and add them to the RAID to make it bigger. Also, if you have 4 drives you can use 3 of them for the RAID and one as a hot spare to be used automatically if a drive fails. This brings array downtime close to none. Even if a second drive fails later, once the hot spare has finished rebuilding in place of the first, the array keeps running until the failed drives are replaced. (Two drives failing at the same time will still take the array down.)
For (2), it is hard to tell the difference between dual 800 MHz and dual 1000 MHz. Maybe dual 1000 MHz would make you feel stronger (the more the better, right?). Maybe someone can disagree with me on this. :D
To me, it is not worth paying an extra $85 per month. It is up to you, your money.
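As a rough way to size the gain: going from 800 MHz to 1000 MHz is a 25% clock increase, and that is only an upper bound, since just the CPU-bound part of each request speeds up. The sketch below uses an Amdahl-style estimate. The CPU-bound fractions tried are assumptions, not measurements of this site, and the function name is hypothetical.

```python
def overall_speedup(cpu_fraction, clock_ratio):
    """Amdahl-style estimate: only the CPU-bound share of a request gets faster.

    cpu_fraction: share of request time spent on the CPU (0.0 to 1.0).
    clock_ratio:  new clock speed divided by old clock speed.
    """
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / clock_ratio)


if __name__ == "__main__":
    clock_ratio = 1000 / 800  # dual 1 GHz versus dual 800 MHz
    # The CPU-bound fractions below are assumed for illustration only.
    for cpu_fraction in (0.3, 0.6, 1.0):
        gain = (overall_speedup(cpu_fraction, clock_ratio) - 1.0) * 100
        print(f"{cpu_fraction:.0%} CPU-bound: about {gain:.0f}% faster")
```

If most of the time per pageview goes to disk-bound MySQL work, the real gain will sit well below 25%.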
For (3), if you have the money, having one server specialize in web hosting and the other specialize in the database is not a bad idea. Too bad I am not as rich as you, so my only option is to have everything on one server. ( I know you want to show people that you are rich, and try to put poor people like me down:bawling: )
- Asia
tennisguy 06-05-2001, 10:17 PM Hi Asia,
I assure you I am not trying to show off. The reason I am even looking for a new host is because Rackspace is getting very expensive for me! LOL
Based on your reply, it seems that a RAID 5 is the better option. So with 3 drives on a RAID 5, do you lose data if one drive fails?
Also, with the dual 800 vs. dual 1000, the reason is that the cost to go to 2 servers for me at Cybercon is going to be 780 per month for a load balancer, so I need to keep to one server as long as I can. I just wonder whether dual 1000 is that much more powerful than dual 800.
Thanks!
Fremont Servers 06-05-2001, 11:05 PM I was just kidding about the show-off thingy. :D
If you have more money, you might want to have (1) for HTTP, (1) for database, (1) for mail server, etc.
[ I learned this in my economics class. This is called comparative advantage. Each server is like a country. One country (server 1) does best in mail, another country (server 2) does best in HTTP, and another country (server 3) does best in database. Each can specialize in what it does best and trade. ]
If a drive fails on RAID 5, the RAID will keep working, and when you replace the drive that failed, it will rebuild the RAID array to bring it back to optimal condition.
With RAID 5 you do lose some available space on the drives due to the parity information.
So if you have (5) 40GB hard drives your RAID drive will be 160GB (no, my math is not wrong; you lose one drive's worth of total space).
RAID 5 requires at least 3 drives and is not a good fit for IDE drives (it works on the IDE bus, but the bus doesn't support it very well).
RackMy.com 06-05-2001, 11:46 PM If you go with RAID 5, make sure you have a hot spare ready to go. RAID 5 is not totally reliable, but pretty darn close. I was talking to a tech last week who said he had 2 drives fail in a RAID 5 (yes, anything is possible) and it wiped out the array.
If you have a hot spare, you can tell the controller to start the rebuild as soon as one drive fails. The rebuild process can take a while depending on the controller and drive size so you want to get a head start as soon as possible.
We run hot spares on all our RAID 5 systems.
I would always spend the money to get more memory instead of CPU. The difference between 800 MHz and 1 GHz is probably not worth it unless you need the extra CPU power.
On your database question, are you looking at Microsoft solutions or Linux/Unix solutions?
If you are looking at Microsoft solutions, keep them on separate servers. Microsoft products are memory/CPU hogs.
Good luck with your systems.
I'm replying to both of your posts.
Right now you have 50,000 visitors a day and you expect to hit 1 million visitors a month in 3 months' time. But the figures don't tally: 50,000 a day = 1.5 million a month. Anyway, I assume that you expect high growth. I can also see that you are trying to save some money by having just 1 server. Thus, you should go and grab the dual 1 GHz, since you will need it. Give it as much RAM as you can as well, e.g. 1 GB. If you can afford it, get 1 server for web serving and another for the database; this config definitely gets better performance.
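The traffic arithmetic above can be checked in a few lines. This is only a sketch: the 30-day month is an assumption, and the per-second figure is an average, so real peaks will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_from_daily(daily_count):
    """Scale a daily figure up to a month."""
    return daily_count * DAYS_PER_MONTH


def average_per_second(monthly_count):
    """Average rate per second implied by a monthly total."""
    return monthly_count / (DAYS_PER_MONTH * SECONDS_PER_DAY)


if __name__ == "__main__":
    visitors = monthly_from_daily(50_000)
    print(f"50,000 visitors a day is {visitors:,} a month")
    rate = average_per_second(10_000_000)
    print(f"10 million pageviews a month averages {rate:.2f} per second")
```

Since each pageview involves MySQL selects and updates, the average pageview rate is also a floor for the average query rate the database has to handle.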
Cybercon charges bandwidth by the block, and it is definitely a waste if you purchase a 320GB block and utilise only 100GB. So a provider that charges you extra per GB will be ideal for your site. By the way, how many GB are you currently using?
Regarding Rackspace support, I'm very sure they are one of the BEST in the market and it is definitely hard to find a host that provides better. But of course, there are a lot of hosts who care and offer really good support.
Is there any reason for choosing Cybercon, or will any good host be fine for you? If any will be fine, I would suggest you take a look at other hosts, talk to them about your needs and let them customise a plan for you. Since you are consuming so much bandwidth, talk to them about buying it in blocks of 50GB. They could offer much cheaper pricing.
Send www.venturesonline.com an email and tell them your needs. I know they are in a Verio data center and have dual 1 GHz in their plans. I think they offer something like 200GB included in the plan, and additional is charged at $3/GB. But talk to them about buying in bulk, as most hosts are pretty flexible. You could also check out the many recommendable hosts here that offer great support. Some of them are:
wizardshosting.com
pwebtech.com
venturesonline.com
site5.com
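To compare block pricing with per-GB pricing, it helps to plug in your own usage. The sketch below uses only the figures quoted above (a 320GB block, 200GB included, $3 per extra GB). Base plan prices are not known here, so they are left out, and the function names are hypothetical.

```python
def unused_in_block(used_gb, block_gb):
    """GB paid for but not used when bandwidth is bought as a fixed block."""
    return max(0, block_gb - used_gb)


def overage_charge(used_gb, included_gb=200, per_gb=3.0):
    """Extra monthly charge when usage beyond the included GB is billed per GB."""
    return max(0, used_gb - included_gb) * per_gb


if __name__ == "__main__":
    print("Unused GB in a 320GB block at 100GB usage:", unused_in_block(100, 320))
    for used in (100, 250, 320):
        print(f"{used}GB used: ${overage_charge(used):.2f} overage")
```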
Madman2020 06-06-2001, 10:33 AM Originally posted by Asia
So if you have (5) 40GB hard drives your RAID drive will be 160GB ( no, my math is not wrong. You lose one drive of total space).
I believe the way it works, if you have five 36GB drives with one set aside as a hot spare, is that you have:
1 drive's worth of space for parity
1 drive as a hot spare
3 drives' worth of space for data, totalling 108GB (3x36GB)
Technically, if one drops off, it will still need to rebuild onto the spare, I think.
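The capacity figures in this thread follow one rule: RAID 1 gives you the size of one drive, and RAID 5 gives you the member drives minus one, with hot spares not counted as members. A minimal sketch, with hypothetical function names:

```python
def raid1_usable_gb(drive_gb):
    """A two-drive mirror stores one drive's worth of data."""
    return drive_gb


def raid5_usable_gb(drives, drive_gb, hot_spares=0):
    """RAID 5 loses one drive's worth of space to parity.

    Hot spares sit idle until a failure, so they add no capacity.
    """
    members = drives - hot_spares
    if members < 3:
        raise ValueError("RAID 5 needs at least 3 member drives")
    return (members - 1) * drive_gb


if __name__ == "__main__":
    print("2 x 40GB RAID 1:", raid1_usable_gb(40), "GB")
    print("3 x 40GB RAID 5:", raid5_usable_gb(3, 40), "GB")
    print("5 x 40GB RAID 5:", raid5_usable_gb(5, 40), "GB")
    print("5 x 36GB RAID 5, 1 hot spare:", raid5_usable_gb(5, 36, hot_spares=1), "GB")
```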
Racin' Rob 06-06-2001, 10:48 AM Raid 5 - Stripe set with parity
You must have a minimum of 3 physical disks (up to 32)
The partition used from each disk must be equal in size to the rest. So if you have 3 x 40GB HDD, you can have one partition of 40GB on each disk.
When data is written in this manner, for simplicity's sake, it writes one bit on one disk, one bit on the next disk, and a parity bit on the third disk (the XOR, i.e. the sum modulo 2, of the bits written to the first two disks). The parity bit can be written to any of the 3 disks for that particular set of data.
If one disk crashes, you replace it and the RAID controller can simply rebuild its data using the data from the other two disks. It can calculate the data that was lost by XORing the surviving bit value with the parity value.
Using only 3 disks in your stripe set, you will have used 120GB of storage space, but only have 80GB of storage available. The parity bits take up the space of one HDD.
If you have more disks available, you will still only lose the storage space of one HDD.
However, if you lose two of the physical disks from the stripe set, you will lose all that data. Thus backups are still very important. RAID 5 is no substitute for a good backup plan.
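The parity idea can be tried out on a few bytes. This is a toy sketch of XOR parity only, not how any particular controller lays out its stripes; the block contents and names are made up for illustration.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Two data blocks and their parity, as on a 3-disk stripe (toy data).
    disk_a = b"MYSQL"
    disk_b = b"TABLE"
    parity = xor_blocks(disk_a, disk_b)

    # Pretend disk_b has failed: rebuild it from the survivor and the parity.
    rebuilt_b = xor_blocks(disk_a, parity)
    print("rebuilt matches original:", rebuilt_b == disk_b)
```

With two blocks missing there is nothing left to XOR against, which is why losing two disks loses the data.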
Aloha,
Tennis guy,
for a good explanation of RAID, try here:
http://www.adaptec.com/worldwide/product/markeditorial.html?cat=%2fTechnology%2fRAID&prodkey=quick_explanation_of_raid&type=Technology
or go to http://www.adaptec.com and choose RAID.
Also try here (there is a lot of reading):
http://www.dell.com/us/en/esg/topics/products_resource_pedge_000_rc_techmain.htm
but it can give you some good thoughts.
It's geared toward their product, but it's still good info you can apply to anything.
Best of luck. What is the URL of the site you have?
freakysid 06-06-2001, 09:00 PM This has been an informative thread. I have a free email box at mailandnews.com and earlier in the year the service went down for about a week (could have been longer). I wish I had kept the notice they put up on their home page about it. Essentially, it would seem that something had gone wrong with one or more of the HDDs in their array, because the notice said something like "... RAID is restoring itself, we don't know how long this will take but we are guessing that maybe we will be back online by next Wednesday". Eeek! What was that all about? If it takes over a week for RAID to rebuild/remirror then that's not a good recovery plan!
Racin' Rob 06-06-2001, 09:21 PM If the partition size is very large and the server is busy, it could take a long time, but a week may be an exaggeration. I can see 24-48 hours max. But, as I don't know the details surrounding the server, I can't make a specific judgement in this particular case.
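For a back-of-the-envelope feel, rebuild time is roughly the size of the drive divided by the rate at which the controller can rebuild while still serving requests. The rebuild rates below are pure assumptions for illustration, since the real rate depends on the controller, the drives and the load.

```python
def rebuild_hours(drive_gb, rebuild_mb_per_s):
    """Hours needed to rewrite one drive at a steady rebuild rate."""
    seconds = drive_gb * 1024 / rebuild_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Assumed rates: 5 MB/s on a quiet server, 1 MB/s on a busy one.
    for rate in (5, 1):
        print(f"40GB drive at {rate} MB/s: {rebuild_hours(40, rate):.1f} hours")
```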
Aloha,
On a fun side note, when I worked for HP, the tech guy at this demo building was showing me a RAID configuration with 5 drives in it.
I remember 1 drive sat vertical while 4 were regular horizontal.
The side one carried the rebuild info, but the rebuild info was spread along the other drives as well for redundancy.
(Anybody know what this was?)
I think it was a RAID 0/1 or 10, I can't remember, but pretty cool.
They were 18 gig drives. He would pull 3 out and swap places etc., trying to trip the machine up. It was cool; he had an onscreen program where you could watch the arrays being built
(kinda looked like PartitionMagic).
The box was a 4 processor with over a gig, so that helped speed, I am sure, and it was a very expensive box running HP-UX, but man, it would be sweeeeeet to have.
Anyways, later ;)