
View Full Version : RAID 1+0 vs. RAID 1
mind21_98 08-09-2002, 10:04 PM I'm considering RAID for a future server. From what I read, RAID 5 burns up all its time in parity generation. I also read that in RAID 1, only one drive is touched unless it fails. RAID 1+0 is a combination of RAID 0 and RAID 1.
Even though RAID 1+0 only has 50% storage efficiency, wouldn't it be faster than just plain RAID 1 because of the higher throughput gained by reading off of multiple disks at once?
I really don't know how fast it will be compared to RAID 0,
but it is definitely efficient. Keep in mind, though, that this setup is expensive.
regards/-
Lagniappe-labgeek 08-10-2002, 08:30 AM I suspect the reason you haven't seen many answers is that they're highly dependent on many factors. Let me explain the 4 RAID setups you talk about and maybe you can come to a conclusion for what you want to do.
RAID 0 - data is split between disks (striping) in x KB chunks. The size is usually settable during creation. Yes, this can improve performance if the request is greater than 1x the chunk size. Writes are also spread across the disks so they can be improved as well. No fault tolerance - 1 drive fails and you're hosed. All data on BOTH disks is unusable. Somewhat more expensive usually, as you need 2x the drives at 1/2 the size.
RAID 1 - all data is written to both disks (mirrored). The biggest benefit here is fault tolerance. If a disk goes out, the other's there with all the data. Read requests can be faster (if the controller/driver is done right) - 2 disks get the request and the first one to have it available wins. Writes are SLOWER, because you have to do 2x the number of writes. Also expensive - 2x the cost. Mirroring is really handy if you need to back up a database while it's running. The right controller/software setup will let you "break" the mirror. This creates a snapshot of how the disks looked at that moment in time that can be backed up. Since databases are updating the same files constantly, backups tend not to like them. After the backup, you re-merge the mirrors, replacing the snapshot with the current data. An alternative is a product like Veritas's Advanced filesystem (also called OnlineJFS for HP-UX). This lets you do things you can't normally do while the filesystem is mounted, like extend/decrease it, or create snapshots where the actual fs isn't written to but the new writes are held on another partition waiting - very useful for backups! Don't know if Veritas is available for Linux though - never tried to find it. It can get rather expensive - the last one we bought was like $10K, but it's saved more than that in downtime.
RAID 5 - Data is written to X (X >= 3) number of disks (stripe plus parity) in x bits instead of x KB of data. You get ((X - 1) * size) in capacity - i.e. you lose 1 disk for the parity. If you lose a drive the controller rebuilds the lost information by using the parity information. Lose 2 drives at the same time - all data is gone. Reads and writes are spread across multiple drives, but the information is slightly larger for the parity bits. Cost is higher than RAID 0 but less than RAID 1 for the drives - the controller is another thing and can cost more than the drives. Look for a controller with a RISC processor on it like an Intel i960. This off-loads the parity burden to the controller. Also look to see if you can get one with onboard RAM cache to further increase speed. If you're worried about data loss, most good RAID-5 controllers allow a hot spare. These drives hang around until needed. When a drive fails the controller puts the hot spare into action and rebuilds while continuing to run.
RAID 0+1 - (striped and mirrored) Requires 4 drives minimum - you get 2x the drive size in storage - the most expensive, disk-wise. Reads are faster; writes are helped by striping, hurt by mirroring. Somewhat useful when you need a large amount of space (greater than what 1 disk will give you) and need redundancy as well. But at 4x the drives needed, why not go RAID-5 and get (3x drive size) space - a 50% bonus.
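The capacity arithmetic in the four descriptions above can be checked with a small Python sketch. This is only an illustration: the function name `usable_capacity`, the level labels and the 100 GB drive size are made up for the example, and the formulas are just the ones stated in the post (all drives striped for RAID 0, one drive's worth for a two-disk RAID 1, X - 1 drives for RAID 5, half the drives for RAID 0+1).

```python
def usable_capacity(level, drives, drive_size_gb):
    """Usable capacity in GB, using the rules of thumb from the post above.

    `level` is one of "0", "1", "5", "0+1" (labels chosen for this sketch).
    """
    if level == "0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * drive_size_gb          # everything is usable, no redundancy
    if level == "1":
        if drives != 2:
            raise ValueError("this sketch only models a two-disk mirror")
        return drive_size_gb                   # second disk is a full copy
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_size_gb    # one drive's worth goes to parity
    if level == "0+1":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 0+1 needs an even number of drives, 4 minimum")
        return (drives // 2) * drive_size_gb   # half the drives hold the mirror copy
    raise ValueError("unknown level: %r" % (level,))


# Four hypothetical 100 GB drives: RAID 5 gives 3x drive size, RAID 0+1 gives 2x.
raid5 = usable_capacity("5", 4, 100)
raid01 = usable_capacity("0+1", 4, 100)
print("RAID 5:", raid5, "GB  RAID 0+1:", raid01, "GB  ratio:", raid5 / raid01)
```

With four drives the ratio comes out at 1.5, which is the "50% bonus" mentioned for RAID-5.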
Hope I helped you with your question. If you still have questions just ask.
sitekeeper 08-10-2002, 10:07 AM I have IDE RAID 0+1 on my home PC and it is as fast as RAID 0 was in testing. I would recommend RAID 0+1 to anyone. I used to use RAID 0 and used Ghost to back up to another drive.
bitserve 08-10-2002, 12:29 PM I'm not sure about that idea that RAID 5 burns up all its time in parity generation. This would be when doing writes, and RAID 5 is going to get faster writes than RAID 0+1.
You would get faster reads on RAID 5, too.
Because 0+1 stripes your mirror over a pair of drives, your mirror is gone if you lose just one drive, so it's no more reliable than RAID 5.
I've found that the only advantage of mirroring over parity (0+1 vs 5) is that with mirroring you can be up again without waiting on a rebuild. When using RAID 1, the added benefit is you need fewer drives.
We've used RAID controllers that supported a hot spare for RAID 1, too. You definitely want hardware RAID, no matter what method you select.
Does enabling write caching on your controller scare anyone else besides me?
I'd have to agree with labgeek for RAID 5 over 0+1.
We usually settle for just RAID 1.
PixyMisa 08-10-2002, 01:10 PM Raid 5 is slower than Raid 0+1 if you're doing random write - updating a database for example. With Raid 5, the controller must read all the other disks, calculate parity, and write the data block and the parity block. In other words, all disks in the array get hit with every write.
With Raid 0+1, only the pair of disks containing the data block in question need to be updated. Other disks can be performing other reads or writes at the same time.
This wouldn't be particularly noticeable on the average web server, but on large database systems Raid 0+1 dramatically outperforms Raid 5.
Lagniappe-labgeek 08-10-2002, 01:16 PM > Because 0+1 stripes your mirror over a pair of drives, your mirror is gone if you lose just one drive, so it's no more reliable than RAID 5.
Only 1 of the mirrored sets is lost with a single drive failure. Two drives can fail if they're on the same set. Two drives fail - one on each set - and you're hosed. RAID 5 can lose 1 drive, but performance may suffer - or may not with intelligent controllers.
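To make the failure cases concrete, here is a rough Python sketch that enumerates every one- and two-drive failure on a four-drive array. The drive labels A-D and the function names are hypothetical; the model assumes RAID 0+1 built from two stripe sets (A+B and C+D) mirrored against each other, and RAID 5 surviving any single failure, as described above.

```python
from itertools import combinations

DRIVES = ["A", "B", "C", "D"]              # hypothetical drive labels
STRIPE_SETS = [{"A", "B"}, {"C", "D"}]     # assumed RAID 0+1 layout: two stripe sets, mirrored


def survives_raid01(failed):
    """Data survives if at least one whole stripe set has no failed drive."""
    return any(not (stripe_set & failed) for stripe_set in STRIPE_SETS)


def survives_raid5(failed):
    """RAID 5 tolerates exactly one failed drive."""
    return len(failed) <= 1


def count_survivable(survives, n_failed):
    """Return (survivable cases, total cases) for n_failed simultaneous failures."""
    cases = [set(combo) for combo in combinations(DRIVES, n_failed)]
    return sum(1 for case in cases if survives(case)), len(cases)


for n in (1, 2):
    print(n, "failed | RAID 0+1:", count_survivable(survives_raid01, n),
          "| RAID 5:", count_survivable(survives_raid5, n))
```

Under these assumptions both layouts survive all 4 single-drive failures; of the 6 possible two-drive failures, RAID 0+1 survives only the 2 where both dead drives sit in the same stripe set, and RAID 5 survives none.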
> Does enabling write caching on your controller scare anyone else besides me?
If it does, you're buying the wrong controllers. The cache should be battery backed. I wouldn't touch any controller where it isn't. BTW, writes are cached by most OSes anyway... Use a decent journalled filesystem and you'd be OK, but I'd still invest in a better quality controller with battery backup for the cache.
Lagniappe-labgeek 08-10-2002, 01:29 PM > Raid 5 is slower than Raid 0+1 if you're doing random write - updating a database for example.
Then you're doing it wrong - probably cheap hardware. RAID-5 controllers tend to be intelligent and have processors and memory of their own. The OS will see much improved times over 0+1.
> With Raid 0+1, only the pair of disks containing the data block in question need to be updated.
Provided the block to be updated is less than the stripe size. If not both disks in both sets will have to be updated.
> This wouldn't be particularly noticeable on the average web server, but on large database systems Raid 0+1 dramatically outperforms Raid 5.
Not been my experience in 15+ years dealing with LARGE database systems on LARGE servers. My current production system fills 6 racks by itself. RAID 1, 0+1 and 5 are all in use, as well as optical and tape subsystems. Next year's additional disk system is > $100,000 by itself. We look at performance so much that we analyze which PCI bus it should go on - the machine has 4 PCI buses. Though I do agree with the "not noticeable on the average web server" part.
bitserve 08-10-2002, 02:11 PM Even with battery backup on write caching, it scares me. Your data is sitting in the cache, being powered by a 9V battery or equivalent. And then you have to pray that when you turn the machine back on, the data will make it to the drive and finish the write. I've never enabled it. I'm happy to use the controller's cache for reads, though.
For the 0+1 single drive failure, I did mean that your mirror is gone, and not your data. One of the pairs is dead, like you pointed out. You could sustain losing two drives, as long as they were on the same pair, but a single drive will take out the pair.
PixyMisa 08-10-2002, 10:39 PM Originally posted by labgeek
> > Raid 5 is slower than Raid 0+1 if you're doing random write - updating a database for example.
> Then you're doing it wrong - probably cheap hardware. RAID-5 controllers tend to be intelligent and have processors and memory of their own. The OS will see much improved times over 0+1.
Cheap hardware? Like dual top-of-the-line EMC boxes, for example?
And it's not a question of doing it wrong: Raid 5 requires more disk I/O. Yes, you can cache stuff, as long as you have a properly redundant and battery-backed cache, but you can cache Raid 1+0 anyway, and get some real performance. Comparing caching Raid 5 controllers to non-caching Raid 1+0 kind of misses the point.
> > With Raid 0+1, only the pair of disks containing the data block in question need to be updated.
> Provided the block to be updated is less than the stripe size. If not both disks in both sets will have to be updated.
Stripe sizes are typically 32KB and up. Most databases use block sizes smaller than that. You should, of course, test the performance of different stripe sizes, database block sizes and OS block sizes when installing a large disk system.
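The stripe-size point can be sketched in Python: given a write's byte offset and length, work out which members of a stripe set it lands on. This is a simplified model, not how any particular controller does it; `disks_touched` and its parameters are names invented for the example, and it assumes chunks are dealt out round-robin across the stripe members.

```python
KB = 1024


def disks_touched(offset, length, chunk_size, n_disks):
    """Return the set of stripe-member indexes that a write touches.

    Assumes simple round-robin striping: chunk 0 on disk 0, chunk 1 on disk 1, ...
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return {chunk % n_disks for chunk in range(first_chunk, last_chunk + 1)}


# 32 KB stripe chunks on a two-disk stripe set, 16 KB database blocks.
print(disks_touched(0 * KB, 16 * KB, 32 * KB, 2))    # aligned block: one disk
print(disks_touched(24 * KB, 16 * KB, 32 * KB, 2))   # unaligned write straddles two chunks
print(disks_touched(0 * KB, 64 * KB, 32 * KB, 2))    # write larger than a chunk: both disks
```

In this model a 16 KB block that is aligned to a 16 KB boundary never crosses a 32 KB chunk, so it only ever hits one stripe member (plus its mirror partner); a write that is larger than the chunk, or unaligned, can spill onto the next disk.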
> > This wouldn't be particularly noticeable on the average web server, but on large database systems Raid 0+1 dramatically outperforms Raid 5.
> Not been my experience in 15+ years dealing with LARGE database systems on LARGE servers. My current production system fills 6 racks by itself. RAID 1, 0+1 and 5 are all in use, as well as optical and tape subsystems. Next year's additional disk system is > $100,000 by itself. We look at performance so much that we analyze which PCI bus it should go on - the machine has 4 PCI buses. Though I do agree with the "not noticeable on the average web server" part.
I also have 15+ years experience with large database systems on large servers. 4 PCI buses is a small box :stickout Compaq Alpha clusters? Check. (Four two-way clusters, machines up to 24 CPUs and 64GB of memory.) Sun E10K? Check. EMC? Check. Computer rooms the size of football fields? Check. Terabytes of data? Check. Raid 5 bad for databases? Check.
But on a web server, the main point is that you don't lose your data. Raid 1, Raid 5, Raid 0+1: pick one, and sooner or later you won't have a really bad day. :D
bitserve 08-10-2002, 11:51 PM I also don't see how RAID 0+1 would have faster writes than RAID 5, but I haven't read much on it lately. Maybe you can point us to some sources?
My understanding of how it works:
RAID 5 (three drives):
One single file is written across 3+ drives, the parity is written to the distributed parity blocks. As far as I know, the parity is only checked when reading. Don't have to "read all the other disks" as far as I know.
RAID 0+1 (four drives):
One single file is written across only 2 drives, and on two different pairs, and those four drives must be synced to do the write in parallel. Slightly slower than RAID 0 due to the syncing. It seems like an "update" should include all of the drives, so there are no other ones to multitask.
If you used RAID 0+1 with 6 drives in two sets, it should be faster than RAID 5 with three drives.
Of course, I only have 4 years of non-user level experience with RAIDs. :)
PixyMisa 08-11-2002, 05:01 AM It depends on whether you're doing large sequential writes or small random writes. For a given number of drives (say 4) in a single array, Raid 5 will win on large sequential writes and Raid 0+1 will win on small random writes.
Say we have disks A B C D.
In Raid 5, they're all striped together, with parity spread evenly across the drives.
In Raid 0+1 you have two stripes, AB and CD, mirrored together. (You can also do Raid 1+0, with two mirrors AC and BD striped together.)
If you have a 32k stripe size and a 16k database block size, a write to a block on A will work like this:
Raid 5: Read B and C. Calculate parity with new block. Write block to A. Write parity to D.
Raid 0+1 or 1+0: Write data to both A and C.
So a single small write to Raid 5 requires I/Os on all drives (neglecting caching), while on Raid 0+1 it only required writes to one pair in the raid set.
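The walkthrough above can be turned into a small Python sketch that counts disk I/Os for one small write. It is only a toy model of the steps as listed (read B and C, compute parity, write A and D versus write A and C), ignoring caching; the function names and the tiny 4-byte "blocks" are made up for the example. Parity here is plain XOR, which is what lets a lost block be rebuilt from the survivors.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def raid5_small_write(stripe, new_a):
    """Update block A as in the walkthrough; return the number of disk I/Os."""
    ios = 0
    b = stripe["B"]; ios += 1              # read B
    c = stripe["C"]; ios += 1              # read C
    parity = xor_blocks(new_a, b, c)       # calculate parity with the new block
    stripe["A"] = new_a; ios += 1          # write block to A
    stripe["D"] = parity; ios += 1         # write parity to D
    return ios


def mirror_small_write(disks, new_a):
    """Raid 0+1 / 1+0: write the same block to A and its mirror partner C."""
    disks["A"] = new_a
    disks["C"] = new_a
    return 2


stripe = {"A": b"\x01" * 4, "B": b"\x02" * 4, "C": b"\x04" * 4}
stripe["D"] = xor_blocks(stripe["A"], stripe["B"], stripe["C"])

print("Raid 5 I/Os:", raid5_small_write(stripe, b"\x08" * 4))
print("Raid 0+1 I/Os:", mirror_small_write({"A": b"", "C": b""}, b"\x08" * 4))

# If A is lost, XOR of the surviving blocks and parity gives it back.
print("rebuilt A matches:", xor_blocks(stripe["B"], stripe["C"], stripe["D"]) == stripe["A"])
```

In this model the Raid 5 write costs 4 I/Os touching all four drives, against 2 for the mirrored pair. Controllers may instead read the old data block and old parity rather than the other data disks; with four drives that is still two reads plus two writes.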
If you have a read-caching Raid 5 controller, the best possible result is that B and C are always in the cache, and Raid 5 performs as well as Raid 0+1 (but never better). More typically, Raid 5 requires more disk operations to perform a given amount of database operations, and so gives lower performance.
Write caches can apply to any type of Raid set (or indeed any disk arrangement) and will improve performance. If your write cache does write combining (which gives most of the possible performance gain) and you lose the data from the cache, your database goes bye-bye. This is why most serious database folk twitch when you talk to them about caching controllers. EMC (for example) have put a lot of work into making their cache safe, but many PC controllers are bad news.
asyui8 08-15-2002, 07:12 PM Many people here complain about RAID performance deterioration. I think it is to be expected.
I read the storagereview.com reviews of various RAID cards' performance. My interpretation is that, generally, IDE RAID is good for redundancy, but not for performance.
those reviews are here:
http://www.storagereview.com/welcome.pl/http://www.storagereview.com/articles/index_2001.html
My reading of those articles is:
In the cheap IDE RAID card case, it is more likely that the RAID will reduce hard drive performance than increase it.
In the expensive IDE RAID case, the highest increase in performance is likely only about 30%. In most cases, it still does not increase performance.
The exception is the transfer rate, where RAID gives a substantial performance boost. But for my web server, or for most people, I do not think it is the most important benchmark.
Can raid people concur with me?
thanks.
mind21_98 08-17-2002, 12:01 AM Originally posted by labgeek
> If it does, you're buying the wrong controllers. The cache should be battery backed. I wouldn't touch any controller where it isn't. BTW, writes are cached by most OSes anyway... Use a decent journalled filesystem and you'd be OK, but I'd still invest in a better quality controller with battery backup for the cache.
If you're even worrying about this, you're at the wrong NOC. :) Unless the power supply fails or something.
bitserve 08-17-2002, 01:10 PM Originally posted by mind21_98
> If you're even worrying about this, you're at the wrong NOC. :) Unless the power supply fails or something.
Most places do not have their servers colocated at a full-blown data center. They may have a large air-conditioned room with a raised floor and a UPS on the more important servers, but not an extra room full of batteries, or a diesel generator out back. Most places where I've had to implement an enterprise server have incompetent people who like to unplug servers or trip over power cords, or cleaning people to do it for them.
Also, loss of power is not the only thing that can bring down a server unexpectedly.