I'm actually testing this right now. I'm testing an 18 million row table with 24 columns on a standard dual-core, 1 GB RAM machine with a single SATA hard drive. It seems this table is too big for that machine. Everything I'm looking at points to the hard drive being the slowest factor (with the 1 GB of RAM also being slightly limiting).
I've tried it in both MySQL and SQL Server with very little difference between them. The query I'm using and the indexes turn out to be the most important factors in performance.
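To illustrate the point about indexes dominating performance, here's a minimal sketch using Python's built-in sqlite3 (not the poster's actual engine or schema; the table and column names are made up for the example). The query plan switches from a full table scan to an index search once the index exists:

```python
import sqlite3

# Hypothetical table standing in for the 18-million-row one; 10k rows is
# enough to show the plan change without a long-running demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?, ?)",
    ((i, i % 1000, i * 0.5) for i in range(10000)),
)

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the human-readable detail in column 3.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM big_table WHERE customer_id = 42"
print(plan(query))  # without an index: a full SCAN of big_table

conn.execute("CREATE INDEX idx_customer ON big_table (customer_id)")
print(plan(query))  # now a SEARCH using idx_customer
```

The same principle applies in MySQL and SQL Server: an unindexed predicate on a large table forces a scan of every row, which is exactly where a single SATA drive becomes the bottleneck.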
Right now I'm thinking I'll wind up with 8 GB of RAM and six 15k hard drives set up in RAID 5. I've never done RAID 10 myself, and I don't see much benefit in doing that. RAID 0 gives no fault tolerance, so that's out, and RAID 1 is a complete waste of disk space. RAID 5 seems to be the best option.
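For what it's worth, the rough capacity arithmetic for six equal drives (the 146 GB size here is just an assumption for illustration; 15k drives come in various sizes):

```python
# Back-of-the-envelope usable capacity for six equal drives.
drives, size_gb = 6, 146  # illustrative 15k drive size

# RAID 5: one drive's worth of parity; survives one drive failure.
raid5_gb = (drives - 1) * size_gb

# RAID 10: mirrored pairs, striped; survives one failure per mirror pair.
raid10_gb = (drives // 2) * size_gb

print(f"RAID 5 : {raid5_gb} GB usable")
print(f"RAID 10: {raid10_gb} GB usable")
```

So RAID 5 does yield more usable space from the same drives; the argument against it (made below in this thread) is about rebuild times and write performance, not capacity.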
You may want to consider finding a DC with iSCSI access to a SAN. Many can provide this service, and all you would need in your hosts is an iSCSI HBA to mount the SAN; you'll get gigabit speeds or better and have excellent performance. You can also use a software-based iSCSI initiator if you don't want to cover the cost of an HBA, and just use a good 1000 Mb/s NIC, ideally one with iSCSI offload support.
You will have to pay the host per GB of storage, but it may work out well in the long run: the data will be stored on a solid platform, and you can drop in more front-end hosts with similar HBAs or other uplinks and mount the same DB if you want to add load balancing down the road.
RAID10 is far preferable to RAID5 for a production database. I've seen way too many issues with production databases on RAID5 to ever recommend it to someone. RAID5 served a purpose in the 90s when disk prices were much higher, but I'm a firm believer that in this day and age pairing a relational database with RAID5 is a bad mix. Ask any DBA who has worked in high-volume production environments about RAID5 and get ready to be cursed out about how management wanted to save a few bucks and they had to deal with mangled arrays, seriously degraded performance for hours or even days during rebuilds, etc. RAID5 just isn't worth the headache IMHO. I'm actually pretty restrained on the topic; this group of DBAs passionately hates RAID5:
I don't know yet how many queries per second. I'm trying to plan for 5 to 10 queries a second. (Yeah - I know!)
That's an idea I might look into. I've never dealt with iSCSI or SANs before, but it could be a good option. What's the typical price a datacenter will charge per GB?
Pricing is typically volume- or commit-based. Expect to pay anywhere from $0.50-0.75 per GB for a 500 GB-1 TB commit, $1-2 per GB for a 200-500 GB commit, and likely $2+ per GB for a smaller commit under 100 GB. It's worth it since the performance is great and you can always expand out without the downtime of installing drives or moving to a new RAID box. I'm not sure what OS you plan to use, but some will even boot from a SAN / iSCSI if properly configured and the HBA supports it; Windows and RHEL have good support for this.
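To put those tiers in concrete terms, here's a rough monthly-cost sketch using midpoints of the quoted ranges (the exact breakpoints and rates are my assumptions; actual pricing varies by provider):

```python
# Rough monthly SAN storage cost at the quoted per-GB tiers.
# Rates are midpoints of the ranges given above; purely illustrative.
def monthly_cost(gb):
    if gb >= 500:
        rate = 0.625   # midpoint of $0.50-0.75/GB for large commits
    elif gb >= 200:
        rate = 1.50    # midpoint of $1-2/GB for mid-size commits
    else:
        rate = 2.00    # $2+/GB for small (<100-200 GB) commits
    return gb * rate

for gb in (100, 300, 750):
    print(f"{gb:>4} GB commit -> ${monthly_cost(gb):,.2f}/mo")
```

Note how the per-GB rate drops sharply with commit size, so a larger commit can cost little more in absolute terms than a small one.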