SAN for OnAPP

Eleven2 Hosting
09-08-2010, 04:28 PM
For those that have set up OnAPP. What SAN solution are you guys using? We are considering this and wanted to see what might be best to offer.
Thanks,

chennaihomie
09-08-2010, 04:41 PM
Are you looking to lease servers from providers with SAN or gonna get your own hardware to colo?

Eleven2 Hosting
09-08-2010, 04:41 PM
We are only interested in purchasing our hardware.

instantDS
09-08-2010, 04:51 PM
Maybe you can contact Dell for the Equallogic series.. I've been playing with this hardware and it's very very nice and well priced :)

chennaihomie
09-08-2010, 04:53 PM
We are only interested in purchasing our hardware.
I haven't implemented on own hardware yet. So no comments :)

enotchnet
09-08-2010, 04:55 PM
We are looking at getting set up once we finish some internal upgrades. If memory serves me right they (OnApp) recommend or at least list Open-E as an option. We are looking at 2 mirrored x 15-disk arrays with the Open-E hardware card. However, Equallogic seems like an ideal solution as well.

eming
09-08-2010, 05:51 PM
Yeah, the SAN is absolutely key for a successful implementation of OnApp. OnApp is 100% agnostic to the choice of SAN, it works with more or less every SAN out there, but we do have a lot of clients who run Open-E on their OnApp installations. Open-E does a good job dealing with SAN failovers, it's fairly priced, and easy to install/manage. But we also have clients using NetApps, Equallogics and even EMCs.
You'll need to take a good look at your business model before deciding what SAN is right for you, what target segment you are going after and what level of redundancy and performance your clients are willing to pay for. We've done more than 30 OnApp installs in the last 6 weeks, and we would be happy to take that discussion with you to ensure that your choice of SAN is fit for purpose.
Btw, we recently took a good look at what the guys at http://www.acunu.com/ are doing - had a chance to meet up with them last week, good ppl. and amazing technology they've built... keep an eye out for them.
:) D

Eleven2 Hosting
09-08-2010, 05:53 PM
Ditlev,
What SAN does VPS.net use?

eming
09-08-2010, 05:57 PM
Ditlev, What SAN does VPS.net use?
VPS.NET has more than 100 SANs live now, and it did cause them a lot of problems initially. They seem to have nailed it now though, and the solution is performing excellently now. I can't give you details though, sorry - but I'd be happy to discuss your setup with you to make sure you'll get exactly what's right for your client segment.
:) D

ewitte
09-08-2010, 06:07 PM
I got a free (with no drives or RAM) HP Proliant ML320s. The one with 12 3.5" bays. Installed openfiler on it and it is running great. I've only done testing from my two Windows machines since I don't have the OnApp hardware yet. Cost $226 to get trays from Hong Kong to work with non-HP drives.

lostmind
09-09-2010, 12:21 AM
Interesting. We're going through the exact same question right now. We'll likely just use our Dell EQ. But I'm not sure that's the best cost/benefit ratio.

MikeTrike
09-09-2010, 12:44 AM
We're going to utilize a previous generation EMC array that we have to get us fired off. Expand as much as we can with it; not likely to go too far though. Build a base, bring in some revenue and capital and pick up something more "qualified" for a larger scale deployment. :)
If the EMC holds up, we'll probably pick up a newer EMC array as well. However we've been looking into things like the NetApp FAS2020 as a decent unit.

RyanD
09-09-2010, 01:07 AM
We're going to utilize a previous generation EMC array that we have to get us fired off. Expand as much as we can with it; not likely to go too far though. Build a base, bring in some revenue and capital and pick up something more "qualified" for a larger scale deployment.
:) If the EMC holds up, we'll probably pick up a newer EMC array as well. However we've been looking into things like the NetApp FAS2020 as a decent unit.
The EMC CX4-240s are quite affordable and have all the features of a 'modern' san your entry level vps market needs, thin provisioning being a big feature. It's also nice that you have the dual FC/iSCSI.

WebGuyz
09-09-2010, 07:37 AM
Since onAPP can have multiple data stores, does it load balance which datastore to put the VM on if you have multiple datastores?
Also was wondering what type of raid most people were using for their SANs when using them in a VPS environment. For example what type of raid do the Equallogic boxes use, Raid 6 or Raid 10?
Thanks!

Caroline_9429
09-09-2010, 07:54 AM
Since onAPP can have multiple data stores, does it load balance which datastore to put the VM on if you have multiple datastores? Also was wondering what type of raid most people were using for their SANs when using them in a VPS environment. For example what type of raid do the Equallogic boxes use, Raid 6 or Raid 10? Thanks!
Hello, most clients go for RAID5. I hope that helps. Anyone who needs any further questions answered, feel free to get in touch: caroline[@]onapp.com

Cristi4n
09-09-2010, 08:20 AM
OnApp seems pretty fine, have to give some credit there.
Load balancing something may be a little harder, you need to know how to balance something, based on number of VMs or space used or based on I/O together with the disk space used. I think you will have to manually take care of those, it's not such a good idea to implement such a feature in a software product unless it's a simple balance scheme based on disk used.
An entire Raid5 may fail quicker than a Raid10, but again this is too much to discuss here. As I can see VPS.NET chose Raid5 based on price, most likely others did that too.
Keep in mind that VPS.NET also has a backup, just in case. Storage is not something you will want to play around with without some kind of redundancy, unless you don't care about the data on it.
Not much about ZFS (Nexenta), why Open-E instead of it?

dazmanultra
09-09-2010, 08:37 AM
I would not go for RAID5 or 6 with any kind of system that is to run VMs.

Caroline_9429
09-09-2010, 11:18 AM
I would not go for RAID5 or 6 with any kind of system that is to run VMs.
Hi Darren, out of interest would you use RAID10? Or...?

eming
09-09-2010, 11:20 AM
As I can see VPS.NET chose Raid5 based on price
VPS.NET is not based on Raid5 SANs - it's all Raid10 :)
:) D

Cristi4n
09-09-2010, 11:40 AM
ah, sorry about that eming, I must have misunderstood the blog post on VPS.NET. Anyway, I do not think VPS.NET was clumsy enough to not take extra measures of protection regardless if it's a RAID5 or 10.
@Caroline, your e-mail doesn't work for me, most likely because of the same reason why I needed to send VPS.NET a scanned ID. (554 Denied)

nickn
09-09-2010, 11:42 AM
VPS.NET is not based on Raid5 SANs - it's all Raid10 :) :) D
Funny timing on this post since we were talking about how we'd never use RAID5 and how scary it is in the office yesterday.. VPS.NET is definitely not raid5! :)

RyanD
09-09-2010, 11:49 AM
Funny timing on this post since we were talking about how we'd never use RAID5 and how scary it is in the office yesterday.. VPS.NET is definitely not raid5! :)
I can't imagine how bad your performance would be if you were on R5 :)

lostmind
09-09-2010, 01:01 PM
Raid 5 can outperform raid 10 with enough spindles (in my testing, around 6-8 spindles, depending on drive and raid card types). In both raw throughput and IO.
The problem with raid 5 imho is drive failure and especially a second drive failure during a rebuild. If you get a big enough raid 5 array...
it's almost mathematically impossible to NOT have a drive die during the rebuild (or so I read somewhere) and I've experienced it first hand with 24 x 500gb drives in a raid 5 array. After that experience we switched to raid 10 for redundancy and suffered a rather impressive hit in performance. Luckily it was only a backup and could handle some drop in performance without affecting usability too much.
In an enterprise san situation... I'm not sure if I'd run raid 10 or raid 5. Raid 5 and 2 hot spares on our Dell EQ is in nearly all testing outperforming raid 10.

WebGuyz
09-09-2010, 01:14 PM
The biggest problem for me with Raid 5 is the degraded performance when a single drive goes out and your controller is rebuilding on your hot spare. If you have a raid 5 (or raid 6) array of ten 2tb drives it can take several days to finish rebuilding. Raid 10 has much shorter rebuild times and that's why we use it instead of Raid 5 or 6, as well as for the performance.

Cristi4n
09-10-2010, 06:05 AM
I wonder if onapp was used with nexenta until now and if it worked ok? Most I've seen are using Open-E or Openfiler, but I would like ZFS for various reasons.

eming
09-10-2010, 09:42 AM
I wonder if onapp was used with nexenta until now and if it worked ok? Most I've seen are using Open-E or Openfiler, but I would like ZFS for various reasons.
yup - we've got two clients working with nexenta with great success. One of them on a fairly large install.
:) D

Cristi4n
09-10-2010, 10:11 AM
that's great news, thanks! I am still looking at nexenta and most likely will go with it after I make sure it's stable enough for production.

CloudWeb
09-10-2010, 10:17 AM
You can't pool local storage with OnApp? It requires external SANs?

sailor
09-10-2010, 10:39 AM
I can't imagine how bad your performance would be if you were on R5 :)
I will second ryan's opinion on that. I have tried it - it blows. RAID 10 is the only way to go.
Ask your potential san provider as well what they are running - if they won't tell you what raid level, assume it's not 10 and run far away. RAID 10 costs more but it's worth it.

lostmind
09-10-2010, 10:41 AM
You can't pool local storage with OnApp? It requires external SANs?
Yup external storage.

sailor
09-10-2010, 10:42 AM
yup - we've got two clients working with nexenta with great success. One of them on a fairly large install. :) D
I have heard of some issue second hand on nexenta - do you know what the management capability is of this project - is there a single pane of glass management for it?

eming
09-10-2010, 11:10 AM
You can't pool local storage with OnApp? It requires external SANs?
OnApp can handle both local and central storage. The discussion on what's best (and that seems to be one of your favorite subjects ;)) is almost religious - the storage features are pretty cool at OnApp, so you can add several storage units to the same VM, i.e. attach both local storage, Raid10, SSD/FusionIO etc etc to the same VM. OnApp also allows you to charge individually for each tier of storage.
:) D

eming
09-10-2010, 11:11 AM
I have heard of some issue second hand on nexenta - do you know what the management capability is of this project - is there a single pane of glass management for it?
I'll put you in touch with one of the clients running it in Vegas next week, then you guys can discuss.
:) D

CloudWeb
09-10-2010, 11:12 AM
I'll put you in touch with one of the clients running it in Vegas next week, then you guys can discuss. :) D
Who's running what in vegas next week? I'll be there..

CloudWeb
09-10-2010, 11:14 AM
OnApp can handle both local and central storage. The discussion on what's best (and that seems to be one of your favorite subjects ;)) is almost religious - the storage features are pretty cool at OnApp, so you can add several storage units to the same VM, i.e. attach both local storage, Raid10, SSD/FusionIO etc etc to the same VM.
OnApp also allows you to charge individually for each tier of storage. :) D
Does it pool local storage into a virtual IP SAN? Ie: is that data redundant and ha across multiple servers? And accessible from other servers in the cloud? Or is it just simply using a single, local storage array (whether it be raid, single sata drives, or whatever)?

eming
09-10-2010, 11:14 AM
Who's running what in vegas next week? I'll be there..
an OnApp client setting up nexenta who will be in Vegas for the T1summit.

CloudWeb
09-10-2010, 11:39 AM
Wow.. how could I miss this. Who else is attending from WHT?

MikeTrike
09-10-2010, 11:40 AM
You can't pool local storage with OnApp? It requires external SANs?
Just sign up for the web demo and they will show you all of this stuff. :)

sailor
09-10-2010, 12:52 PM
Wow.. how could I miss this. Who else is attending from WHT?
I am. anyone want to go see the cult thursday night?

sailor
09-10-2010, 12:53 PM
I'll put you in touch with one of the clients running it in Vegas next week, then you guys can discuss. :) D
ok cool - looking forward to some good cabernet and a big fat filet at smith and wolensky :D

oplink
09-10-2010, 03:18 PM
How about something like this for a SAN?
http://blog.backblaze.com/2009/10/07/backblaze-storage-pod-vendors-tips-and-tricks/
Would this work with ONAPP?

CloudWeb
09-10-2010, 03:28 PM
ok cool - looking forward to some good cabernet and a big fat filet at smith and wolensky :D
Morton's steak and seafood special is still going on.. and with the godiva molten hot chocolate cake it's a shoo-in. I'll go there :D

MikeTrike
09-10-2010, 04:10 PM
How about something like this for a SAN? http://blog.backblaze.com/2009/10/07/backblaze-storage-pod-vendors-tips-and-tricks/ Would this work with ONAPP?
If you were using it for "low performing" storage, i.e. for backups, that would be a good idea. But that in general is probably not going to be a high performance setup at all. Heck, one of the controllers is a regular PCI card.
ewitte
09-10-2010, 05:18 PM
Doing Onapp with the following SAN: HP Proliant DL320s, 10*450GB SAS 15k with 2 hot spares (iSCSI). Snapshot to R5 array of 1.5TB drives on another machine. Using openfiler and will try to get MPIO working off 4 Gbit connections.

jayglate
09-10-2010, 07:58 PM
Doing Onapp with the following SAN: HP Proliant DL320s, 10*450GB SAS 15k with 2 hot spares (iSCSI). Snapshot to R5 array of 1.5TB drives on another machine. Using openfiler and will try to get MPIO working off 4 Gbit connections.
Super $$$ on that setup, what is that running?

ewitte
09-10-2010, 09:02 PM
Not really, was pretty resourceful, probably about $1.5k (not including the other machine) with the server being free and extremely lucky finding drives. Openfiler.

jayglate
09-10-2010, 09:06 PM
server and drives 1.5k? are they used drives? If not where did you get drives that cheap..

ewitte
09-10-2010, 09:25 PM
Yes, refurb, which is the reason I'm doing 2 hot spares and testing the $*#& out of them. Seagate will always replace them under warranty if any are bad.

vivithemage
09-12-2010, 01:29 AM
We use a Raid 6 + hot spare for our SAN/NAS box(s). Performed better in our test lab than Raid 10 with 12 2TB 7200rpm disks, so we went with it. We're also using openfiler, which works pretty darn well.

WebGuyz
09-12-2010, 02:00 AM
We use a Raid 6 + hot spare for our SAN/NAS box(s). Performed better in our test lab than Raid 10 with 12 2TB 7200rpm disks, so we went with it. We're also using openfiler, which works pretty darn well.
How many hours/days does it take to rebuild if a disk goes out? What's the performance like when it's rebuilding? Have you guys tested it?

dazmanultra
09-12-2010, 09:09 AM
Disk i/o is one of the most important factors when providing virtualization. We use 8 x 300GB 2.5" 10K SAS in RAID10 for an ordinary VM node.
For a cloud platform you're going to want something better than that - typically, you will want a large number of disks (spindles) in RAID10 in a system that features SSD acceleration.

Cristi4n
09-12-2010, 09:21 AM
regardless of how many disks you have, eventually you will still hit the I/O limit. Adding many disks can be a temporary solution but it's too costly and not the most effective one. As you've probably seen, almost nobody tells their setup and there is a reason for that. You need to either tell a client or clients that create too much I/O that it's time to pay a little more (RackspaceCloud calculates a compute cycle based on I/O + CPU, so figures) or just move them to a different SAN.

ewitte
09-12-2010, 09:23 AM
We use a Raid 6 + hot spare for our SAN/NAS box(s). Performed better in our test lab than Raid 10 with 12 2TB 7200rpm disks, so we went with it. We're also using openfiler, which works pretty darn well.
Thinking of using raid6 if I do SSD just because it costs so much otherwise. That being said I would get a really nice controller that had less penalty. Something like ARC-1880ix. Would cost like $5k for 2TB worth of space with a few hot spares but IMO would be worth it. IOPS would be something like 50-100 times that of 10 15k sas drives.
Also plan on being very upfront on the exact hardware I have in place. I'm a tech junkie so it's going to be high end just because I like making things go fast; even if I'm just covering my costs I get cool stuff to play with. For most hosts it's all about profit.

Dougy
09-12-2010, 09:58 AM
You can't pool local storage with OnApp? It requires external SANs?
Ain't that the point of a cloud? :eek:

lostmind
09-12-2010, 01:01 PM
Ain't that the point of a cloud? :eek:
Cloudweb uses applogic to power his cloud, which offers high availability and apparently better performance than a traditional San via its "IP based San" as the storage is local to each node.
OnApp however was built specifically for hosting and feels like a much better fit for the industry (and the success of VPS.net is a testament to this). Applogic (IMHO) was built more for enterprise... but I am only on week 1 of my Applogic demo. So we'll see how it goes.

ewitte
09-12-2010, 01:22 PM
iSCSI can be fast if it's done correctly. It's pretty easy to do with 10GbE but very costly. The other method is using more than 1 GbE port but it can be a lot trickier to do properly. With SSD you should be able to pull off up to 100k iops off 4 MPIO links. I would consider 10GbE before going past 4 though.

Uncorrupted-Michael
09-12-2010, 01:42 PM
We recently deployed some CORAID storage for our new onapp cloud. - Very cost effective, out of the box multipathing, and an already optimized network stack.
win, win, win

sailor
09-12-2010, 03:21 PM
Doing Onapp with the following SAN: HP Proliant DL320s, 10*450GB SAS 15k with 2 hot spares (iSCSI). Snapshot to R5 array of 1.5TB drives on another machine. Using openfiler and will try to get MPIO working off 4 Gbit connections.
what happens if your box fails? like power outage to it, motherboard failure, cpu failure, ram failure - etc etc. Isn't your whole san then down because openfiler won't do network raid?
How do you manage the san if you grow it? isn't trying to manage a bunch of openfiler boxes a real pain since there aren't any single pane of glass tools for reporting and intelligent management?
I used to use openfiler for some basic back up boxes but it is really child's play and does not scale for real applications and cloud usage.

Cristi4n
09-12-2010, 03:31 PM
what happens if your box fails? like power outage to it, motherboard failure, cpu failure, ram failure - etc etc. Isn't your whole san then down because openfiler won't do network raid?
Well, having local storage is somehow the same. I am not sure how Openfiler works exactly but you can always do network raid manually since we talk about Linux.
True that most people don't (VPS.NET only recently added RAID1 I believe).
How do you manage the san if you grow it? isn't trying to manage a bunch of openfiler boxes a real pain since there aren't any single pane of glass tools for reporting and intelligent management?
Openfiler is free, you can build your own monitor or pay for something else. The growing problem appears on almost every solution, monitoring depends.
I used to use openfiler for some basic back up boxes but it is really child's play and does not scale for real applications and cloud usage.
What do you use now?

WebGuyz
09-12-2010, 03:42 PM
You really need to have a backup or auto-failover san for a cloud setup and periodic offsite backups of all vm's as well. Without a second backup san you can't do upgrades or fixes without major problems. Open-E has a nice auto-failover that works well. I think onAPP has a deal with Open-E for a 16tb license for about $50.00/mth, but not sure about where I read that.
onAPP is cool because you can have several SANs and just attach a HV (server) to whatever datastore you want. Just bring up another box and tell the onapp Base server about it and it's part of your datastore, like xenserver.
Opensource may be inexpensive, but what the hell you gonna do when it craps out at 2 in the morning and there is nobody to call. That would keep me up at night. It's so damn easy to get a bad rep because your system bit the dust and you're down for hours or days and every one of your customers tells the world and everyone on WHT about how your cloud is having a rainy day. :D

eming
09-12-2010, 05:09 PM
funny thing with storage - it's like it hasn't evolved together with all the other hosting-related hardware. It's like Moore's Law just doesn't apply for storage solutions. Sure, we got bigger drives, but size is not the problem - it's all about the IO/$ ratio. And from that perspective not much has happened in the last few years.
Someone needs to crack that nut.
:) D

ewitte
09-12-2010, 05:48 PM
what happens if your box fails? like power outage to it, motherboard failure, cpu failure, ram failure - etc etc. Isn't your whole san then down because openfiler won't do network raid?
That's the whole point of backups or even syncing to another iscsi device.

colomondo
09-14-2010, 01:51 AM
I see a lot of mention of vps.net in this thread? Are they using SSD, SAS, or SATA drives in their RAID10 deployment?

colomondo
09-14-2010, 01:54 AM
onapp.com has also been down for the past hour...not quite building my confidence in their software

eming
09-14-2010, 04:37 AM
onapp.com has also been down for the past hour...not quite building my confidence in their software
ya, a DNS change took a while to update. Should be ok now though.
I see a lot of mention of vps.net in this thread? Are they using SSD, SAS, or SATA drives in their RAID10 deployment?
all of the above actually, and then some.
:) D

RyanD
09-14-2010, 04:44 AM
Opensource may be inexpensive, but what the hell you gonna do when it craps out at 2 in the morning and there is nobody to call. That would keep me up at night. It's so damn easy to get a bad rep because your system bit the dust and you're down for hours or days and every one of your customers tells the world and everyone on WHT about how your cloud is having a rainy day. :D
That is exactly why, although tested heavily, immature solutions for distributed, replicated, clustered file systems like Gluster, Ceph, etc are not ready for production use in our space. Gluster is trying hard with their new vmware integration and working to put commercial support behind the product but I don't think we're quite there yet.
Without serious changes in the media (ie going all ssd) we're not going to see much further improvement in IO/$ out of traditional spinning media.
Where we'll see the biggest gains is in the distribution and replication of the data at the file system level as clustered filesystems become more open and available.

dazmanultra
09-14-2010, 04:46 AM
I see a lot of mention of vps.net in this thread? Are they using SSD, SAS, or SATA drives in their RAID10 deployment?
Despite citing it as a case study for how good OnApp is, they won't tell you. ;p

Caroline_9429
09-14-2010, 04:51 AM
Despite citing it as a case study for how good OnApp is, they won't tell you. ;p
Did you see this post in response to colomondo -
I see a lot of mention of vps.net in this thread? Are they using SSD, SAS, or SATA drives in their RAID10 deployment?
all of the above actually, and then some. :)

dazmanultra
09-14-2010, 06:53 AM
VPS.NET has more than 100 SANs live now, and it did cause them a lot of problems initially. They seem to have nailed it now though, and the solution is performing excellently now. I can't give you details though, sorry
Quoted from Ditlev.

Caroline_9429
09-14-2010, 06:58 AM
Quoted from Ditlev.
Ah I see that, crossed wires.

sailor
09-14-2010, 10:03 AM
That's the whole point of backups or even syncing to another iscsi device.
A san is not supposed to go down. Users will not be happy if they have to wait for another unit to come up or restore from backups when there is a failure in a single node mickey mouse san setup.

RyanD
09-14-2010, 10:37 AM
A san is not supposed to go down. Users will not be happy if they have to wait for another unit to come up or restore from backups when there is a failure in a single node mickey mouse san setup.
Any san regardless of your level of replication can and will go down. Regardless of the level of redundancy you think you have, backups are still a best practice that shouldn't be ignored.

eming
09-14-2010, 10:51 AM
Any san regardless of your level of replication can and will go down.
Regardless of the level of redundancy you think you have, backups are still a best practice that shouldn't be ignored.
having two SANs in active-active state is a good start though :)
ya, you still need backups obviously, but you'll go through a disaster without the stress of restoring live servers.
:) D

ewitte
09-15-2010, 10:04 AM
Notice openfiler does replicate to another unit. It can be another one with cheaper drives. One question: has anyone had any experience with Infiniband? It looks like it can be much cheaper than 10GbE and obviously better performance than 4*4Gbe.
As for backups I'm sure being down for a bit is less of a hassle than complete loss of data!

CloudWeb
09-15-2010, 12:19 PM
Notice openfiler does replicate to another unit. It can be another one with cheaper drives. One question: has anyone had any experience with Infiniband? It looks like it can be much cheaper than 10GbE and obviously better performance than 4*4Gbe. As for backups I'm sure being down for a bit is less of a hassle than complete loss of data!
We have quite a few pairs of Openfiler boxes (not in our Clouds) set up with HA+DRBD+RSYNC. To say the least, it's less than optimal and not a reliable way to achieve high availability and redundancy. It works alright for NAS type devices for backups, or low level non-critical storage, but I'd never use this in a Cloud as primary storage. If a provider were to do this they should carefully consider the clientele on this Cloud, because if any mission critical applications are running with a high valued client it could put the provider in a very bad place when things start to go wrong.

ewitte
09-15-2010, 12:50 PM
Thanks, I pushed back my project for 6-8 months (with a much bigger budget) so plenty of time to read up and bother OnApp tech support. I'm really focused on reading about infiniband right now. Seems you can get 10Gb fairly cheaply. Think I'm going to keep my test box so I have 6-8 months of messing around.
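
A rough sanity check on the link speeds mentioned in the thread (4 x 1 Gbit MPIO links, 10GbE, and the "100k iops off 4 MPIO links" figure) can be sketched in a few lines of Python. This is only back-of-envelope arithmetic under stated assumptions: it uses raw line rate and ignores Ethernet, TCP and iSCSI overhead, and the 4 KiB I/O size is an assumption, since nobody in the thread states a block size. The function names are made up for illustration.

```python
# Back-of-envelope link arithmetic. Raw line rates only; real iSCSI
# throughput will be lower because of protocol overhead.

def link_mb_s(gbit_per_s, links=1):
    """Raw capacity of `links` bonded/multipathed links, in decimal MB/s."""
    return gbit_per_s * links * 1000 / 8

def iops_to_mb_s(iops, block_bytes=4096):
    """Bandwidth needed for `iops` operations of `block_bytes` each.

    The 4096-byte default is an assumption, not a figure from the thread.
    """
    return iops * block_bytes / 1_000_000

four_gige = link_mb_s(1, links=4)   # four 1 Gbit links via MPIO
ten_gige = link_mb_s(10)            # a single 10GbE link
ssd_load = iops_to_mb_s(100_000)    # 100k IOPS at the assumed 4 KiB

print(f"4 x 1GbE : {four_gige:7.1f} MB/s raw")
print(f"1 x 10GbE: {ten_gige:7.1f} MB/s raw")
print(f"100k IOPS: {ssd_load:7.1f} MB/s at 4 KiB per I/O")
```

Under those assumptions 100k small I/Os per second needs roughly 410 MB/s, which fits inside the 500 MB/s raw ceiling of four gigabit links but leaves little headroom once overhead is counted. Larger I/O sizes would change the picture quickly.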
kris1351
09-16-2010, 11:39 AM
Any san regardless of your level of replication can and will go down. Regardless of the level of redundancy you think you have, backups are still a best practice that shouldn't be ignored.
You are speaking the truth there, I have been on many a site where we were called in to fix huge EMC arrays. Usually ending up with tape restores taking days to put back online. There isn't any solution that is 100% bullet proof.

ewitte
09-16-2010, 04:30 PM
That's probably why most people only guarantee 99.9% uptime.

sailor
09-16-2010, 08:59 PM
Any san regardless of your level of replication can and will go down. Regardless of the level of redundancy you think you have, backups are still a best practice that shouldn't be ignored.
Respectfully ryan - you missed the whole point I was making. I agree that you need backups and they should be offsite. This will protect the data itself. I would not necessarily agree that a san WILL go down. It can go down, but my point was that if you construct it with a single node setup - your likelihood that it will go down dramatically increases. Taking the approach from the beginning that you will build it on the cheap low end and if they lose their data - then just restore, is not a good practice imho.

Stratogen
09-27-2010, 05:16 AM
I'd only ever use RAID10 for this kind of setup. You shouldn't even think of RAID5 if you want to keep customers. RAID50 is the middle ground if you are looking for additional space.
As has been mentioned, the SAN should never go down - it's what your whole cloud platform relies upon so invest in the best you can afford and run it with RAID10.

wartungsfenster
09-27-2010, 06:34 AM
I'd only ever use RAID10 for this kind of setup. You shouldn't even think of RAID5 if you want to keep customers. RAID50 is the middle ground if you are looking for additional space.
As has been mentioned, the SAN should never go down - it's what your whole cloud platform relies upon so invest in the best you can afford and run it with RAID10.
I agree, two years ago we settled on a used EMC CX700. Huge mirrored, battery backed cache, Raid10 over quite a few drives and it didn't give us performance troubles ever.
On the other hand, anyone who can look at EMC should walk right on to Hitachi AMS. Far more robust design: no active/passive "design", no FC-AL loop backend, instead they use many sas links. These beasts also managed to give usable throughput and IOPS for 4-disk raid5's.
Citrix XenServer used to mess up a lot on A/P arrays. I stay away from both if I can.

KDAWebServices
09-29-2010, 09:05 AM
Cloudweb uses applogic to power his cloud, which offers high availability and apparently better performance than a traditional San via its "IP based San" as the storage is local to each node. OnApp however was built specifically for hosting and feels like a much better fit for the industry (and the success of VPS.net is a testament to this). Applogic (IMHO) was built more for enterprise... but I am only on week 1 of my Applogic demo. So we'll see how it goes.
I wouldn't say it had better performance - inside of VMs we were seeing poorer performance than with systems with a simple RAID-1 setup, about 60MB/s disk. Although that was 2.4, things may have improved since then - but we've had problems with that way of taking local storage and turning it into an IP SAN, at one point it reverted a volume back to a copy from 6 months previous due to the mirror being split and us not being alerted.
We've tested:
- SANmelody - v. v. v. good, but you do pay for it.
- Nexenta - seems very hit and miss if it works for you out of the box, we got v. v. v. poor speeds out of the box on the same hardware
- EQL - again v. good but you're limited in your interface speeds (although there are single port 10GE models now, not ideal though).
- Open-E, just about to give it another whirl on our newer test hardware. Never been hugely impressed with it in the past though.
- StarWinds - where to begin with them? Compare themselves to SANmelody when they are nothing alike.
Cheers,

lostmind
10-01-2010, 12:08 AM
I wouldn't say it had better performance - inside of VMs we were seeing poorer performance than with systems with a simple RAID-1 setup, about 60MB/s disk. Although that was 2.4, things may have improved since then - but we've had problems with that way of taking local storage and turning it into an IP SAN, at one point it reverted a volume back to a copy from 6 months previous due to the mirror being split and us not being alerted.
We were seeing similar things. Waiting for a response back from CA/3tera support on this exact issue.
4 x sata drives per node, 4 nodes, and we see performance hanging around 50mb/s. Try benchmarking with multiple "vps's" set up and performance drops terribly across the board. That's with forcing the storage to be on the same node as the "vps". If storage is on another node, cut performance in half. oi.
Honestly, we see better performance out of our openfiler box (connected via iscsi) that we use for backups of backups and non critical stuff. And our Dell EQ is leaps and bounds better.
So, I'm really confused about Applogic's claim of better performance vs network attached storage. Withholding my judgement until their techs get back to us though. Perhaps we've overlooked the caching button somewhere in this counter intuitive control panel.

FHDave
10-01-2010, 12:26 AM
- EQL - again v. good but you're limited in your interface speeds (although there are single port 10GE models now, not ideal though).
How is EQL more limited in your interface speed compared to other iSCSI SANs? The EQL backend can linearly scale up to 36 Gbps on the PS5xxx series, up to 48 Gbps on the PS6xxx series, and up to 120 Gbps on the PS6x10 series, assuming 12 arrays per group.
BTW, PS6x10 has two 10 GE ports per controller.
How much is SAN Melody, if you don't mind sharing? How scalable is it?

CloudWeb
10-01-2010, 12:47 AM
We were seeing similar things. Waiting for a response back from CA/3tera support on this exact issue. 4 x SATA drives per node, 4 nodes, and we see performance hanging around 50MB/s. Try benchmarking with multiple "vps's" set up and performance drops terribly across the board. That's with forcing the storage to be on the same node as the "vps". If storage is on another node, cut performance in half. oi.
Honestly, we see better performance out of our openfiler box (connected via iSCSI) that we use for backups of backups and non-critical stuff. And our Dell EQ is leaps and bounds better. So, I'm really confused about Applogic's claim of better performance vs network attached storage. Withholding my judgement until their techs get back to us though. Perhaps we've overlooked the caching button somewhere in this counter-intuitive control panel.
Sorry to hear of your trouble. We're getting very good IO with AppLogic. Here's the result of a standard hdparm test from a SATA drive:
# hdparm -tT /dev/sda1
/dev/sda1:
Timing cached reads: 23432 MB in 1.99 seconds = 11755.30 MB/sec
Timing buffered disk reads: 308 MB in 3.01 seconds = 102.40 MB/sec
I've been seeing these kinds of times since 2.8.5.

FHDave
10-01-2010, 01:02 AM
In my experience, hdparm doesn't really tell you much about a real working environment. Try something different, e.g. doing a linear copy of an 8 GB file. You can do this using "dd" with varying block sizes (e.g. 64 KB, 16 KB, etc).

CloudWeb
10-01-2010, 01:30 AM
In my experience, hdparm doesn't really tell you much about a real working environment. Try something different, e.g. doing a linear copy of an 8 GB file. You can do this using "dd" with varying block sizes (e.g. 64 KB, 16 KB, etc).
Yeah I know what you mean.
I made a 16GB test file and did the following:
# time dd if=ddfile of=/dev/null bs=8k
2000000+0 records in
2000000+0 records out
16384000000 bytes (16 GB) copied, 150.444 seconds, 109 MB/s
real 2m30.477s
user 0m0.230s
sys 0m2.490s

lostmind
10-01-2010, 02:32 AM
How is EQL more limited in its interface speed compared to other iSCSI SANs? The EQL backend can scale linearly up to 36 Gbps on the PS5xxx series, up to 48 Gbps on the PS6xxx series, and up to 120 Gbps on the PS6x10 series, assuming 12 arrays per group. BTW, PS6x10 has two 10 GE ports per controller.
How much is SAN Melody, if you don't mind sharing? How scalable is it?
I think he was referring to the fact that the EQ's only have 4 active GigE ports per unit. Sure you can add more units to the mix and get more throughput, but that is pretty damned pricey. :)
Sanmelody hides their pricing. Which usually means, if you have to ask then you can't afford it. I'm in the midst of trying to figure pricing out myself and will gladly share when I get anything firm. Right now, all the rep has told me is it is not cheap and he'll fill me in tomorrow.

KDAWebServices
10-01-2010, 02:36 AM
We were running with 8 drives in a RAID-10 (not an officially supported out-of-the-box config) on Adaptec and Areca cards and the best we ever saw was 60MB/s :(. In the end we decided it didn't quite fit with how we needed to work with our customers - very nice ode, but not what our customers in the end needed.

lostmind
10-01-2010, 02:39 AM
Yeah I know what you mean. I made a 16GB test file and did the following:
# time dd if=ddfile of=/dev/null bs=8k
2000000+0 records in
2000000+0 records out
16384000000 bytes (16 GB) copied, 150.444 seconds, 109 MB/s
real 2m30.477s
user 0m0.230s
sys 0m2.490s
Honestly, 109MB/s isn't impressive but it is definitely "good enough" * edit, didn't mean to sound rude, my apologies if it seems that way *. Sadly, that is MUCH better than we are seeing.
Of course, since CA bought 3tera, we've had a rough time getting support. Our tests are being run across 4 nodes, each with an X3440, 16GB RAM and 4 x 500GB WD RE3. All set up according to their recommendations.
Switching doesn't make much difference; we get very similar results with our 3560G, Dell 6224 and our ProCurves (which is all the test switching gear we have in the office). Setting jumbo frames on the switch and the nodes made little difference either, so I suspect there is something else we may be overlooking. We're going to try a few more tricks tomorrow but I am still hoping to hear back from CA/3tera directly and maybe they'll have something for us.

KDAWebServices
10-01-2010, 02:42 AM
I'd be more interested in seeing Bonnie++ results with -n 512 :) Our nodes are (we do still have it running) dual quad Barcelona Opterons with 32GB RAM, 8x500/640GB in RAID-10.

lostmind
10-01-2010, 02:49 AM
I'll post some bonnie scores tomorrow when I get to the office (if I remember).

FHDave
10-01-2010, 03:02 AM
I think he was referring to the fact that the EQ's only have 4 active GigE ports per unit.
I am not sure what's wrong with that? Most in-house iSCSI SANs would only have 1 or 2 Gbps. And compared to the other devices mentioned in the list, it seems to me EQL is the best one in terms of backend capacity, even in a single-shelf solution.
Sanmelody hides their pricing. Which usually means, if you have to ask then you can't afford it.
I am sure I can afford SAN Melody, having purchased $150K+ worth of Dell EQL boxes :) But even if you can afford it, you will still need to ask the pricing. You don't just sign the sales contract blindfolded :)
I'm in the midst of trying to figure pricing out myself and will gladly share when I get anything firm. Right now, all the rep has told me is it is not cheap and he'll fill me in tomorrow.
Do let me know. Thanks!

FHDave
10-01-2010, 03:09 AM
Yeah I know what you mean.
I made a 16GB test file and did the following:
# time dd if=ddfile of=/dev/null bs=8k
Try doing a write test, as it is usually worse than read. Can you remind me what you are using and the interface of your iSCSI?

KDAWebServices
10-01-2010, 04:23 AM
SANmelody - I only have UK pricing for somewhere, it's not a cheap solution, was just shy of £7k for 2TB full HA.
Our own SAN heads have 2 or 4 x 10GE in them, we've tried pretty much all the software worth testing.

CloudWeb
10-01-2010, 09:23 AM
It's not meant to be impressive, it's a SATA drive. I'm simply showing you that your problem is not an inherent architecture problem. Impressive is the benchmarks on our new 2x Westmere server Cloud w/ 7x Intel X25-E's (per server) :)

lostmind
10-01-2010, 10:52 AM
It's not meant to be impressive, it's a SATA drive. I'm simply showing you that your problem is not an inherent architecture problem. Impressive is the benchmarks on our new 2x Westmere server Cloud w/ 7x Intel X25-E's (per server) :)
I'd definitely be curious how that platform did. We use similar boxes for our shared hosting product, so I could compare against our numbers.

CloudWeb
10-01-2010, 10:54 AM
You use Westmere's w/ enterprise SSD for your shared hosting product? Really? wow how do you make any money. haha

lostmind
10-01-2010, 10:54 AM
I am not sure what's wrong with that? Most in-house iSCSI SANs would only have 1 or 2 Gbps. And compared to the other devices mentioned in the list, it seems to me EQL is the best one in terms of backend capacity, even in a single-shelf solution.
Because if I was to build a home-brewed SAN, I'd have much more connectivity than that. 10GigE or so. I dislike that the EQ controllers don't do active/active.
I am sure I can afford SAN Melody, having purchased $150K+ worth of Dell EQL boxes :) But even if you can afford it, you will still need to ask the pricing. You don't just sign the sales contract blindfolded :)
Sorry, I didn't mean you in particular. I'm much more budget oriented.
If the price isn't listed, typically it's because the vendor is trying to gouge each client for as much as they can. You get the used car salesman routine, etc etc. I hate that.
Do let me know. Thanks!
Will do.

lostmind
10-01-2010, 07:14 PM
You use Westmere's w/ enterprise SSD for your shared hosting product? Really? wow how do you make any money. haha
Gulftowns and enterprise ssd but yah. We don't charge $2.99/m for hosting?

lostmind
10-01-2010, 08:31 PM
So on my little applogic cloud I see:
-----
# time dd if=ddfile of=/dev/null bs=8k
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 65.9835 seconds, 62.1 MB/s
real 1m6.019s
user 0m0.000s
sys 0m0.020s
-----
Bonnie:
-----
Version 1.96        ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1       -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
apptest3      2128M   475  99 35913   2 17471   0  1007  99 47470   0 396.4   0
Latency               17032us    1205ms    1144ms   29110us     266ms     603ms
Version 1.96        ------Sequential Create------ --------Random Create--------
apptest3            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 31033  32 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency               33335us     475us     494us     219us       9us      30us
-----
To put that into perspective, a single 1tb black gives 100859 read, 77482 read and 332 seeks with the same bonnie run on a similar platform (x3440 & 8gb ram).
That said, Applogic guys are going to get on a conference call with us on Monday. They believe they have some input on how to make this setup run much faster. So I'll reserve my judgement till then.
Just so people know what hardware Applogic is running on, we have four of these for our test grid:
Supermicro x8sil-f
Intel Xeon Lynnfield X3440
16gb DDR-1066
4 x 500gb WD RE3 Sata drives

FHDave
10-02-2010, 09:38 AM
In my opinion, the performance is good already.
I don't think you can make it any (significantly) faster. In addition to seek time on the physical drive itself, you must also factor in latency on the network.

KDAWebServices
10-04-2010, 08:26 AM
Oh I don't know, we get better iSCSI results than that just doing a test against a single SATA drive, so there should be room for them to improve that. Be interested to see your results with -n 512 passed to bonnie though - shouldn't skip any results then.

lostmind
10-04-2010, 06:54 PM
As requested, with -n 512
-----
Version 1.96        ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1       -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
apptest3      2128M   476  99 45362   3 24754   0  1008  99 66510   0 409.9   0
Latency               17071us     898ms    1012ms   30242us   34981us     672ms
Version 1.96        ------Sequential Create------ --------Random Create--------
apptest3            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 41535  54 301576 99  3528   2 46603  62 343967 99  1942   0
-----
Just got off a con call with CA/3tera. They are going to work with us to improve these numbers. Will update it later. Going to deploy our Onapp test with our Dell EQ for now, see how it compares.

MikeTrike
10-04-2010, 07:16 PM
Also just a note, EMC = Garbage.
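For anyone lining up the bonnie++ block figures above against the dd numbers earlier in the thread, here is a minimal sketch of the unit conversion. It assumes bonnie++ reports K/sec in 1024-byte units and that dd's "MB/s" is decimal (10^6 bytes); the function name is made up for illustration and is not part of either tool.

```python
def kib_per_sec_to_mb_per_sec(kib_per_sec: float) -> float:
    """Convert a rate in KiB/s (1024-byte units) to decimal MB/s (10**6 bytes)."""
    return kib_per_sec * 1024 / 1_000_000


# Block write and block read figures taken from the -n 512 run above.
block_write_mb_s = kib_per_sec_to_mb_per_sec(45362)
block_read_mb_s = kib_per_sec_to_mb_per_sec(66510)

print(f"block write: {block_write_mb_s:.1f} MB/s")
print(f"block read:  {block_read_mb_s:.1f} MB/s")
```

Under those assumptions the block read figure lands in the same general range as the 62.1 MB/s dd read lostmind posted, so the two tools roughly agree with each other.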
RyanD
10-04-2010, 11:53 PM
As requested, with -n 512
-----
Version 1.96        ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1       -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
apptest3      2128M   476  99 45362   3 24754   0  1008  99 66510   0 409.9   0
Latency               17071us     898ms    1012ms   30242us   34981us     672ms
Version 1.96        ------Sequential Create------ --------Random Create--------
apptest3            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 41535  54 301576 99  3528   2 46603  62 343967 99  1942   0
-----
Just got off a con call with CA/3tera. They are going to work with us to improve these numbers. Will update it later. Going to deploy our Onapp test with our Dell EQ for now, see how it compares.
Unless 3tera has made some monstrous improvements it was nothing more than a poor performing implementation of drbd.

oplink
10-07-2010, 02:15 PM
i am pretty sure the dell EQ boxes run active/active with the latest firmware update.

lostmind
10-07-2010, 02:56 PM
i am pretty sure the dell EQ boxes run active/active with the latest firmware update.
Yup, but at over $40k for roughly 2tb usable space (in raid10 config) in that config... not many webhosting companies can afford that.

Uncorrupted-Michael
10-08-2010, 03:36 PM
Coraid performs amazingly well and is incredibly cost effective.

sailor
10-08-2010, 04:48 PM
Yup, but at over $40k for roughly 2tb usable space (in raid10 config) in that config... not many webhosting companies can afford that.
its not that much but find a provider that has the good stuff deployed in large quantities and you can get it for a monthly fee that will be a much better payback than doing a capex on a smaller scale for your needs. Going cheap on your san can have dire consequences.

JordanJ
10-08-2010, 05:46 PM
That is really cheap actually for horizontally scalable enterprise storage.
The NetApps we deploy cost SIGNIFICANTLY more than that. Truth is, you get what you pay for, but a lot of what you pay for can only be used by larger enterprises utilizing other linked products.
I am looking at a few boxes of memory storage that cost ~100k for 4TB with 250k IOPS. To give you an idea, that 3U box has the same IOPS as 1600+ SATA II disks. Nowadays you buy IOPS as well as GBs.

nj85
10-10-2010, 10:10 AM
What do you guys say about a Dell MD3000i as SAN storage for OnApp?

arisythila
10-10-2010, 08:46 PM
I believe Justin is trying to set up a conference call between you and me. We've overcome these issues and are achieving closer to 300-400MB/sec per SERVER on our Grids.
This was one of our major problems with Applogic. Since we've figured out how to use Applogic, this hasn't been an issue at all for us.
Thanks,
As requested, with -n 512
-----
Version 1.96        ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1       -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
apptest3      2128M   476  99 45362   3 24754   0  1008  99 66510   0 409.9   0
Latency               17071us     898ms    1012ms   30242us   34981us     672ms
Version 1.96        ------Sequential Create------ --------Random Create--------
apptest3            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                512 41535  54 301576 99  3528   2 46603  62 343967 99  1942   0
-----
Just got off a con call with CA/3tera. They are going to work with us to improve these numbers. Will update it later. Going to deploy our Onapp test with our Dell EQ for now, see how it compares.

brentpresley
10-11-2010, 10:13 PM
Unless 3tera has made some monstrous improvements it was nothing more than a poor performing implementation of drbd.
They have made huge strides in disk I/O performance. :D
We are only limited by the RAID or SAN implementation we deploy now.
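JordanJ's comparison above (a ~$100k, 4TB box with 250k IOPS being roughly equal to 1600+ SATA II disks) can be unpacked with some quick arithmetic. This is only a rough sketch: it treats "1600+" as exactly 1600 and "~100k" as exactly $100,000, and all names are invented for the example.

```python
import math

# Figures quoted in the thread; rounded values are assumptions.
BOX_IOPS = 250_000
BOX_PRICE_USD = 100_000
BOX_CAPACITY_TB = 4
SATA_DISK_COUNT = 1600


def implied_iops_per_disk(total_iops: int, disk_count: int) -> float:
    """IOPS each disk must deliver for disk_count disks to match total_iops."""
    return total_iops / disk_count


def disks_needed(target_iops: int, iops_per_disk: float) -> int:
    """Whole number of disks needed to reach target_iops."""
    return math.ceil(target_iops / iops_per_disk)


per_disk = implied_iops_per_disk(BOX_IOPS, SATA_DISK_COUNT)
usd_per_iops = BOX_PRICE_USD / BOX_IOPS
usd_per_tb = BOX_PRICE_USD / BOX_CAPACITY_TB

print(f"implied IOPS per SATA disk: {per_disk}")
print(f"cost per IOPS: ${usd_per_iops:.2f}")
print(f"cost per TB:   ${usd_per_tb:,.0f}")
```

The point of the split is the one JordanJ makes: such a box looks expensive per TB but cheap per IOPS, so which number matters depends on the workload.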
KDAWebServices
10-12-2010, 03:02 AM
Suppose I really should try the latest if they've got it all together now.

brentpresley
10-12-2010, 03:52 AM
I guess I should have qualified that statement above, lol.
Our disk I/O is limited by 2 things:
1) the RAID or SAN implementation.
2) the "private" network switch speed used to keep the servers mirrored.
If you are on an exceptionally slow switch, then you will see performance suffer. Most of our clouds "effectively" max out around 80MB/s due to the GbE switches used. As the clouds grow and backbone traffic increases exponentially, we expect to need an upgrade to faster switches.

arisythila
10-12-2010, 02:02 PM
When we hook 10GbE switches to our servers we actually see a full 200-300MB/sec. All depends on how you have your volumes set up tho. If the primary volume is on srv1, and you're running the application on srv1, you do not have to meet these requirements.
One way we sort of got around this was to constrain a given VM to a given server. One thing we've also managed to do is to write a script that will find your most heavily used volumes and spread them equally onto different servers.
We've used OnApp and VMware. I personally like having multiple SANs over one SAN just for that one reason of having more disk IO and not being able to really saturate a single backplane.
Choose your weapon pretty much, what works for you.
Thanks,

jayglate
10-12-2010, 02:34 PM
When we hook 10GbE switches to our servers we actually see a full 200-300MB/sec. All depends on how you have your volumes set up tho. If the primary volume is on srv1, and you're running the application on srv1, you do not have to meet these requirements. Thanks,
What type of SAN are you seeing 200 to 300MB/sec on?

arisythila
10-12-2010, 02:57 PM
Applogic's method of building a SAN array. It allows us to create a SAN using multiple machines, so if we have 10 machines, with 1TB of space in each, we have 10TB of space across the whole Grid.
We are able to pull 200-300MB/sec from it tho. Only with a 10GbE backplane (actually 40GbE Infiniband). What I like about Applogic's method is we're able to do 200-300MB/sec from EACH machine we add in there. So if we have 10 machines, we can essentially do 2000-3000MB/sec, depending on how the volumes are laid across the servers. If all of the volumes are running on 1 machine, the max we can get is 200-300MB/sec.
Example: if we have 10 volumes on srv1, the max we can get is 200-300MB/sec. If we have 1 volume on each of 10 servers, we can theoretically get 2000-3000MB/sec.
We're not dictated by a single backplane. We are only limited by the hardware we use. The tests we ran were with 3 1TB drives in RAID 0. (mind you Applogic mirrors your volumes across two physical hardware nodes so Dataloss doesn't happen if a server hard drive fails.) We find that this is the best method for Applogic to get the most bang for your buck.
We even have a fully scalable cPanel cluster, and a fully scalable Plesk cluster.
Thanks,

ewitte
10-14-2010, 05:38 PM
The more time I spend thinking about the OnApp SAN the more I see why these companies charge these crazy prices. Trying to design a high IOPS, redundant system for $6-8k is going to take my entire 6 months of planning.

KDAWebServices
10-15-2010, 11:32 AM
(mind you Applogic mirrors your volumes across two physical hardware nodes so Dataloss doesn't happen if a server hard drive fails.)
Not strictly true, it can happen - we've seen instances of split volumes, where the VPS was running from one half of the volume mirror but not the other half. Then when a volume check was carried out it reported an error, a repair was done, and instead of rebuilding the mirror from the previously mounted live half it rebuilt it from the unmounted half, resulting in data loss. Whilst it shouldn't happen, it can and does - you need to keep a very careful eye on it.

brentpresley
10-15-2010, 11:34 AM
Which version of AppLogic?
We saw similar errors on 2.7.8, but haven't had a problem with 2.8.9 or 2.9.3.

arisythila
10-15-2010, 11:45 AM
Hey Karl,
We haven't seen this issue since 2.7.8. This happens because sometimes the volume gets stuck. Sometimes the only way to fix it is to shut the volume down and start the repair. We haven't seen this issue for the last .... 6 months? We upgraded to 2.8.9 as soon as we could.
~Michael

sailor
10-15-2010, 12:27 PM
The more time I spend thinking about the OnApp SAN the more I see why these companies charge these crazy prices. Trying to design a high IOPS, redundant system for $6-8k is going to take my entire 6 months of planning.
yeah its not cheap. you can certainly get something like that but its not going to have much capacity scalability or management features. you could build something on gluster on white box hardware with 4 sata drives and a single raid controller on each box - might be able to go to 8 drives. you wont be able to manage all your boxes from one interface and you wont be able to add to the cluster - it will be a lot of little 2 box deployments iirc. thats about all I know that will get you going in that price range.

ewitte
10-15-2010, 12:32 PM
yeah its not cheap. you can certainly get something like that but its not going to have much capacity scalability or management features. you could build something on gluster on white box hardware with 4 sata drives and a single raid controller on each box - might be able to go to 8 drives. you wont be able to manage all your boxes from one interface and you wont be able to add to the cluster - it will be a lot of little 2 box deployments iirc. thats about all I know that will get you going in that price range.
Thought train is headed over to a 4 box NexentaStor cluster using the LSI SAS switch, 20 500GB Constellation drives and SSD caching. Thing with NexentaStor is more can be added fairly easily.
The entire project can scale to $15-20k but I need at least 3 5620 hypervisors as well as other hardware.

sailor
10-16-2010, 08:37 AM
Thought train is headed over to a 4 box NexentaStor cluster using the LSI SAS switch, 20 500GB Constellation drives and SSD caching. Thing with NexentaStor is more can be added fairly easily. The entire project can scale to $15-20k but I need at least 3 5620 hypervisors as well as other hardware.
A friend of mine is running that and we looked at it - they seem to be happy with it. its not cheap though.

jayglate
10-16-2010, 10:25 AM
Thought train is headed over to a 4 box NexentaStor cluster using the LSI SAS switch, 20 500GB Constellation drives and SSD caching. Thing with NexentaStor is more can be added fairly easily. The entire project can scale to $15-20k but I need at least 3 5620 hypervisors as well as other hardware.
For a small build out like that, you are better off renting space on someone else's SAN.

FHDave
10-16-2010, 10:38 AM
For a small build out like that, you are better off renting space on someone else's SAN.
10 TB of raw space is not really small, is it?

ewitte
10-16-2010, 11:43 AM
For a small build out like that, you are better off renting space on someone else's SAN.
If I wanted something slow I could build it for less than $1k ;) Especially considering I already have 3 450GB 15k SAS drives for testing.

chennaihomie
10-16-2010, 12:34 PM
If I wanted something slow I could build it for less than $1k ;) Especially considering I already have 3 450GB 15k SAS drives for testing.
If you are getting shared SAN, it doesn't have to be slow :) It's a good solution when you are running on a tight budget.

ewitte
10-16-2010, 01:07 PM
If you are getting shared SAN, it doesn't have to be slow :) It's a good solution when you are running on a tight budget.
How? Anyone willing to run Infiniband to my cabinet? I want a minimum of 1GB/s (800Mbit) and 10-100k iops on a single connection.
10Gbit ethernet is too expensive and having 10 or so gigabit ethernet connections per server is kinda silly and most likely wouldn't work the way intended anyway.

jayglate
10-16-2010, 09:09 PM
How? Anyone willing to run Infiniband to my cabinet? I want a minimum of 1GB/s (800Mbit) and 10-100k iops on a single connection. 10Gbit ethernet is too expensive and having 10 or so gigabit ethernet connections per server is kinda silly and most likely wouldn't work the way intended anyway.
10 to 100k IOPS is not so hard: lots of drives, yes, or Fusion-io. Infiniband, umm, maybe.. LOL. But 1GB/s (800Mbit) can easily be achieved via several bonded connections over NFS4 or iSCSI. If you can run NFS4 I would suggest it, as NFS4 is worlds better than NFS3, has some amazing local read caching, and is generally easier than iSCSI (nfs3 run from, very very fast).
Now if you are looking for 800Mbit to one single server, I think you need some real world examples to justify that. Some of the very largest and heaviest hit cloud providers, who have a lot more money than we do, never get anywhere near 800Mbit to a single host, even under a VERY VERY VERY heavy load on their hypervisors. 300 to 500Mbit is more reasonable to a hypervisor, but 800Mbit from the SAN is EASILY achievable with Infiniband.

sailor
10-16-2010, 09:37 PM
10 TB of raw space is not really small, is it?
its not small but its not large. its not big enough to justify their own clustered solution and the support overhead that goes with it. thats only 2.5 useable in a raid 100 format.

FHDave
10-16-2010, 10:14 PM
raid 100, 1 GB/s=800 Mbps. I think we all need some rest :)

lostmind
10-17-2010, 12:57 AM
its not that much but find a provider that has the good stuff deployed in large quantities and you can get it for a monthly fee that will be a much better payback than doing a capex on a smaller scale for your needs. Going cheap on your san can have dire consequences.
Hey Sailor, I'm not sure I follow your comment?
For an active-active pair of Dell EQ's, the lowest-end config (while still having redundant cards, etc) of 16 x 250GB SATA drives is right around $20k each. That's $40k for the pair? List price is closer to $60k. Over $60k with taxes. Maybe if we purchased more than one or two at once we could get a deeper discount, but I am not sure it would go down much more than it already has...
To switch my hardware costs from capex to opex, I could always go the leasing route, although that doesn't always make the most sense for my company. Thus I outright own quite a bit of hardware and only lease a small portion.
I've not found a colo company willing to rent me gear of any sort cheaper than going to the source myself. Even though large companies (colo providers for example) can likely negotiate better hardware pricing from their suppliers than smaller fish such as myself can due to their quantity, once you factor in the colo provider's profit margins those hardware savings evaporate. We also get the benefits of full control & flexibility.

lostmind
10-17-2010, 01:03 AM
That is really cheap actually for horizontally scalable enterprise storage. The NetApps we deploy cost SIGNIFICANTLY more than that. Truth is, you get what you pay for, but a lot of what you pay for can only be used by larger enterprises utilizing other linked products. I am looking at a few boxes of memory storage that cost ~100k for 4TB with 250k IOPS. To give you an idea, that 3U box has the same IOPS as 1600+ SATA II disks. Nowadays you buy IOPS as well as GBs.
Jordan, I'm curious what Phoenixnap needs ramsans and netapps for (aren't you strictly colo)? I'd love to hear your experience with them.
In my case, it's not that I feel the cost is too high for the feature set you get with an enterprise SAN, but that the cost is too high for general use cloud hosting platforms. Maybe if you have a high end niche market willing to pay significantly higher pricing for the end products...
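The EqualLogic numbers lostmind gives above (16 x 250GB drives per array, about $20k per array, "roughly 2tb usable" in RAID10 for the pair) can be checked with a small sketch. The assumptions here are mine, not lostmind's: the second array is treated as a full mirror of the first, so it adds redundancy but no usable space, and hot spares and filesystem overhead are ignored.

```python
def raid10_usable_gb(drive_count: int, drive_gb: float) -> float:
    """Usable capacity of a RAID10 set: half the raw space (every block is mirrored)."""
    if drive_count % 2 != 0:
        raise ValueError("RAID10 needs an even number of drives")
    return drive_count * drive_gb / 2


# Figures from the post above; treating the pair as one mirrored unit is an assumption.
DRIVES_PER_ARRAY = 16
DRIVE_GB = 250
PAIR_PRICE_USD = 40_000

usable_gb = raid10_usable_gb(DRIVES_PER_ARRAY, DRIVE_GB)
usd_per_usable_gb = PAIR_PRICE_USD / usable_gb

print(f"usable per array: {usable_gb:.0f} GB")
print(f"cost per usable GB for the pair: ${usd_per_usable_gb:.2f}")
```

If the two arrays were instead pooled rather than mirrored, the usable space would double and the per-GB figure would halve, so the result depends heavily on how the pair is deployed.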
lostmind
10-17-2010, 01:05 AM
I believe Justin is trying to set up a conference call between you and me. We've overcome these issues and are achieving closer to 300-400MB/sec per SERVER on our Grids. This was one of our major problems with Applogic. Since we've figured out how to use Applogic, this hasn't been an issue at all for us. Thanks,
Hey Mike,
Got your emails, appreciate the feedback. Will be in touch.

lostmind
10-17-2010, 01:11 AM
For a small build out like that, you are better off renting space on someone else's SAN.
When renting space on someone's SAN and you run into performance issues, you generally are told "so sorry, you're renting space on a shared SAN, what do you expect?" Sometimes, building your own is the only way to go.

jayglate
10-17-2010, 01:18 AM
When renting space on someone's SAN and you run into performance issues, you generally are told "so sorry, you're renting space on a shared SAN, what do you expect?" Sometimes, building your own is the only way to go.
not if you clearly outline how much storage and what type of performance you are looking to achieve.

sailor
10-17-2010, 02:48 PM
I am talking about if you rent it from a provider on their san. they are going to have a much lower cost of operations at scale. Yes you will get more control which can be a benefit - but then yes you will get more control which can be a drawback, and you will have to have guys to support it who are not cheap either. You cant do everything unless you don't want a life outside of work and want to be on call 24x7 with calls actually coming in. There are pluses and minuses to everything. Every minute you spend on something you decide to insource to save 1$ is a minute you don't get back to focus on your core value proposition which might be earning you 5$. that is an unwise investment which all too many people make because they don't do a full analysis of the soft or hidden expense.
Hey Sailor, I'm not sure I follow your comment?
For an active-active pair of Dell EQ's, the lowest-end config (while still having redundant cards, etc) of 16 x 250GB SATA drives is right around $20k each. That's $40k for the pair? List price is closer to $60k. Over $60k with taxes. Maybe if we purchased more than one or two at once we could get a deeper discount, but I am not sure it would go down much more than it already has... To switch my hardware costs from capex to opex, I could always go the leasing route, although that doesn't always make the most sense for my company. Thus I outright own quite a bit of hardware and only lease a small portion. I've not found a colo company willing to rent me gear of any sort cheaper than going to the source myself. Even though large companies (colo providers for example) can likely negotiate better hardware pricing from their suppliers than smaller fish such as myself can due to their quantity, once you factor in the colo provider's profit margins those hardware savings evaporate. We also get the benefits of full control & flexibility.

FHDave
10-17-2010, 03:20 PM
not if you clearly outline how much storage and what type of performance you are looking to achieve.
What utility do you use to guarantee that somebody can't exceed their assigned IOPS or performance (whatever metric that is)?

jayglate
10-17-2010, 05:54 PM
What utility do you use to guarantee that somebody can't exceed their assigned IOPS or performance (whatever metric that is)?
I wouldn't call it a utility but we can build a dedicated allocation within a shared SAN to meet a customer's guidelines and needs.

eming
10-17-2010, 07:27 PM
I actually agree with Jay and Jeff here (and to get back to the OP's Q: "SAN for OnApp"), that it would make sense for a large portion of the hosts on WHT to go with a shared SAN solution for their OnApp setup. Right now margins are very high on cloud hosting, and there is plenty of room for the added long-term cost of a leased SAN.
The cloud WILL commoditize in the next 12-18 months, and if you do not have a footprint by then, you may have a hard time getting one. So, building your own cloud infrastructure and cloud software platform might just be bad business as you would lose out on the pre-price-erosion-cloud-era - it is a gold rush right now, make sure you get started before it is too late. And going with an existing infrastructure will make it a whole lot easier for you. :)
D

JordanJ
11-05-2010, 02:27 PM
Jordan, I'm curious what Phoenixnap needs ramsans and netapps for (aren't you strictly colo)? I'd love to hear your experience with them. In my case, it's not that I feel the cost is too high for the feature set you get with an enterprise SAN, but that the cost is too high for general use cloud hosting platforms. Maybe if you have a high end niche market willing to pay significantly higher pricing for the end products...
We are working on some product sets to allow enterprises with existing NetApps to utilize the SnapMirror functionality and back up to a secure facility without having to buy an additional NetApp. Also, small enterprises needing centralized storage under 5TB can realize huge cost savings by renting the storage via a datacenter rather than purchasing a full NetApp with only one shelf.
One of our customers who I consult for is also using it for an OnApp deployment.