  1. #1
    Join Date
    Jul 2005
    Location
    Australia
    Posts
    117

    ARP limits on the MLX chassis

    There's been a number of threads talking about the MLX router chassis and the fact that the published specs on Brocade gear leave a lot to be desired.

    I've got a first-hand report of 3% CPU at 6k ARP entries on the MLX, but we also know from the FESX switches that it doesn't scale linearly and that there is an effective ARP "barrier" at some point, beyond which the MLX is likely to just die in a fire.

    I'm just wondering if there is anyone currently using an MLX in their network, and what level of CPU usage they're seeing at what ARP table size, so that we can get some idea of a realistic and reliable maximum.
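    Purely as a back-of-the-envelope aid, the single data point above (3% CPU at 6k ARP entries) can be extrapolated under an assumed scaling model. This is a sketch, not a measurement: the exponent is a guess, and the whole point of the thread is that real devices stop scaling linearly at some point.

```python
# Illustrative only: extrapolate CPU usage from the single data point above
# (3% CPU at 6k ARP entries) under an assumed power-law scaling model.
# The exponent is a guess -- real devices hit a nonlinear "barrier".

def projected_cpu(arp_entries, base_arp=6000, base_cpu=3.0, exponent=1.0):
    """Project CPU% at a given ARP table size; exponent=1.0 is linear."""
    return base_cpu * (arp_entries / base_arp) ** exponent

for n in (6000, 10000, 20000):
    linear = projected_cpu(n)                   # optimistic, linear scaling
    quadratic = projected_cpu(n, exponent=2.0)  # pessimistic guess
    print(f"{n:>6} ARP: {linear:.1f}% (linear) .. {quadratic:.1f}% (quadratic)")
```

    If anyone has a second real data point (CPU% at a known table size), fitting the exponent to it would make this far less hand-wavy.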
    IOFLOOD.com -- We Love Servers
    High ram servers with lots of IPs back in stock.
    Email: sales [at] ioflood.com

  2. #2
    Join Date
    Aug 2006
    Location
    Ashburn VA, San Diego CA
    Posts
    4,571
    Might want to post this on f-nsp.
    Fast Serv Networks, LLC | AS29889 | Fully Managed Cloud, Streaming, Dedicated Servers, Colo by-the-U
    Since 2003 - Ashburn VA + San Diego CA Datacenters

  3. #3
    Join Date
    Dec 2009
    Posts
    2,130

    Re: ARP limits on the MLX chassis

    Just curious.. why don't you architect your network in a way that eliminates these giant arp tables at the "core?"
    Redundant.com High Availability and High Performance Solutions
    Dedicated Servers and VMware Cloud Hosting in Dallas and Los Angeles
    Diverse connections to the Internap Performance Optimized Network

  4. #4
    Join Date
    Jan 2003
    Location
    Chicago, IL
    Posts
    6,889
    Quote Originally Posted by Ionity View Post
    Just curious.. why don't you architect your network in a way that eliminates these giant arp tables at the "core?"
    6k is giant?

    And in a typical network configuration, MLXs are often used at the distribution layer, and the distribution layer is where you'd expect the largest ARP tables to be. If these were being used as a "core" router, then I'd agree, it should be all layer 3 by that point.
    Karl Zimmerman - Steadfast: Managed Dedicated Servers and Premium Colocation
    karl @ steadfast.net - Sales/Support: 312-602-2689
    Cloud Hosting, Managed Dedicated Servers, Chicago Colocation, and New Jersey Colocation
    Now Open in New Jersey! - Contact us for New Jersey colocation or dedicated servers

  5. #5
    Join Date
    Jan 2010
    Posts
    308
    Quote Originally Posted by Ionity View Post
    Just curious.. why don't you architect your network in a way that eliminates these giant arp tables at the "core?"
    Since when is 6k ARP entries considered giant?

  6. #6
    Join Date
    Dec 2009
    Posts
    2,130
    Quote Originally Posted by KarlZimmer View Post
    6k is giant?

    And in a typical network configuration, MLXs are often used at the distribution layer, and the distribution layer is where you'd expect the largest ARP tables to be. If these were being used as a "core" router, then I'd agree, it should be all layer 3 by that point.
    No, 6k isn't giant. But trying to find the upper limits and pushing up towards them turns into the wrong network architecture.

  7. #7
    Join Date
    Mar 2003
    Location
    New Jersey
    Posts
    1,277

  8. #8
    Quote Originally Posted by amc-james View Post
    Code:
    System Parameters      Default    Maximum    Current    Actual     Bootup     Revertible
    ip-arp                 8192       65536      8192       8192       8192       No
    The maximum is 65536 on this MLX8 with NI-MLX-MR management module
    The maximum is also 65k on the FESX448, but it falls over dead from cpu exhaustion around 4500. We're setting up a lab to hammer IP usage on the device to see how it holds up to a 20k+ arp table, which is what we're going to need for this device to make sense anywhere in our network -- core or distribution level.
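    For anyone setting up a similar lab: one way to script it is to generate the host entries first and then feed them to a packet generator (scapy, a hardware tester, etc. -- the actual ARP flooding isn't shown here). A sketch using only the Python standard library; the address ranges are made up:

```python
# Sketch: build the address plan for an ARP-scale lab test.
# Generates (ip, mac) pairs covering 20k+ hosts carved out of /26 blocks,
# suitable for feeding a packet generator (scapy, hardware tester, etc.).
import ipaddress

def lab_hosts(base_net="10.0.0.0/16", prefix=26, count=20000):
    """Yield (ip, mac) pairs: hosts carved from /26 subnets of base_net."""
    made = 0
    for subnet in ipaddress.ip_network(base_net).subnets(new_prefix=prefix):
        for ip in subnet.hosts():
            if made >= count:
                return
            # Locally-administered MAC derived from the IP's 32-bit value,
            # so every host gets a unique, predictable address.
            mac = "02:00:" + ":".join(f"{b:02x}" for b in ip.packed)
            yield str(ip), mac
            made += 1

hosts = list(lab_hosts())
print(len(hosts), hosts[0])
```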
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  9. #9
    Join Date
    Aug 2004
    Location
    Dallas, TX
    Posts
    3,507
    Quote Originally Posted by funkywizard View Post
    The maximum is also 65k on the FESX448, but it falls over dead from cpu exhaustion around 4500. We're setting up a lab to hammer IP usage on the device to see how it holds up to a 20k+ arp table, which is what we're going to need for this device to make sense anywhere in our network -- core or distribution level.
    I have 14 ARP entries on my distribution switches (uplinks to each other and uplinks to routers). I believe we work with a few more IPs than you. As Alec pointed out, I also suspect you're doing something very wrong with your network design. Customer VLANs can be established on the edge switches and bridged to other devices if needed for portability etc.
    Dallas Colocation by Incero, 8 years and counting!
    e: sales(at)incero(dot)com 855.217.COLO (2656)
    Colocation & Enterprise Servers, SATA/SAS/SSD, secure IPMI/KVM remote control, 100% U.S.A. Based Staff
    SSAE 16, SAS70, Redundant Power & Network, Fully Diverse Fiber

  10. #10
    Quote Originally Posted by gordonrp View Post
    Customer VLANs can be established on the edge switches and bridged to other devices if needed for portability etc.
    The problem being that if you want portability, and you have the gateway live on the edge switch, then you have a lot of network traffic ping-pong on the network.

    So you have two servers, and they're for VPS hosting providers, so they want to share IPs, say, two /26s. You put the gateway on the top-of-rack switch connected to one server; the other server is on a different top-of-rack switch, so any time that other server has to reach its gateway, it has to go:

    Server2 -> TOR2 -> distribution1 -> core -> distribution2 -> TOR1 -> distribution2 -> core -> internet

    That's not very efficient, and the number of single points of failure there is insane unless you want to trust spanning tree protocol for layer-2 failover. And if you get a DDoS to server 2, that traffic ping-pongs between two separate top-of-rack switches and two distribution switches, doubling the traffic along the way AND affecting twice as much of your network.

    Obviously it's preferable to do your routing at the distribution or core layer IF you need portability, but it's obviously undesirable to put a lot of load onto your distribution or core if you can help it. At this point, you're right, your network is a lot larger than ours, so solutions that will work for us will not necessarily work for you. The question becomes: will the MLX support a 20k ARP table (meaning we can get by with all the gateways living on what is currently our core device and will eventually be our distribution device), or is the maximum reasonable ARP table for an MLX more in the 10k range, in which case we have no choice but to have our gateways "live" on the top-of-rack switches?

    We're going to do some benchmarking to see how the device holds up with a large arp table before we put it into production, but of course there's no substitute for real world experience if anyone has any to share.

  11. #11
    Join Date
    Aug 2004
    Location
    Dallas, TX
    Posts
    3,507
    Interconnect switches as well as going to distribution using OSPF or iBGP. No ping-pong needed. You then also end up with a more redundant network: if the optic from switch1 to distribution/core fails (pretending you're not doing LAGs), you still have another route from switch1 via switch2 to distribution/core. Regardless, I doubt all your customers need VLAN/IP bridging. For us it has to be less than 5% of customers. Get your VLANs off your core = sleep more
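    For reference, the routed-uplink idea above looks roughly like this in Brocade/Foundry-style syntax. This is a sketch only: the port numbers and addresses are invented, and the exact commands should be checked against the platform's configuration guide.

```
! Sketch only -- illustrative Brocade/Foundry-style routed uplinks
! from a TOR switch to two distribution devices (ports are made up).
router ospf
 area 0
!
interface ethernet 1/49
 ip address 10.255.0.1/31
 ip ospf area 0
!
interface ethernet 1/50
 ip address 10.255.0.3/31
 ip ospf area 0
```

    With both uplinks as point-to-point L3 links, OSPF reroutes around a failed optic without any spanning tree involvement.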
    Last edited by ServiceProvider; 10-23-2013 at 08:37 PM.

  12. #12
    Quote Originally Posted by gordonrp View Post
    Interconnect switches as well as going to distribution using OSPF or iBGP. No ping-pong needed. You then also end up with a more redundant network: if the optic from switch1 to distribution/core fails (pretending you're not doing LAGs), you still have another route from switch1 via switch2 to distribution/core. Regardless, I doubt all your customers need VLAN/IP bridging. For us it has to be less than 5% of customers. Get your VLANs off your core = sleep more
    Each server needs to be able to talk to the default gateway for the IP block it's trying to use. It needs to talk to that default gateway at layer 2, sending traffic to that gateway's MAC address. If you want customers on two different switches to be able to share the same IP block, you need a layer-2 path between the two switches, because the gateway lives on one switch or the other, not both, but the IP use is on both. This means you can't take advantage of layer-3 redundancy protocols like the OSPF/iBGP you've suggested.

    You cannot have more than one layer-2 path between two devices or you will have all manner of issues (broadcast storms, MAC addresses appearing on two ports of the same switch, so you have MAC table flip-flopping and brownouts; the list goes on). So no, you can't just cross-connect the switches all to each other *and* to the distribution layer and expect it all to work and avoid ping-ponging the traffic.

    If you want IP portability between servers on different top of rack switches, you either need to put up with traffic ping pong, or you have to put your gateways / VE's at the distribution or core layer. You really can't have it both ways.

    edit: as to IP bridging / sharing, we target VPS hosting providers, so a higher percentage of our customers need this. More important than the percentage of customers who need this is the percentage of IP addresses that need this, since the VPS hosts use dramatically more IPs than any other kind of customer. If I took all the customers with only one server and put their VEs on their top of rack, I'd be lucky if that accounted for a third of our IP address usage.

  13. #13
    Join Date
    Oct 2002
    Location
    Vancouver, B.C.
    Posts
    2,656
    Either
    1) Just connect servers that need to share subnets to the same TOR switch. If you can't have the servers in the same rack, just run a cable over from another cabinet. At the very least, put them on TOR switches going to the same distribution switch.
    or
    2) Don't allow sharing subnets across multiple VPS hosting servers.
    or
    3) Spend the $$$ on a network architecture that can handle traffic going back and forth, and that doesn't have such single points of failure. Efficiency isn't as much of a concern if you have massive capacity well beyond what you run at.

    Personally, I'd advocate running layer 3 at the access layer when you can, and at the distribution layer in the cases where you have to deal with sharing subnets and such. Having all your layer 3 at the core is not very scalable, as you've already discovered. There's a reason why most new switches are layer 3 capable.
    ASTUTE HOSTING: Advanced, customized, and scalable solutions with AS54527 Premium Canadian Optimized Network (Level3, PEER1, Shaw, Tinet)
    MicroServers.io: Enterprise Dedicated Hardware with IPMI at VPS-like Prices using AS63213 Affordable Bandwidth (Cogent, HE, Tinet)
    Dedicated Hosting, Colo, Bandwidth, and Fiber out of Vancouver, Seattle, LA, Toronto, NYC, and Miami

  14. #14
    Join Date
    Oct 2002
    Location
    Vancouver, B.C.
    Posts
    2,656
    Quote Originally Posted by funkywizard View Post
    You cannot have more than one layer-2 path between two devices or you will have all manner of issues (broadcast storms, MAC addresses appearing on two ports of the same switch, so you have MAC table flip-flopping and brownouts; the list goes on). So no, you can't just cross-connect the switches all to each other *and* to the distribution layer and expect it all to work and avoid ping-ponging the traffic.
    There's something called spanning tree, which actually works quite well when you do it right.

  15. #15
    Quote Originally Posted by hhw View Post
    There's something called spanning tree, which actually works quite well when you do it right.
    Spanning tree disables one path or the other. There would be two layer-2 paths in the sense that if one failed the other could take over, but the two would not be active simultaneously, and you'd still have traffic ping-pong if the gateway were on a TOR that was not the same TOR the server needing that gateway was located on. And that's assuming STP is even working correctly. I honestly don't trust STP.

  16. #16
    Join Date
    Aug 2004
    Location
    Dallas, TX
    Posts
    3,507
    hehe, I'm not going to keep beating a dead horse. Good luck.

  17. #17
    Quote Originally Posted by hhw View Post
    Either
    1) Just connect servers that need to share subnets to the same TOR switch. If you can't have the servers in the same rack, just run a cable over from another cabinet. At the very least, put them on TOR switches going to the same distribution switch.
    or
    2) Don't allow sharing subnets across multiple VPS hosting servers.
    or
    3) Spend the $$$ on a network architecture that can handle traffic going back and forth, and that doesn't have such single points of failure. Efficiency isn't as much of a concern if you have massive capacity well beyond what you run at.

    Personally, I'd advocate running layer 3 at the access layer when you can, and at the distribution layer in the cases where you have to deal with sharing subnets and such. Having all your layer 3 at the core is not very scalable, as you've already discovered. There's a reason why most new switches are layer 3 capable.
    I agree these are the 3 possibilities you can work with.

    Option 1) leads to a ton of cables going all over the place, or needing to move servers around depending on cancellations and orders from existing customers. We've seriously considered this option, and may need to rely on it at least to some degree.

    Option 2) is clearly the popular option for hosting providers larger than us, who are facing the same problems we're currently facing. We call this option "telling the customer to F themselves". Naturally we consider that a last resort.

    Option 3 sounds good in theory: yes, you can avoid the bandwidth considerations of the ping-ponging if you just throw bandwidth at the problem. But the underlying issue, that you magnify the consequences of network failures, is still there. You make the network less reliable if traffic is traveling back and forth between devices it doesn't need to. This is especially true at layer 2, where the only failover protocol you have to work with is spanning tree, which is honestly a terrible protocol.

    The only other option is having a shared device that can handle all of the ARP for a group of customers, and that group of customers is physically connected to that device. So you could say, ok, I put the ARP at distribution and I have a distribution device with 20k ARP entries, and if you want to share IPs you need to be physically connected to the same distribution device.

    That's not really much different than the "must share top of rack" solution except that the capacity of the distribution device needs to be proportionally larger. It does have the advantage of being more realistic. Making sure all of a customer's servers are on a network segment supporting 500 servers is a lot more straightforward than making sure all of a given customer's servers are on a network segment supporting 40 servers.

    The original question still stands: will the MLX support 20k ARP entries, and therefore allow us to have a "segment" 500 servers large, or do we have to make a really stupid tradeoff of having each segment of our network support a maximum of 40 servers, which in turn means it's not practical to allow customers to share IPs between servers?
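    The arithmetic behind those numbers is worth making explicit. A quick sanity check, assuming an average IPs-per-server figure (the 40/server average is my assumption, chosen to match the 20k-for-500-servers numbers above):

```python
# Rough ARP sizing for the gateway device serving one "segment".
# ips_per_server is an assumed average, not a figure from the thread.

def arp_needed(servers, ips_per_server=40):
    """ARP entries the gateway device must hold for this segment."""
    return servers * ips_per_server

for servers in (40, 500):
    print(f"{servers:>3} servers -> ~{arp_needed(servers):,} ARP entries")
```

    Under that assumption, a 40-server TOR segment only needs ~1,600 entries (comfortably inside the FESX's observed ~4,500 ceiling), while a 500-server distribution segment needs the full ~20,000 -- hence the question.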
    Last edited by funkywizard; 10-23-2013 at 09:17 PM.

  18. #18
    Quote Originally Posted by gordonrp View Post
    hehe, I'm not going to keep beating a dead horse. Good luck.
    Well, it's a tough technical problem, otherwise there would be no point in vigorous debate. Obviously there's no simple solution that gives you everything you could possibly want, so we're working out which tradeoffs make the most sense. If the MLX supports a 20k ARP table, putting all the gateways on that device is the correct tradeoff for us. If it doesn't, some other configuration would make more sense. I wouldn't call that discussion beating a dead horse.

  19. #19
    Join Date
    Dec 2009
    Posts
    2,130
    I would think potential customers would consider some of these discussions red flags??

    Anyhow...

    You know you could always run layer3 to the access switches, and in the scenario where a customer requires IP portability run those as layer2 VLAN's back to the "core" device.

  20. #20
    Quote Originally Posted by Ionity View Post
    I would think potential customers would consider some of these discussions red flags??

    Anyhow...

    You know you could always run layer3 to the access switches, and in the scenario where a customer requires IP portability run those as layer2 VLAN's back to the "core" device.
    If a customer chooses us because they have the wrong idea about who we are and what we offer, we don't deserve that business. I think it's more honest and transparent to be willing to talk in public about these things. I'm sure other people have worse networks than ours, and not talking about their network in public doesn't make their network better than it is.

    As to the second suggestion, that's certainly a valid strategy, but how do we find out who needs this feature? Customers appreciate when things just work, without having to answer questions they don't understand or ask for things they don't know they need. So that really means assigning everything to core except customers with only one server. That doesn't offload a lot of IP space, so you have much the same problem still.

    We could potentially monitor the network to see who is actually sharing IPs and who isn't, and for people who are, move the gateway to the core, and for people who aren't, leave the gateway on the edge. That would get a few more IPs off the core, but even then the core needs a fair bit of oomph regardless, so the problem itself is the same; you just hit it at a higher level of use.
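    That monitoring approach can be sketched simply: collect (switch, ip) observations from each edge switch's ARP/MAC tables (collection not shown -- SNMP, "show arp" scraping, whatever is available), and flag any subnet seen behind more than one switch. Those are the gateways that would need to live at distribution/core. The sample data below is invented for illustration:

```python
# Sketch of the audit described above: given (switch, ip) observations
# pulled from the edge switches, flag subnets whose IPs appear behind
# more than one switch -- those gateways need distribution/core; the
# rest can stay on the TOR. Sample data is invented.
import ipaddress
from collections import defaultdict

def shared_subnets(observations, prefix=26):
    """observations: iterable of (switch_name, ip_string) pairs."""
    seen = defaultdict(set)  # subnet -> set of switches it appears behind
    for switch, ip in observations:
        subnet = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        seen[subnet].add(switch)
    return sorted(str(s) for s, switches in seen.items() if len(switches) > 1)

sample = [
    ("tor1", "198.51.100.10"),
    ("tor2", "198.51.100.20"),   # same /26 as above, different switch
    ("tor3", "203.0.113.5"),     # only ever seen behind tor3
]
print(shared_subnets(sample))
```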

  21. #21
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    482
    This would be easier to do if the current infrastructure you have wasn't already in place, but I tend to always look at the practical solution to things, which doesn't always need a flashy technical solution...

    Generally speaking (and this may be different depending on your customer base) there are always certain customers who grow quickly and need more nodes right away, and some customers who buy a bunch too fast and reduce their amount of servers.

    If you were to leave 5-8 U space per rack, a simple approach would be to just physically move nodes from rack to rack and group them together physically per customer group. Assuming all your nodes are the same chassis type and they're all tagged with some form of serial number, this would allow for layer3 on the TOR switches as much as possible.

    Most customers also recognize the added redundancy they can get if they will then be presented with an option to buy clusters in two completely separate network segments as they grow as well. So it can be a double positive.

    Never forget the elbow grease solution. I'm sure it's gotten everyone out of a bind here more than once.

    Going back to your point about not trusting spanning tree... I'm not sure I would be sleeping easy knowing that if one core router bricked itself overnight it would have a widespread impact, versus having each switch route upwards through two distinct paths with only L3 traffic to worry about...
    Owner Media-Hosts.com AS14442 Canadian Web Hosts Since 2002
    █ 24/7 365 Support, 100% Network Up-time Guarantee
    █ Web Development Specialists (E-Commerce, Inventory, Design)
    OpenVZ.ca Reliable, Affordable VPS Servers and Web Hosting. IPv6 Available

  22. #22
    Join Date
    Dec 2009
    Posts
    2,130
    Quote Originally Posted by funkywizard View Post
    If a customer chooses us because they have the wrong idea about who we are and what we offer, we don't deserve that business. I think it's more honest and transparent to be willing to talk in public about these things. I'm sure other people have worse networks than ours, and not talking about their network in public doesn't make their network better than it is.

    As to the second suggestion, that's certainly a valid strategy, but how do we find out who needs this feature? Customers appreciate when things just work, without having to answer questions they don't understand or ask for things they don't know they need. So that really means assigning everything to core except customers with only one server. That doesn't offload a lot of IP space, so you have much the same problem still.

    We could potentially monitor the network to see who is actually sharing IPs and who isn't, and for people who are, move the gateway to the core, and for people who aren't, leave the gateway on the edge. That would get a few more IPs off the core, but even then the core needs a fair bit of oomph regardless, so the problem itself is the same; you just hit it at a higher level of use.
    If they have a single server then it's a no-brainer. If they have multiple servers, ask them if they want the IPs to be portable between servers. How hard a question is that to ask?

  23. #23
    Quote Originally Posted by media-hosts_com View Post
    This would be easier to do if the current infrastructure you have wasn't already in place, but I tend to always look at the practical solution to things, which doesn't always need a flashy technical solution...

    Generally speaking (and this may be different depending on your customer base) there are always certain customers who grow quickly and need more nodes right away, and some customers who buy a bunch too fast and reduce their amount of servers.

    If you were to leave 5-8 U space per rack, a simple approach would be to just physically move nodes from rack to rack and group them together physically per customer group. Assuming all your nodes are the same chassis type and they're all tagged with some form of serial number, this would allow for layer3 on the TOR switches as much as possible.

    Most customers also recognize the added redundancy they can get if they will then be presented with an option to buy clusters in two completely separate network segments as they grow as well. So it can be a double positive.

    Never forget the elbow grease solution. I'm sure it's gotten everyone out of a bind here more than once.

    Going back to your point about not trusting spanning tree... I'm not sure I would be sleeping easy knowing that if one core router bricked itself overnight it would have a widespread impact, versus having each switch route upwards through two distinct paths with only L3 traffic to worry about...
    As to the idea of leaving space spare in each rack, that's obviously an expensive solution to the problem, both in terms of labor to move things around all the time, and in terms of the spare space. Hence if the layer 3 was on a distribution level, say, every 10 racks or so, it's pretty easy to have 5u free out of 10 racks without even trying, just based on pace of cancellations, so there's no need to change cabling, and rarely would you need to move servers around. On a per-rack level, you're only dealing with 40 servers, so the amount of slack capacity you'd have to leave everywhere would get expensive, and the workload needed to continuously balance everything to make sure there's sufficient capacity for new orders without wasting space, is also too expensive.

    As to the impacts of failures, diversity, and failover: STP is certainly less reliable than a routing protocol. A routing protocol will fail over the link if the full path from A->B is not working in some way. Something like spanning tree will only fail over from one link to another if a specific link has been detected as having failed, which is much less robust. There are plenty of horror stories out there about STP not working properly, or even being the cause of network failures in the first place. For now we have a "warm spare" mentality. Given that all of the failures we've seen to date have been issues with configurations or platforms or DDoS or some other problem, and not physical failures, we've never seen a problem that having spare physical hardware would have solved. Obviously as our network gets bigger, these low-probability events have a much greater chance of happening, and we plan to build out our network with increasing redundancy as appropriate. Unfortunately, portable IPs are mutually exclusive with the most effective means of redundancy, which is routing protocols like OSPF and iBGP. You could possibly get by with VRRP and STP, but I think the added complexity is just as likely to cause a failure as the kinds of issues those technologies are designed to protect against.

  24. #24
    Quote Originally Posted by Ionity View Post
    If they have a single server then it's a no-brainer. If they have multiple servers, ask them if they want the IPs to be portable between servers. How hard a question is that to ask?
    You'd be surprised what kind of simple questions many customers seem incapable of answering. Like, what OS do you want? Well, it's on the order form, right? Anyone can answer that. Or whether you want RAID or not. How many IPs you want. It's all on the order form. But we have to go through each order we get, compare it to what that customer ordered in the past, and if it differs from what they normally order, or if the combination of specs differs from what customers normally want, ask them if they ordered what they intended to receive. You'd be surprised how often it doesn't match up. Even then, even when we ask, we might get no reply, provision the server as they ordered it, and then they'll come back to indicate that they didn't get what they wanted, because their order form didn't reflect what they wanted. It happens every day.

    Our point of view is, if we can find a way to get the customer the correct result without relying on them to give us the correct answer to a question, we need to do it that way. If we don't take that point of view about things, they won't get the right result, and they'll either be disappointed (but never complain), or they'll ask us to fix it (even though they got what they asked for in the first place). Correct-by-default is the only way to do it if you want things to run smoothly.

  25. #25
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    482
    Quote Originally Posted by funkywizard View Post
    As to the idea of leaving space spare in each rack, that's obviously an expensive solution to the problem, both in terms of labor to move things around all the time, and in terms of the spare space. Hence if the layer 3 was on a distribution level, say, every 10 racks or so, it's pretty easy to have 5u free out of 10 racks without even trying, just based on pace of cancellations, so there's no need to change cabling, and rarely would you need to move servers around. On a per-rack level, you're only dealing with 40 servers, so the amount of slack capacity you'd have to leave everywhere would get expensive, and the workload needed to continuously balance everything to make sure there's sufficient capacity for new orders without wasting space, is also too expensive.

    As to the impacts of failures, diversity, and failover: STP is certainly less reliable than a routing protocol. A routing protocol will fail over the link if the full path from A->B is not working in some way. Something like spanning tree will only fail over from one link to another if a specific link has been detected as having failed, which is much less robust. There are plenty of horror stories out there about STP not working properly, or even being the cause of network failures in the first place. For now we have a "warm spare" mentality. Given that all of the failures we've seen to date have been issues with configurations or platforms or DDoS or some other problem, and not physical failures, we've never seen a problem that having spare physical hardware would have solved. Obviously as our network gets bigger, these low-probability events have a much greater chance of happening, and we plan to build out our network with increasing redundancy as appropriate. Unfortunately, portable IPs are mutually exclusive with the most effective means of redundancy, which is routing protocols like OSPF and iBGP. You could possibly get by with VRRP and STP, but I think the added complexity is just as likely to cause a failure as the kinds of issues those technologies are designed to protect against.
    100% in agreement with the majority of these points.

    But if you have a 42U rack with 2x 1U switches and 40 servers at 1U each, it doesn't give you many options with regard to flexibility.

    Having a few extra U's available across a series of racks, and some servers sitting idle (even though they don't bring in money), can be a good thing, and the benefit is very hard to measure from a monetary perspective.

    For example:
    Having an extra backup chassis per rack makes it easy to migrate a customer if their hardware fails. Fixing the problem quickly, because there's physical space in the rack with room to work and/or spare/idle servers in the rack (within reason), has a direct correlation with your reputation: the less waiting time the client has, the more likely they are to refer you. It's like an advertising budget on the colocation expense line of your income statement.

    Saying each rack absolutely must be 100% utilized from a cost-savings perspective can sometimes cost more down the road, by making it impossible to achieve the above result in a reasonable amount of time due to constraints (physical, inventory, etc.).

    The only other way to fully utilize 100% of all racks all the time while still keeping excess capacity is to deploy a solution like QFabric (which costs boatloads of money), or to get a big, expensive, redundant chassis-based core switch that everything plugs into, like an EX8k series with 100k+ ARP entries and redundant REs, PSUs, etc., and re-architect around it. Then you can be 100% certain ARP won't be the issue at hand.

    The best you can do is give your clients the best value for the price they pay. Better to do it right and support growth now than to do it in stages, where you're constantly buying hardware to keep up.
    Owner Media-Hosts.com AS14442 Canadian Web Hosts Since 2002
    █ 24/7 365 Support, 100% Network Up-time Guarantee
    █ Web Development Specialists (E-Commerce, Inventory, Design)
    OpenVZ.ca Reliable, Affordable VPS Servers and Web Hosting. IPv6 Available

  26. #26
    Join Date
    Dec 2009
    Posts
    2,130
    Quote Originally Posted by funkywizard View Post
    You'd be surprised what kinds of simple questions many customers seem incapable of answering. Like, what OS do you want? Well, it's on the order form, right? Anyone can answer that. Or whether they want RAID or not. How many IPs they want. It's all on the order form. But we have to go through each order we get, compare it to what that customer ordered in the past, and if it differs from what they normally order, or if the combination of specs differs from what customers normally want, ask them if they ordered what they intended to receive. You'd be surprised how often it doesn't match up. Even then, even when we ask, we might get no reply, provision the server as they ordered it, and then they'll come back to tell us they didn't get what they wanted, because their order form didn't reflect what they wanted. It happens every day.

    Our point of view is: if we can find a way to get the customer the correct result without relying on them to give us the correct answer to a question, we need to do it that way. If we don't take that point of view, they won't get the right result, and they'll either be disappointed (but never complain) or ask us to fix it (even though they got what they asked for in the first place). Correct-by-default is the only way to do it if you want things to run smoothly.
    To me this is a bunch of blah blah blah.


    I really have zero desire to get into how to manage customer expectations, etc.

    I can say I very rarely have issues like this when it comes to communicating customer needs. Maybe we have a different target user; I'm not sure.

    None of that really matters, though. How hard is it to attach their network(s) to a local port on the access switch by default and move them to a VLAN interface if they need that? It just takes a couple of lines of config changes.
    Redundant.com High Availability and High Performance Solutions
    Dedicated Servers and VMware Cloud Hosting in Dallas and Los Angeles
    Diverse connections to the Internap Performance Optimized Network

  27. #27
    Join Date
    Aug 2006
    Location
    Ashburn VA, San Diego CA
    Posts
    4,571
    I think you'll probably be fine with the MLX, but post it on f-nsp to be sure. Folks around there have a lot of experience with these chassis and may be able to share some ARP numbers. You might be making a mountain out of a molehill... I've seen aging C65k's do 30k+ with ease, including Rapid PVST+ and hundreds of VLANs.
    Fast Serv Networks, LLC | AS29889 | Fully Managed Cloud, Streaming, Dedicated Servers, Colo by-the-U
    Since 2003 - Ashburn VA + San Diego CA Datacenters

  28. #28
    Quote Originally Posted by Ionity View Post
    To me this is a bunch of blah blah blah.
    I would totally disagree. Getting the customer the right result, without forcing them to think about things they don't need to think about or make decisions they don't need to make, is incredibly important. "It just needs to work" is the customer's point of view, and our job is to take something complex and make it "just work". Sure, this might be one small thing to ask them about, but on top of how many other small things? You never find out how many people give up on your sales process because of too many questions they don't know how to answer, because they never become your customers in the first place, but it absolutely happens.
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  29. #29
    Quote Originally Posted by FastServ View Post
    I think you'll probably be fine with the MLX, but post it on f-nsp to be sure. Folks around there have a lot of experience with these chassis and may be able to share some ARP numbers. You might be making a mountain out of a molehill... I've seen aging C65k's do 30k+ with ease, including Rapid PVST+ and hundreds of VLANs.
    Thanks for the information, I'll try to check out f-nsp to see what they have to say there.

    In our testing so far with iperf and a ton of 10.x.x.x IPs, we've found the management CPU holds up pretty well, but the linecard CPU doesn't. The linecard CPU jumps straight from 1% to 55% and just stays there; we never see it anywhere in between, and it never goes above 55%. The linecard CPU stats don't tell you what exactly the CPU is doing either, so it's kind of a black box.
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  30. #30
    Quote Originally Posted by funkywizard View Post
    Thanks for the information, I'll try to check out f-nsp to see what they have to say there.

    In our testing so far with iperf and a ton of 10.x.x.x IPs, we've found the management CPU holds up pretty well, but the linecard CPU doesn't. The linecard CPU jumps straight from 1% to 55% and just stays there; we never see it anywhere in between, and it never goes above 55%. The linecard CPU stats don't tell you what exactly the CPU is doing either, so it's kind of a black box.
    f-nsp has already been helpful on this particular issue:

    http://puck.nether.net/pipermail/fou...er/000784.html

    In our testing, we have all traffic coming in a 10G port and going out the same 10G port to reach our servers. Apparently this causes problems: at least as of when that post was made, it would trigger linecard CPU use by routing the traffic in software.

    I'll see if I can put the input and output on different ports to see if this is what's causing my linecard CPU use.

    edit: apparently you can work around the issue with "no ip icmp redirect", so I'm trying that now.
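
For anyone wanting to try the same workaround, this is roughly what the change looks like from the CLI. This is a hedged sketch based only on the command mentioned above; the exact syntax, and whether it applies globally or per-interface, may vary by IronWare release.

```
! Stop the router from generating ICMP redirects for hairpinned traffic,
! so forwarding stays in hardware instead of hitting the linecard CPU.
MLX# configure terminal
MLX(config)# no ip icmp redirect
MLX(config)# write memory
```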
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  31. #31
    Quote Originally Posted by media-hosts_com View Post
    100% in agreement with the majority of these points.

    But if you have a 42U rack with 2x 1U switches and 40 servers at 1U each, it doesn't give you many options with regard to flexibility.

    Having a few extra U's available across a series of racks, and some servers sitting idle (even though they don't bring in money), can be a good thing, and the benefit is very hard to measure from a monetary perspective.

    For example:
    Having an extra backup chassis per rack makes it easy to migrate a customer if their hardware fails. Fixing the problem quickly, because there's physical space in the rack with room to work and/or spare/idle servers in the rack (within reason), has a direct correlation with your reputation: the less waiting time the client has, the more likely they are to refer you. It's like an advertising budget on the colocation expense line of your income statement.

    Saying each rack absolutely must be 100% utilized from a cost-savings perspective can sometimes cost more down the road, by making it impossible to achieve the above result in a reasonable amount of time due to constraints (physical, inventory, etc.).

    The only other way to fully utilize 100% of all racks all the time while still keeping excess capacity is to deploy a solution like QFabric (which costs boatloads of money), or to get a big, expensive, redundant chassis-based core switch that everything plugs into, like an EX8k series with 100k+ ARP entries and redundant REs, PSUs, etc., and re-architect around it. Then you can be 100% certain ARP won't be the issue at hand.

    The best you can do is give your clients the best value for the price they pay. Better to do it right and support growth now than to do it in stages, where you're constantly buying hardware to keep up.
    Well, we obviously have spare hardware and servers and empty rack space; we just can't guarantee that the spares will be in any particular rack at any particular time. Guaranteeing that every rack has spare everything all the time is a much different problem from making sure you have spare something somewhere.
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  32. #32
    Just to follow up: we were able to solve the 55% linecard CPU problem by disabling ICMP redirects. We were then able to hammer the device with a 10k ARP table by running parallel instances of iperf across many 10.x.x.x IPs and having them routed by the MLX. During the "buildup" phase of getting to the 10k ARP table (over a short period, maybe a minute or less), CPU use on the line cards and management module was below 10%. During the "steady state" portion of the test, CPU use was 1% on the line card and 3% or less on the management module.

    Overall, I feel confident that the ARP table limit is not likely to cause any serious concerns up to the 20k target I set for myself. Obviously any number of other things might cause performance issues, but at least the specific problem that caused our FESX448 issues is not likely to bite us on the MLX, used the way we plan to use it.
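
For anyone wanting to reproduce a test like this, here's a rough sketch of how one might generate the load. The interface name, subnet layout, and iperf target address are my own assumptions, not details from this thread; the script prints the commands rather than running them, so you can sanity-check the plan first.

```shell
#!/bin/sh
# Sketch of an ARP-table load test: put many 10.x.x.x secondary addresses
# on a test box behind the MLX, so the router has to hold an ARP entry per
# address, then run parallel iperf client streams through it.

# Emit the command list for review instead of executing anything.
# $1 = number of /24 subnets, $2 = hosts per subnet, $3 = interface
gen_test_commands() {
    nets=$1; hosts=$2; iface=$3
    net=0
    while [ "$net" -lt "$nets" ]; do
        host=1
        while [ "$host" -le "$hosts" ]; do
            echo "ip addr add 10.0.$net.$host/24 dev $iface"
            host=$((host + 1))
        done
        net=$((net + 1))
    done
    # a handful of parallel client streams; scale -P up as needed
    echo "iperf -c 10.0.255.1 -P 8 -t 60"
}

# Tiny preview run; for a ~10k ARP test use something like 40 subnets x 250 hosts.
cmds=$(gen_test_commands 2 3 eth0)
echo "$cmds"
```

Once the preview looks right, pipe the output to `sh` (as root, on a host whose addresses are routed by the MLX) and watch the linecard and management CPU on the router while it runs.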
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  33. #33
    Join Date
    Aug 2006
    Location
    Ashburn VA, San Diego CA
    Posts
    4,571
    I wonder if the ICMP redirects were also causing issues on your FESX? Knowing Brocade, I doubt that 'feature' is limited to the MLX.
    Fast Serv Networks, LLC | AS29889 | Fully Managed Cloud, Streaming, Dedicated Servers, Colo by-the-U
    Since 2003 - Ashburn VA + San Diego CA Datacenters

  34. #34
    Either way is fine, although it's nice to have the Layer 3 in one area (not at the top of racks in hosting), if the equipment can handle it. If you have a capacity problem with the device, you should upgrade the equipment or move the Layer 3, and with it the VLAN interfaces/SVIs and the ARP cache that goes along with the switched virtual interfaces, etc.

  35. #35
    Quote Originally Posted by FastServ View Post
    I wonder if the ICMP redirects were also causing issues on your FESX? Knowing Brocade, I doubt that 'feature' is limited to the MLX.
    Good point. I've now disabled this on the FESX, but it doesn't seem to have made much difference. For what it's worth, the test I had the MLX under was essentially the worst case, since 100% of the traffic was subject to that particular "feature".
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  36. #36
    Join Date
    Jan 2003
    Location
    Chicago, IL
    Posts
    6,889
    I feel like I must be missing something: what is wrong with the "standard" core/edge, distribution, access design, with access being Layer 2, distribution bridging Layer 2 and Layer 3 (homing the VLANs), and core/edge being entirely Layer 3? We've run this way for about a decade now, with many thousands of servers, and have really seen no issues at all. It has been an extremely stable configuration and we really see no reason to change it. Resource issues at the distribution layer haven't been a concern at all; anything we have at that layer easily supports 60,000+ entries, and the newer distribution switches we have support 256,000, more than the total number of IPs we have at any one site by a long shot. Even then, if that resource becomes an issue it is very simple to resolve: just create an additional network segment using separate distribution switches.

    You set up everything in a standardized manner and still maintain portability and flexibility. Our main concern is being able to deliver to the customer what they expect, and they're not going to expect us to tell them they're out of luck when they want to add servers, or that we'll need to change their VLAN/IP space to do it; they just want it to work. Not to mention, running Layer 3 from the access switch up adds general complexity anyway, and most large companies (such as Google) have taken strong stands that you should use as "dumb" a device as possible, as the decreased complexity gives you higher reliability. Now, I feel that approach could work perfectly fine, but I don't really see the benefit you gain from the increased complexity.
    Karl Zimmerman - Steadfast: Managed Dedicated Servers and Premium Colocation
    karl @ steadfast.net - Sales/Support: 312-602-2689
    Cloud Hosting, Managed Dedicated Servers, Chicago Colocation, and New Jersey Colocation
    Now Open in New Jersey! - Contact us for New Jersey colocation or dedicated servers

  37. #37
    Join Date
    Jan 2003
    Location
    Chicago, IL
    Posts
    6,889
    Quote Originally Posted by media-hosts_com View Post
    100% in agreement with the majority of these points.

    But if you have a 42U rack with 2x 1U switches and 40 servers at 1U each, it doesn't give you many options with regard to flexibility.

    Having a few extra U's available across a series of racks, and some servers sitting idle (even though they don't bring in money), can be a good thing, and the benefit is very hard to measure from a monetary perspective.

    For example:
    Having an extra backup chassis per rack makes it easy to migrate a customer if their hardware fails. Fixing the problem quickly, because there's physical space in the rack with room to work and/or spare/idle servers in the rack (within reason), has a direct correlation with your reputation: the less waiting time the client has, the more likely they are to refer you. It's like an advertising budget on the colocation expense line of your income statement.

    Saying each rack absolutely must be 100% utilized from a cost-savings perspective can sometimes cost more down the road, by making it impossible to achieve the above result in a reasonable amount of time due to constraints (physical, inventory, etc.).

    The only other way to fully utilize 100% of all racks all the time while still keeping excess capacity is to deploy a solution like QFabric (which costs boatloads of money), or to get a big, expensive, redundant chassis-based core switch that everything plugs into, like an EX8k series with 100k+ ARP entries and redundant REs, PSUs, etc., and re-architect around it. Then you can be 100% certain ARP won't be the issue at hand.

    The best you can do is give your clients the best value for the price they pay. Better to do it right and support growth now than to do it in stages, where you're constantly buying hardware to keep up.
    I agree entirely with your basic premise: economic calculations are often much harder than people are willing to accept, because you really don't know the value of a happy customer or the cost of an unhappy one. So I feel you need to err strongly on the side of keeping the customer happy, even if it involves spending more money.

    Now, I disagree that your method is any better at resolving the issue. I don't see the advantage of having the extra chassis in each rack. First, you don't know the specs of the server the customer is having issues with (unless, I guess, you only put one server type per cabinet, which seems like poor power management in most cases). And if it is just a chassis, why bother racking it if you're going to have to unrack it anyway to put in the correct hardware to match the customer's needs? Just leave the chassis in storage, where the parts are, and it'll be easier to work on. There are plenty of ways to do these things quickly without taking up valuable real estate inside the racks.
    Karl Zimmerman - Steadfast: Managed Dedicated Servers and Premium Colocation
    karl @ steadfast.net - Sales/Support: 312-602-2689
    Cloud Hosting, Managed Dedicated Servers, Chicago Colocation, and New Jersey Colocation
    Now Open in New Jersey! - Contact us for New Jersey colocation or dedicated servers

  38. #38
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    482
    Quote Originally Posted by KarlZimmer View Post
    Now, I disagree that your method is any better at resolving the issue. I don't see the advantage of having the extra chassis in each rack. First, you don't know the specs of the server the customer is having issues with (unless, I guess, you only put one server type per cabinet, which seems like poor power management in most cases). And if it is just a chassis, why bother racking it if you're going to have to unrack it anyway to put in the correct hardware to match the customer's needs? Just leave the chassis in storage, where the parts are, and it'll be easier to work on. There are plenty of ways to do these things quickly without taking up valuable real estate inside the racks.
    Some of the smaller guys don't have the luxury of on-site storage space... Our trade-off is that we try to standardize configs as much as possible (i.e., if a customer orders one 500GB SATA drive and we just have a bunch of 1TBs pre-configured, we give them that because it's quicker). But at least you can have a few extra U's to rack something new, and/or leave some common spare parts in the rack (RAM, etc.) for swapping/upgrading. That's especially true if the racks are in different aisles of a shared room; once you get into your own cage space or DC, that's a completely different story. By no means do I advocate leaving racks half empty, but most of the time 3-4U is all you need.
    Owner Media-Hosts.com AS14442 Canadian Web Hosts Since 2002
    █ 24/7 365 Support, 100% Network Up-time Guarantee
    █ Web Development Specialists (E-Commerce, Inventory, Design)
    OpenVZ.ca Reliable, Affordable VPS Servers and Web Hosting. IPv6 Available

  39. #39
    Join Date
    Apr 2002
    Location
    North Kansas City, MO
    Posts
    2,565
    Quote Originally Posted by almecho View Post
    There's been a number of threads talking about the MLX router chassis' and the fact that the published specs on Brocade gear leaves a lot to be desired.

    I've got a first hand report of 3% cpu at 6k ARP on the MLX, but we also know on the FESX switches that it doesn't scale linearly and that there is an effective ARP "barrier" at some point where the MLX is likely to just die in a fire.

    I'm just wondering if there is anyone currently using an MLX in their network, and what level of cpu usage they're seeing and at what ARP table size so that we can get some idea of a realistic and reliable maximum.
    We have old MLXs (not MLXe's) running 30,000-entry ARP tables at 1% CPU utilization.

    The MLX default ARP table size is 8192 entries; the minimum setting is 2000 and the maximum is 65536.
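
For reference, bumping the limit is roughly a one-liner. This is a hedged sketch; on NetIron gear, system-max changes generally need a reload to take effect, and the exact syntax may differ by software release.

```
! raise the ARP table ceiling from the 8192 default to the maximum
MLX# configure terminal
MLX(config)# system-max ip-arp 65536
MLX(config)# exit
MLX# write memory
MLX# reload
```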
    Aaron Wendel
    Wholesale Internet, Inc. - http://www.wholesaleinternet.net
    Kansas City Internet eXchange - http://www.kcix.net

  40. #40
    Quote Originally Posted by KarlZimmer View Post
    I feel like I must be missing something: what is wrong with the "standard" core/edge, distribution, access design, with access being Layer 2, distribution bridging Layer 2 and Layer 3 (homing the VLANs), and core/edge being entirely Layer 3? We've run this way for about a decade now, with many thousands of servers, and have really seen no issues at all. It has been an extremely stable configuration and we really see no reason to change it. Resource issues at the distribution layer haven't been a concern at all; anything we have at that layer easily supports 60,000+ entries, and the newer distribution switches we have support 256,000, more than the total number of IPs we have at any one site by a long shot. Even then, if that resource becomes an issue it is very simple to resolve: just create an additional network segment using separate distribution switches.

    You set up everything in a standardized manner and still maintain portability and flexibility. Our main concern is being able to deliver to the customer what they expect, and they're not going to expect us to tell them they're out of luck when they want to add servers, or that we'll need to change their VLAN/IP space to do it; they just want it to work. Not to mention, running Layer 3 from the access switch up adds general complexity anyway, and most large companies (such as Google) have taken strong stands that you should use as "dumb" a device as possible, as the decreased complexity gives you higher reliability. Now, I feel that approach could work perfectly fine, but I don't really see the benefit you gain from the increased complexity.
    I would have to agree with all of this. Thanks for chiming in, especially on the point that a modern distribution device should support 60k+ entries.
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com
