  1. #1

    Question Developing Layer 7 switch/balancer. Opinions needed.

    Hello,

    It all started when we needed to switch traffic between several servers based on the HTTP GET headers. Moreover, we needed a pretty complicated “decision making” module which would decide where (and if) to forward the request, so a simple regexp match on a header was not an option. We also needed replies to be sent directly to clients, and NOT via the balancer. And we needed nodes to be geographically distributed…

    I found several commercial products capable of doing almost that, but priced at $3-9k and more. And there’s no stable open source software for this either.

    So… we started the development of our own content-aware layer 7 switch which will “live” _inside_ the Linux 2.6.18 kernel. Currently we’re at the pre-development stage, examining possible approaches.

    A brief list of planned features:
    1) HTTP header lookup. Serve all your images from server A, everything else from server B (a rough sketch follows this list).
    2) MySQL traffic switching (for MySQL clusters mostly). Forward INSERTs to server A, SELECTs to server B.
    3) URL-based access control. (This is a very advanced feature.)
    4) Load balancing and failover.
    5) Outgoing bandwidth limiting. (Limits both rate and total bandwidth in GB per URL.)
    6) Content maps. You don’t have to completely mirror all your content on each node. Just mirror the most-used files.
    7) Google-like balancing. You can have nodes in different parts of the world and send replies directly to clients from the closest node.
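
    To give a rough idea of what the “decision making” module behind feature 1 could look like, here is a minimal sketch (Python for illustration only; the addresses and rules are made up, not the actual implementation):

    Code:
    # Hypothetical routing table: "server A" and "server B" from the feature list.
    BACKENDS = {
        "images": ("10.0.0.10", 80),   # server A
        "default": ("10.0.0.20", 80),  # server B
    }

    def choose_backend(method, path, headers):
        """Return (host, port) of the node that should serve this request."""
        # Rule 1: anything that looks like an image goes to server A.
        if path.lower().endswith((".jpg", ".jpeg", ".png", ".gif")):
            return BACKENDS["images"]
        # Rule 2: an example header-based rule -- requests for a static.* Host
        # also go to server A.
        if headers.get("Host", "").startswith("static."):
            return BACKENDS["images"]
        # Everything else goes to server B.
        return BACKENDS["default"]

    # choose_backend("GET", "/photos/cat.jpg", {"Host": "www.example.com"})
    #   -> ("10.0.0.10", 80)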

    Hardware requirements:
    Any server with Linux 2.6.x on it for the balancer. Same for the nodes. Nodes may also need a custom kernel module to be loaded.

    Planned performance:
    Since it’s gonna be integrated into the kernel, we expect it to work pretty fast.

    My question is: should we make a commercial product out of it? If it becomes one, the price will be $500-1000 per balancer, with the possibility to lease at $99/mo. This includes initial installation done completely by our techs (the balancer and all the nodes) and updates.

    Is there any demand for such a piece of software at such a price? If so, what kind of features would you like to see?

    Thanks.

  2. #2
    That's a nice welcome to the community...

  3. #3
    Ask a mod to move this to the colocation forum - they're the sort of people who might want to use this sort of thing

  4. #4
    2) MySQL traffic switching (for MySQL clusters mostly). Forward INSERTs to server A, SELECTs to server B.
    This is going to be a pretty hard thing to do, since a client usually sets up one connection to a MySQL server and uses it for both inserts and selects. Unlike the header switching, it is not the case that every new connection can just be redirected to server X or Y.

    But this certainly seems interesting, best of luck with the project

  5. #5
    Short answer: yes, of course there would be interest.

    If you can do for a fraction of the price on plain hardware what Cisco and others are doing for tens of thousands of dollars, why wouldn't we want it? I certainly would be interested in helping. I suggest you do this on an open-source basis with a priced support model.
    André Allen | E: aallen(a)linovus.ca
    Linovus Holdings Inc
    Shared Hosting, Reseller Hosting, VPS, Dedicated Servers & Public Cloud | USA, Canada & UK - 24x7x365 Support

  6. #6
    Quote Originally Posted by Xandrios
    This is going to be a pretty hard thing to do, since a client usually sets up one connection to a MySQL server and uses it for both inserts and selects. Unlike the header switching, it is not the case that every new connection can just be redirected to server X or Y.
    Got your point. It’s not just a MySQL-related thing. It's about any software which establishes one persistent connection and then does several transactions within that single connection, so I guess we just need to decide how we're going to deal with persistent connections in general. It'll probably be possible to split such a connection into several ones according to some rule.
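
    Just to illustrate the kind of rule I mean, here's a rough sketch of classifying each statement inside a persistent connection and picking a backend per statement (the addresses and keyword list are made up; a real switch would of course have to parse the MySQL wire protocol, not plain SQL strings):

    Code:
    # Statement-level routing sketch: reads go to one server, writes to another.
    WRITE_KEYWORDS = {"insert", "update", "delete", "replace",
                      "create", "alter", "drop", "truncate"}

    MASTER = ("10.0.1.1", 3306)   # takes all writes
    SLAVE = ("10.0.1.2", 3306)    # takes reads

    def route_statement(sql):
        """Pick a backend for a single statement within one client connection."""
        words = sql.lstrip().split(None, 1)
        first = words[0].lower() if words else ""
        return MASTER if first in WRITE_KEYWORDS else SLAVE

    # route_statement("INSERT INTO hits VALUES (1)")  -> MASTER
    # route_statement("SELECT * FROM hits")           -> SLAVE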

    I have a meeting (in a couple of days) with 2 network engineers and a *nix programmer who has already developed kernel mods.

    Quote Originally Posted by CoolRaul
    If you can do for a fraction of the price on plain hardware what Cisco and others are doing for tens of thousands of dollars, why wouldn't we want it? I certainly would be interested in helping. I suggest you do this on an open-source basis with a priced support model.
    As for pricing, the whole idea is to be able to use entry-level servers for non-trivial layer 7 traffic switching. With the leasing option available, $149-199/mo ($99 for the software and $49-99 for hardware) gets you an entry-level balancer with some really advanced features. Currently the cheapest ticket to this opera is something like $3k one-time, I believe...

    I'm not sure this is going to be entirely open source, but I guess it would be interesting to give people the option of writing custom "decision making" modules, so the API can be open, and probably some other parts of the code as well.

  7. #7
    I know how to do all of these features. You don't need a modified kernel, of course; I recommend staying away from changing kernel code and sticking with userland only, or some freak bugs will occur. Except for MySQL: I've never touched mysqld code before, but you don't need to do that either. Usually powerful software (even custom apps) should include support in its own MySQL API for multiple slaves and one master, separating write/read queries. That's a more elegant and error-free solution, and if the master goes down it can choose one slave as the new master.

    For me, a powerful network filesystem (fast, distributed, with failover and atomic locking) is the main problem of clusters/load balancing; that is what really needs a revolutionary design and a complex kernel module.
    Don't blame unmanaged services for your errors. Redundancy is the key to 100% uptime, nothing else matters.

  8. #8
    Quote Originally Posted by nsqlg
    You don't need a modified kernel, of course; I recommend staying away from changing kernel code and sticking with userland only, or some freak bugs will occur.
    Yes, we actually have several variants -- one is to write an iptables module whose processing happens in userspace. We just don't want big overhead, as it'll affect performance.
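
    For reference, the userspace variant would look roughly like this with the NFQUEUE target and the third-party Python netfilterqueue binding (the queue number and logic are just examples, not our code); every matched packet takes a kernel/userspace round trip, which is exactly the overhead I'm worried about:

    Code:
    # iptables side (one possible rule):
    #   iptables -A FORWARD -p tcp --dport 80 -j NFQUEUE --queue-num 1
    from netfilterqueue import NetfilterQueue

    def handle_packet(pkt):
        payload = pkt.get_payload()   # raw IP packet bytes
        # ... inspect the payload and make a routing/verdict decision here ...
        pkt.accept()                  # hand the packet back to the kernel

    nfq = NetfilterQueue()
    nfq.bind(1, handle_packet)        # queue number 1, matching the iptables rule
    try:
        nfq.run()                     # blocks; one round trip per packet
    finally:
        nfq.unbind()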

    Quote Originally Posted by nsqlg
    Usually powerful software (even custom apps) should include support in its own MySQL API for multiple slaves and one master, separating write/read queries.
    Does PHP have that kind of master/slave MySQL support? I couldn't find any info about it in their docs. The only solution for PHP, as I see it, is to write cluster-aware software, which is a pain in you know where. Or have a complicated MySQL cluster setup, which is also a pain.

    Quote Originally Posted by nsqlg
    For me, a powerful network filesystem (fast, distributed, with failover and atomic locking) is the main problem of clusters/load balancing; that is what really needs a revolutionary design and a complex kernel module.
    Well, the goal of our project is not to develop software which will allow users to build clusters. This is more of a Layer 7 _switch_ with some load balancing capabilities.

  9. #9
    Quote Originally Posted by Xandrios
    This is going to be a pretty hard thing to do, since a client usually sets up one connection to a MySQL server and uses it for both inserts and selects. Unlike the header switching, it is not the case that every new connection can just be redirected to server X or Y.

    But this certainly seems interesting, best of luck with the project

    It's not hard, we do that already.
    Jay

  10. #10
    Quote Originally Posted by jayglate
    It's not hard, we do that already.
    Which software/hardware do you use for that?

  11. #11
    It's a semi-proprietary software/solution, but it is doable.
    Jay

  12. #12
    Quote Originally Posted by Slipper
    Does PHP have that kind of master/slave MySQL support? I couldn't find any info about it in their docs. The only solution for PHP, as I see it, is to write cluster-aware software, which is a pain in you know where. Or have a complicated MySQL cluster setup, which is also a pain.
    It should be easier to create a patch for PHP to support master/slave connections transparently to the scripts than to create a kernel/iptables module. For your own scripts running on vanilla PHP, you can write a custom MySQL object frontend that sends every query to the proper server (master or slaves), like vBulletin does for example.
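
    A minimal sketch of that kind of frontend (shown in Python rather than PHP just to keep it short; the connection objects are assumed to be ordinary DB-API connections):

    Code:
    import random

    class SplitDB:
        """Tiny frontend that hides master/slave routing from the calling code."""

        def __init__(self, master_conn, slave_conns):
            self.master = master_conn          # connection to the master (writes)
            self.slaves = list(slave_conns)    # connections to the slaves (reads)

        def query(self, sql, params=()):
            cur = self._pick(sql).cursor()
            cur.execute(sql, params)
            return cur

        def _pick(self, sql):
            word = sql.lstrip().split(None, 1)[0].lower()
            if word in ("select", "show", "describe", "explain"):
                return random.choice(self.slaves)   # reads can go to any slave
            return self.master                      # everything else is a write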

    Quote Originally Posted by Slipper
    Well, the goal of our project is not to develop software which will allow users to build clusters. This is more of a Layer 7 _switch_ with some load balancing capabilities.
    Alright, I understand


    Regards,
    Don't blame unmanaged services for your errors. Redundancy is the key to 100% uptime, nothing else matters.

  13. #13
    Quote Originally Posted by nsqlg
    It should be easier to create a patch for PHP to support master/slave connections transparently to the scripts than to create a kernel/iptables module. For your own scripts running on vanilla PHP, you can write a custom MySQL object frontend that sends every query to the proper server (master or slaves).
    This is certainly true, but you won't make yourself happy writing such modifications for every PHP script. Same with patching PHP itself: re-patching it with every upgrade is no fun. And if you write a custom module for PHP, then again your PHP scripts have to be aware of this module. With the layer 7 switch, a clustered MySQL setup is seen as a single DB server by the applications.

    The project I'm talking about is not just about MySQL traffic switching, though.

    What if you have a Video-On-Demand service with 1000 DVD-quality movies (~5TB of content) stored on one main server? Let's say the 100 most popular movies generate most of the bandwidth.

    Now if you were to use classic mirroring (rsync or whatever) with layer 3/4 load balancing, you would have to mirror the whole freaking 5TB, because the balancer doesn't know which of your mirrors has which files. Using so-called “content maps” on an L7 switch, you could mirror only the 100 most requested files (500GB) -- in a way it simulates a distributed file system.
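
    A rough sketch of what such a content-map lookup could look like (node names, the map itself and the “closest node” rule are all made up for illustration):

    Code:
    MAIN_SERVER = "origin.example.com"   # holds the full 5TB library

    # Only the ~100 most requested files are listed here, each with its mirrors.
    CONTENT_MAP = {
        "/movies/top-hit-001.mpg": ["edge-us.example.com", "edge-eu.example.com"],
        "/movies/top-hit-002.mpg": ["edge-us.example.com"],
    }

    def pick_node(path, client_region):
        mirrors = CONTENT_MAP.get(path)
        if not mirrors:
            return MAIN_SERVER            # unmirrored content: serve from origin
        # naive "closest node" rule: prefer a mirror in the client's region
        for node in mirrors:
            if client_region in node:
                return node
        return mirrors[0]

    # pick_node("/movies/top-hit-001.mpg", "eu") -> "edge-eu.example.com"
    # pick_node("/movies/rare-title.mpg", "eu")  -> "origin.example.com"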

    I haven't really seen commercial products supporting this kind of thing. Do you find this feature useful? Has anybody here used partial mirroring?

  14. #14
    Slipper: You can have multi-master for writes, or even a single beefy master for writes with an auto-failover that promotes a slave to master, plus a fleet of replicated read slaves. Is your content actually stored in the DB, or is it just a URL pointing to where the content physically lives?

    You can replicate to many, many slaves at once, but there will obviously be some lag between an update being made and it replicating to all the other boxes. When you get into higher-volume transactions, either multi-master or a large number of replicated slaves, and even a little lag between the boxes (milliseconds or a few seconds) is unacceptable, you might need to invest many thousands into low-latency Dolphin SCI cards, which completely cut out the overhead of the TCP/IP stack; you will see a MARKED improvement in performance. Contact me off board if you want to discuss more.


    Also, in regards to the SQL balancing with selects/updates and such, it has to be middleware (e.g. another layer of redundant boxes adjacent to the load balancers).

    Or if you are really serious about this, you can set up a cluster using Sequoia or something of the sort. I wouldn't suggest using the MySQL Cluster features in NDB; that is just a HUGE nightmare, it doesn't scale very well, and it requires everything to be stored in RAM, so you need MASSIVE amounts of RAM since it multiplies the size of the database: if you have a 1GB database you need roughly 2-3GB of RAM just to put it into the NDB cluster. Also, going past 2 data nodes you once again need to invest thousands into the SCI cards, because that data HAS to be kept in sync.

    Just my 50 cents' worth.
    Jay
