  1. #1

    Question: How do you build a cluster with 3 active nodes and 1 standby node?

    Like icdsoft's? Their cluster system is interesting; I wonder how it works.

  2. #2
    Is it possible to do this without using any centralized storage?

  3. #3
    actually, you *do* want to use centralized storage. the most likely choice is shared scsi. sistina gfs is something to look into as well. obviously, your choice of clustering solution depends on the applications you want to run over the cluster. mysql, for example, cannot be run active/active.

    icdsoft does not seem to be running a cluster, from what i'm reading. all they are saying is that they have one cold spare for each three machines that are in production. if one box fails, they restore the latest backup (which is not real-time) and put it up. this is hardly a cluster and hardly unusual.

    paul
    * Rusko Enterprises LLC - Upgrade to 100% uptime today!
    * Premium NYC collocation and custom dedicated servers
    call 1-877-MY-RUSKO or paul [at] rusko.us

    dedicated servers, collocation, load balanced and high availability clusters

  4. #4
    I've built one of the first commercial asymmetrical clusters for JPMorgan in 1997. It had three separate nodes and a group of hard disks, configured in RAID level 0, shared between them. In this scheme two nodes are active "Primary" nodes and one is a standby "Secondary" node, capable of taking the load automatically (without human intervention) from either "Primary" node - and, under catastrophic failure, from both nodes, with some reduction in service capacity in that case. Theoretically this solution can be configured in any asymmetrical scheme from N-1 to 1-N, where N is less than 8.

    Peter Kinev.
    Open Solution, Inc
    http://opensolution-us.com

  5. #5
    peter,

    tell us more about it? this was shared scsi, correct? what is/was running on top of the cluster?

    you can build all manners of clusters really, that is not necessarily the point. for hosting, you are stuck with things like mysql, MTAs etc, which all have their constraints.

    paul
    * Rusko Enterprises LLC - Upgrade to 100% uptime today!
    * Premium NYC collocation and custom dedicated servers
    call 1-877-MY-RUSKO or paul [at] rusko.us

    dedicated servers, collocation, load balanced and high availability clusters

  6. #6
    rusko,

    I wonder how shared scsi works. Do you connect the servers together directly, or use a NAS/SAN storage device?

  7. #7
    Originally posted by rusko
    peter,

    tell us more about it? this was shared scsi, correct? what is/was running on top of the cluster?

    you can build all manners of clusters really, that is not necessarily the point. for hosting, you are stuck with things like mysql, MTAs etc, which all have their constraints.

    paul
    Paul,

    - yes, shareable disks should always be visible to all nodes in the cluster
    - no, a cluster solution of this type has no constraints associated with the applications on top, mysql or any other. The concept is quite simple - all shareable resources, such as databases, applications singular to the cluster, common configuration files, etc., should be placed on the shareable disks.

    Theoretically it could be built with many system components. For instance, Linux as the operating system is a good starting point; MS is not.

    Although it's not obvious, an asymmetrical cluster is probably the most cost-effective solution for improving system availability up to "five nines" - 99.999%. It also allows you to organize preventive system maintenance based on estimates of hardware/software MTBF, much like the scheduled-maintenance models Lexus and Mercedes use.
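
    To make the "five nines" arithmetic concrete, here is a minimal sketch of the standard k-of-n availability formula, assuming independent node failures and an illustrative per-node availability figure (not a measured one):

    Code:
    from math import comb  # Python 3.8+

    def k_of_n_availability(k: int, n: int, a: float) -> float:
        """Probability that at least k of n independent nodes are up,
        where each node is up with probability a."""
        return sum(comb(n, i) * a**i * (1 - a)**(n - i) for i in range(k, n + 1))

    node = 0.999  # illustrative: a single node with "three nines" availability

    # Two active nodes plus one standby: full service needs any 2 of 3 up.
    print(f"2-of-3 cluster: {k_of_n_availability(2, 3, node):.6f}")  # ~0.999997
    print(f"single node:    {node:.6f}")

    With 99.9% nodes, the 2-of-3 scheme already approaches five nines, which is the point of the asymmetrical configuration.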

    Peter Kinev.
    Open Solution, Inc
    http://opensolution-us.com

  8. #8
    This is a topic i've been curious about myself. I do not operate any clusters, but I do have customers that have been wanting one.

    Any specific suggestions as to what hardware to use in the said configurations?

  9. #9
    peter,

    Originally posted by pnorilsk
    Paul,

    - yes, shareable disks should be visible always by all nodes in the cluster
    obviously, i was not doubting that. i asked whether that was what you used, as there are other (worse) options.


    - no, a cluster solution of this type has no constraints associated with the applications on top, mysql or any other. The concept is quite simple - all shareable resources, such as databases, applications singular to the cluster, common configuration files, etc., should be placed on the shareable disks.
    so they share data, boo hoo. they do not share the execution domain, unless this was a single image cluster. please tell me how you would run mysql in an active/active config on top of shared scsi storage, unless there was a solution out there that could migrate threads.


    Although it's not obvious, an asymmetrical cluster is probably the most cost-effective solution for improving system availability up to "five nines" - 99.999%. It also allows you to organize preventive system maintenance based on estimates of hardware/software MTBF
    a two-node active/passive failover cluster is less complex, has fewer constraints on the apps running on top of it and can be done more cheaply, since you don't necessarily need shared scsi (with linux at least).

    what your setup gives you is load balancing across two boxes concurrently. great deal if you can get it done, but some apps just won't do it.
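
    to make the failover mechanics concrete, here is a toy sketch in python of the passive node's side of the protocol. the timing value and the takeover steps are made up for illustration; in practice a package like linux-ha's heartbeat implements this for you:

    Code:
    import time

    DEADTIME = 30.0  # illustrative: seconds of silence before declaring the peer dead
    last_heartbeat = time.monotonic()

    def on_heartbeat() -> None:
        # called whenever a heartbeat packet arrives from the active node
        global last_heartbeat
        last_heartbeat = time.monotonic()

    def take_over() -> None:
        # in a real cluster this would fence the failed node (stonith),
        # mount the shared disks, bring up the service ip and start the app
        print("peer is silent -- taking custody of the shared resource")

    def standby_loop() -> None:
        # runs on the passive node: watch the peer, take over if it goes quiet
        while True:
            if time.monotonic() - last_heartbeat > DEADTIME:
                take_over()
                break
            time.sleep(1)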

    paul
    * Rusko Enterprises LLC - Upgrade to 100% uptime today!
    * Premium NYC collocation and custom dedicated servers
    call 1-877-MY-RUSKO or paul [at] rusko.us

    dedicated servers, collocation, load balanced and high availability clusters

  10. #10
    Originally posted by rusko
    peter,

    obviously, i was not doubting that. i asked whether that was what you used, as there are other (worse) options.

    so they share data, boo hoo. they do not share the execution domain, unless this was a single image cluster. please tell me how you would run mysql in an active/active config on top of shared scsi storage, unless there was a solution out there that could migrate threads.

    a two-node active/passive failover cluster is less complex, has fewer constraints on the apps running on top of it and can be done more cheaply, since you don't necessarily need shared scsi (with linux at least).

    what your setup gives you is load balancing across two boxes concurrently. great deal if you can get it done, but some apps just won't do it.

    paul
    I don't think I will be able to add anything to your statements.

    Peter Kinev
    Open Solution, Inc
    http://opensolution-us.com

  11. #11
    Originally posted by DeathNova
    This is a topic i've been curious about myself. I do not operate any clusters, but I do have customers that have been wanting one.

    Any specific suggestions as to what hardware to use in the said configurations?
    It could be done on almost any hardware, with some peculiarities in the disk architecture definition and implementation. But it will take a lot of software work.

    Peter Kinev
    Open Solution, Inc
    http://opensolution-us.com

  12. #12
    Excuse me, how does shared scsi work?...

    Which is more cost-effective: 2 load-balanced pairs with 2 nodes each, or a cluster with 3 active nodes and 1 fail-over node?

    Can a NAS device handle the loads of 3 servers? Will it cause latency?

    Thank you very much.

  13. #13
    Originally posted by nowisph
    Excuse me, how does shared scsi work?...

    Which is more cost-effective: 2 load-balanced pairs with 2 nodes each, or a cluster with 3 active nodes and 1 fail-over node?

    Can a NAS device handle the loads of 3 servers? Will it cause latency?

    Thank you very much.
    OK, first we need to clear up some confusion in terminology. There are many clustering schemes. The original post is about a specific one - the asymmetrical cluster, where more than one active node is backed by one or more standby nodes. A node constitutes a computer with local CPU(s) and local memory; these resources are not shareable. There is a very sophisticated mathematical/statistical model describing such a system, but we don't want to spend time on that.

    Let us compare two typical schemes. In the first, only two nodes are in the cluster - the typical symmetrical scheme - where one "primary" node works at 100% load and memory use. In this situation the "standby" node cannot be used for any other work; it can only wait for the "primary" node to fail. For all practical purposes this scheme yields 50% utilization across the two nodes. Compare it with a scheme where two "primary" nodes are each loaded at 100% and one "standby" waits for a failure of one or the other: with 2*100% + 1*0% we get about 66% utilization, much better than 50%. This scheme can be extended anywhere from 7 "primary" with 1 "standby" to 1 "primary" with 7 "standby". This is a very simplistic estimation - it is actually more convoluted - but you get the idea.
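
    The utilization arithmetic generalizes directly. A minimal sketch, using the same simplifying assumption that each "primary" runs at full load and each "standby" sits idle:

    Code:
    def utilization(primaries: int, standbys: int) -> float:
        # fraction of purchased capacity doing useful work
        return primaries / (primaries + standbys)

    for p, s in [(1, 1), (2, 1), (3, 1), (7, 1)]:
        print(f"{p} primary + {s} standby: {utilization(p, s):.0%}")
    # 1+1: 50%, 2+1: 67%, 3+1: 75%, 7+1: 88%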

    Now, what constitutes a "shareable" resource? Typically it's a resource which must be available at all times - databases, applications, configuration files, etc. The best approach for this type of cluster is to put them on a group of redundant disks, and to connect these disks to all involved nodes in the cluster with one or another connection bus. I never mentioned SCSI - that was intentional. The bus could be any suitable technological solution for this purpose.

    BTW, we focused only on the N-1 asymmetrical high-availability clustering solution. We should recognize that other solutions are available, such as parallel processing, grid computing, symmetrical multiprocessing, etc.

    We also didn't discuss load balancing, XA-compliant and non-compliant databases/drivers, session affinity in a cluster, and other topics mentioned elsewhere in this thread. They have nothing to do with the fundamental definition of a "cluster".

    Hope this helps.

    Peter Kinev.
    Open Solution, Inc
    http://opensolution-us.com

  14. #14
    Originally posted by nowisph
    Excuse me, how does shared scsi work?...
    in *very* simplistic terms, it's a (set) of scsi disks (usually in raid-1+) which can be connected to several machines simultaneously.


    Which is more cost-effective: 2 load-balanced pairs with 2 nodes each, or a cluster with 3 active nodes and 1 fail-over node?
    woah there, massive terminology problem. although peter prefers to describe this differently (ie via the math behind the number of nodes, quorum etc), i find it more helpful to describe things as follows.

    there are several metatypes of cluster, which can be combined in certain cases:

    1. load-balanced. you must have at least 2 nodes and load is divided between the two. if the application you are load balancing is read-only (ie a webserver and a static dataset), there need not be a shared resource. both nodes (or n nodes) are active (primary, if you will) concurrently.

    2. fail-over. you must have at least 2 nodes. typically, this will be an active/passive configuration, where one node is primary and has 'custody' of the shared resource with the passive (standby) node able and ready to take over the shared resource should the primary fail.

    the cluster peter described is a 3 node cluster that combines load balancing and failover, in that the two active nodes are being load balanced and the standby node is able to take over on failure of one of them. personally, i am on the fence about the merits of this configuration. *unless* the application can only handle two-node concurrency, it would make more sense to have all three nodes active (load balanced) with the cluster manager being able to detect failure. in the worst case scenario, you would still have one or two nodes active, while in the best case scenario you would have three nodes active. since peter seems to be a fan of the math behind clustering, it should be noted that quorum is a much simpler problem in a 3 node cluster than it is in a 2 node cluster, where the only solution is a temporal tiebreaker, if you will, with STONITH enforcement (see the sketch after this list).

    3. single-image. this is very nifty from a technological standpoint. originally designed for hpc scientific applications in an effort to parallelize long-running and fairly independent number-crunching, this is basically an attempt to create a shared execution environment across all the nodes. in extremely simplistic terms, it creates a single super-computer, if you will, out of all the nodes. this comes with the ability to migrate processes, share memory across nodes etc. due to the complexity of this, i will stop right here since a proper treatment of the subject would require way more time than i am willing to put in as well as way more background knowledge on your part. google is your friend; starting with openmosix's faq would be a good idea as well, if you are interested.

    this is likely *not* what you are looking for, since this solution is aimed at parallelizing fairly independent cpu-intensive applications.

    4. grids. overlapping with the above (ssi is *one* way to do grid), this is basically a way to parallelize loosely-coupled applications. seti@home and all the distributed crypto challenges would be familiar examples of grid computing in action.
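
    as promised above, a minimal sketch of the quorum point. a partition may act only if it holds a strict majority of the votes; with two nodes a 1-1 split leaves neither side a majority, which is why the tiebreaker and stonith come in:

    Code:
    def has_quorum(votes_held: int, cluster_size: int) -> bool:
        # strict majority: a partition may act only if it holds
        # more than half of the cluster's votes
        return votes_held > cluster_size // 2

    # 3-node cluster: a 2-node partition wins, a lone node loses -- unambiguous
    print(has_quorum(2, 3), has_quorum(1, 3))  # True False

    # 2-node cluster: a 1-1 split gives neither side a majority,
    # so an external tiebreaker (plus stonith fencing) is required
    print(has_quorum(1, 2))  # False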


    as far as your question goes, provided that our assumption that you are interested in clustering an rdbms is correct, my statement still stands. in order to have load-balancing, you must have two active nodes working on the needed dataset. in the rdbms case, the dataset needs to be the same across both nodes. if the dataset is read-only or has very loose coherency time bounds, you can do this with any rdbms. more likely is that the dataset is read-write with a coherency bound of 0 (ie consistency). in this case, you will either need to use an rdbms that can handle active/active on a rw dataset (oracle rac) or simplify the problem you are solving by constraining yourself to failover only without load balancing.

    for failover, you need an active/passive setup, where the passive node does not need to have access to the shared resource before it takes that resource over on failure. as such, you have just gotten rid of the shared scsi requirement, compliant rdbms requirement and a lot of headaches in terms of implementation. my statement still stands - if all you need is high-availability, forget about load balancing rdbms and do failover only.

    with that said, you have not given us any useful information about what you are trying to do. clustering is not a magic black box process - there is very complex technology at play, each with their own set of constraints, so what is technologically and financially feasible depends on the exact problem being solved and the exact requirements for said solution.



    Can a NAS device handle the loads of 3 servers? Will it cause latency?
    there are nas devices and nas devices. high-end netapp and emc boxes can handle thousands of clients. they can also be deployed redundantly, so that you would not in effect be adding another single point of failure. a linux box with nfsd and lockd running is a nas box as well, but then you are in the same boat of having to muck with making it n+1 redundant (although this is much easier than rdbms).

    load is not going to be your problem, latency is. latency is *not* caused by load in this case, it is caused by the speed of the 'bus'. in this case, the bus is whatever the data travels over on its way to the nas box, which in your case would most likely be ethernet. as such, you would go from microsecond-scale local-bus latencies to millisecond-scale network round trips per i/o, a difference of an order of magnitude or more. imo, this would make it unacceptable for oltp where performance is an issue, although i've seen it done under specific circumstances.
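
    to put rough numbers on that point (the figures below are illustrative ballparks, not measurements of any particular hardware):

    Code:
    # illustrative ballpark per-i/o latencies, in microseconds
    LOCAL_BUS_US = 100       # ~0.1 ms over a local scsi bus
    NFS_ETHERNET_US = 2000   # ~2 ms round trip over nfs/ethernet

    for name, us in [("local scsi", LOCAL_BUS_US), ("nfs over ethernet", NFS_ETHERNET_US)]:
        # a serial, latency-bound workload (e.g. oltp commits) caps out at:
        print(f"{name}: ~{1_000_000 // us} synchronous i/os per second")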

    the primary purpose of nas boxes is centralized storage in an enterprise environment. shared scsi or an emulation thereof in software is what you are looking for.


    i *urge* you to google and do some research, since i'm pretty sure that all i've written is available somewhere out there. this is not a 'buy server from rackshack, install a few rpms and voila' type of deal - you need to bring some background knowledge to the table if you want to discuss this in a sapient manner, instead of it deteriorating into a soliloquy where i rehash the basics.

    hope this helps,
    paul
    * Rusko Enterprises LLC - Upgrade to 100% uptime today!
    * Premium NYC collocation and custom dedicated servers
    call 1-877-MY-RUSKO or paul [at] rusko.us

    dedicated servers, collocation, load balanced and high availability clusters
