As we discussed in one of our previous blogs, Galera uses the concept of quorum to determine whether a given node is part of the cluster and can proceed with handling traffic, or whether it has been partitioned away and should cease all operations. We briefly touched on how the quorum is calculated, but we have not gone into details. Let's take a look at it now.
What is quorum in Galera Cluster?
We are going to talk about quorum, but what does it really mean? Galera forms a multi-node cluster where all nodes talk to each other and share their view of the cluster. Heartbeats pass between the nodes, and nodes that are detected as unavailable are removed from the cluster. Quorum, in this context, means that Galera expects a majority of the nodes to stay alive for the cluster to continue working and serving traffic. Let's see how this looks in practice as we go through a couple of examples.
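To make the rule concrete, here is a minimal Python sketch of the majority check described above. It is a simplification for illustration, not Galera's actual implementation:

```python
# Minimal sketch of the strict-majority rule (an illustration, not Galera's
# actual implementation): a partition keeps quorum only if it holds more
# than half of the last known cluster membership.

def has_quorum(alive_nodes: int, cluster_size: int) -> bool:
    """True if the surviving partition holds a strict majority."""
    return alive_nodes / cluster_size > 0.5

print(has_quorum(1, 2))  # False: exactly 50% is not a majority
print(has_quorum(2, 3))  # True:  ~66%
print(has_quorum(2, 4))  # False: exactly 50%
print(has_quorum(3, 5))  # True:  60%
```

Note that exactly 50% does not count as a majority; this detail drives the two-node and four-node examples below.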
2 nodes
Here the situation is quite simple: if one of the two nodes goes down, the remaining node will switch to the non-Primary state, because one node out of two is exactly 50% of the cluster, which is not a majority. The same happens on a network split: each node sees only itself, so both nodes end up in the non-Primary state.
3 nodes
If one node goes down, the two remaining nodes make up 66% of the cluster, which is more than 50%, so they will continue to operate and serve traffic. We can say that a three-node Galera cluster is able to tolerate the failure of a single node.
4 nodes
In this case, the failure of one node is tolerated, as 3 out of 4 nodes remain (75%). If two nodes go down, the remaining nodes will switch to the non-Primary state, as they form only 50% of the cluster and lack a majority. A four-node Galera cluster can therefore survive the failure of only one node.
5 nodes
If we have 5 nodes in the Galera Cluster, two nodes may crash at the same time: that leaves 3 nodes up and running, forming 60% of the cluster and therefore keeping the quorum. A failure of 3 nodes leaves only 2 out of 5 nodes (40%), and those will switch to the non-Primary state.
This pattern continues: every two nodes added to the cluster give it the ability to survive the crash of one more node. With 3 nodes, 1 can crash; with 5 nodes, 2 can crash; with 7 nodes, 3 can crash, and so on. Adding a single node does not change anything when it comes to surviving node crashes. This is the reason why you may have heard that you should use an odd number of nodes in a Galera Cluster. Of course, nothing stops you from using an even number of nodes: it may make sense to add a fourth node if a three-node cluster starts to become overloaded. Keep in mind, though, that the fourth node will make no difference to high availability. It is also important to note that three nodes is the minimum set that can survive the crash of a node.
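Under the strict-majority rule, an n-node cluster survives as long as more than half of the nodes remain, so it tolerates at most (n - 1) // 2 simultaneous crashes. A quick Python illustration of the pattern:

```python
# Failure tolerance under the strict-majority rule: a cluster of n nodes
# can lose at most (n - 1) // 2 nodes and still keep a majority.

for n in range(2, 8):
    print(f"{n} nodes -> survives {(n - 1) // 2} simultaneous crash(es)")

# Output:
# 2 nodes -> survives 0 simultaneous crash(es)
# 3 nodes -> survives 1 simultaneous crash(es)
# 4 nodes -> survives 1 simultaneous crash(es)
# 5 nodes -> survives 2 simultaneous crash(es)
# 6 nodes -> survives 2 simultaneous crash(es)
# 7 nodes -> survives 3 simultaneous crash(es)
```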
How does the Galera Cluster calculate the number of nodes?
Well, one way would be to simply count the nodes in the cluster. As simple as that. But it is not just that: in Galera Cluster, a user can assign weights to nodes. You can modify wsrep_provider_options for a given node and set pc.weight to whatever you like, making that node count as 2, 3, or as many nodes as you wish. By default every node has a weight of 1, but this can easily be changed. What is important to keep in mind is that the rest of the quorum calculation stays exactly the same; it simply operates on the sum of weights instead of the number of nodes. Let's consider the following cluster:
It consists of four physical nodes, but because one of them has its weight set to 2, it is treated as a five-node cluster. It can handle the simultaneous crash of two nodes with weight 1, or of the one node with weight 2.
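For reference, pc.weight can be changed at runtime through wsrep_provider_options (for example, `SET GLOBAL wsrep_provider_options='pc.weight=2';`). The quorum math is then the same majority check, just over summed weights; here is a hedged Python sketch assuming the example cluster above (three nodes with weight 1 and one with weight 2):

```python
# Weighted variant of the majority sketch: quorum is computed over the sum
# of pc.weight values instead of the plain node count. Weights below mirror
# the example cluster: three nodes with weight 1, one node with weight 2.

def has_quorum(alive_weights, all_weights) -> bool:
    return sum(alive_weights) / sum(all_weights) > 0.5

cluster = [1, 1, 1, 2]  # total weight: 5

print(has_quorum([1, 1], cluster))     # False: two weight-1 survivors, 2/5
print(has_quorum([1, 2], cluster))     # True:  3/5 after two weight-1 crashes
print(has_quorum([1, 1, 1], cluster))  # True:  3/5 after the weight-2 node crashed
```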
Another quite important bit is that you can use Galera Arbitrator instead of a regular node. Galera Arbitrator (garbd) is a lightweight process that connects to the Galera Cluster as a node: it takes part in the quorum calculation, but it does not store any data. If you cannot afford three Galera nodes with a hardware specification that allows them to serve the database traffic, you could use two nodes plus garbd running on a small VM.
Please take note that we have been talking about simultaneous crashes. If we have a five-node cluster and two nodes crash, and then another node crashes some time later, Galera may have already re-evaluated the expected cluster size down to 3 after the first event, which allows one more node to go down (2 out of 3 nodes still form a majority). Additionally, this only applies to crashes, that is, unclean, unexpected shutdowns of a node. If a node is gracefully stopped, it is removed from the cluster and the expected cluster size goes down immediately.
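The difference between simultaneous and sequential failures can be illustrated with the same sketch, assuming the expected cluster size is re-evaluated after each membership change the cluster survives:

```python
# Simultaneous vs. sequential failures in a five-node cluster, using the
# same strict-majority rule. The expected cluster size is re-evaluated
# after each membership change the cluster survives.

def has_quorum(alive: int, expected_size: int) -> bool:
    return alive / expected_size > 0.5

# Simultaneous: 3 of 5 nodes crash at once -> 2/5, quorum lost.
print(has_quorum(2, 5))  # False

# Sequential: 2 nodes crash first -> 3/5 keeps quorum, size re-evaluated to 3.
print(has_quorum(3, 5))  # True
# A third node crashes later -> 2/3 still keeps quorum.
print(has_quorum(2, 3))  # True
```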
Galera Cluster in multi-DC setups
We have mentioned above that a Galera Cluster should consist of at least three nodes. The same principle applies to the high-level architecture design.
Assuming the same number of nodes in all datacenters, two DCs are not enough for the cluster to survive a catastrophic event in one of them. We need at least one more datacenter, with at least one node (or garbd) in it, to be able to survive the failure of one DC:
Here, with a three-DC setup, the Galera Cluster is able to survive one DC being down and unavailable.
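The same arithmetic explains the DC-level picture. A small Python sketch, with illustrative node counts per datacenter:

```python
# The strict-majority rule applied per datacenter: check whether losing an
# entire DC still leaves a majority of all nodes. Node counts per DC are
# illustrative.

def survives_dc_loss(nodes_per_dc) -> bool:
    """True if losing any single DC still leaves a strict majority."""
    total = sum(nodes_per_dc)
    return all((total - dc) / total > 0.5 for dc in nodes_per_dc)

print(survives_dc_loss([2, 2]))     # False: losing either DC leaves exactly 50%
print(survives_dc_loss([2, 2, 1]))  # True:  the third "DC" can be one node or garbd
```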
We hope this blog post has answered any questions you may have had about quorum calculation in Galera Cluster. If not, we would love to hear from you in the comments section below.