MySQL Architecture – What to use? MySQL replication or Galera Cluster?

In the past we have written blogs discussing MySQL asynchronous replication and providing basic insights into Galera Cluster. MySQL replication is a well-known, battle-tested solution, but Galera Cluster is also a mature, well-tested product. We are quite frequently asked: which one should I choose? Would Galera Cluster work better for me than MySQL replication? This topic often comes up in discussions about MySQL architecture during our consulting engagements. Let's sit down and discuss the applications of both technologies, where one fits better than the other, and whether they can co-exist in a single environment.

Pros and cons

MySQL replication

We have discussed it in more detail in another blog post, but let's sum it up here. MySQL replication, in its asynchronous form, is not affected by network latency. It is fast, robust and very easy to set up. In terms of performance, MySQL replication is also very close to a standalone MySQL instance. Replication also allows for read scale-out, but you have to be aware of read-after-write problems that you may encounter due to replication lag.
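To see whether lag is a concern at any given moment, you can check the replica's status; a minimal sketch using MySQL 8.0 syntax:

```sql
-- On a replica: Seconds_Behind_Source (Seconds_Behind_Master on older
-- versions) is only an estimate, but a non-zero value is a warning that
-- reads on this node may return stale data.
SHOW REPLICA STATUS\G
```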

On the other hand, it lacks the ability to perform automated failover and has to rely on external tools for that. There is no cluster-awareness, therefore a split brain may happen and result in data inconsistency between replication nodes. Until recently there was no easy way to provision replicas from the source node; now MySQL ships with the clone plugin built in, while MariaDB still lacks this functionality.
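For reference, provisioning a new replica with the clone plugin boils down to a few statements; a sketch with placeholder hostnames and credentials:

```sql
-- On both donor and recipient (MySQL 8.0.17+): load the plugin.
INSTALL PLUGIN clone SONAME 'mysql_clone.so';

-- On the recipient: allow this donor, then pull a full physical copy.
SET GLOBAL clone_valid_donor_list = 'source-host:3306';
CLONE INSTANCE FROM 'clone_user'@'source-host':3306 IDENTIFIED BY 'clone_pass';
```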

Galera Cluster

As the other solution we have Galera Cluster, which comes with multi-master capabilities. Since you can write to any node, dealing with a temporarily unavailable node can be as simple as configuring round-robin access to the databases. HA can easily be coded into the application and you are not forced to use external tools to build highly available Galera clusters. Galera provides basic self-healing and provisioning capabilities via state transfers. What's also great is that the lag between Galera nodes, while it may still happen, is kept under control by Galera by means of flow control. It is also possible to enforce read causality on a session level, as long as you are willing to pay the price in performance.
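Read causality is controlled per session through the wsrep_sync_wait variable; a minimal sketch (table and column names are made up):

```sql
-- Force this session's subsequent reads to wait until the node has
-- applied all write-sets it knows about, at the cost of extra latency.
SET SESSION wsrep_sync_wait = 1;
SELECT balance FROM accounts WHERE id = 42;
```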

What you have to keep in mind though is that Galera performance strictly depends on network latency and throughput. Data has to be synchronized across all nodes in the cluster, and multi-master deployments will result in write conflicts that have to be handled on the application side (typically by restarting transactions). Galera Cluster, as a distributed cluster, has slightly different performance characteristics than a single MySQL node. Flow control can be triggered if one of the nodes in the cluster starts to slow down and falls behind the others. Hotspots (parts of the dataset that are updated often) will limit performance based on the network latency. WAN deployments of Galera clusters, while doable, should be considered only if tests show that the limitations that come with long-distance network connectivity can be worked around or simply accepted as they are.

Different scenarios

We can approach this topic in multiple ways, but let's try to describe a couple of real-world scenarios and see which technology suits each of them better. We are going to talk about local and distributed clusters, and we'll see how particular requirements, such as failover time or performance, may impact the choice.

Single Datacenter

The first scenario is pretty simple: a single datacenter, on premises or in the cloud, it doesn't really matter that much. What matters the most is that every element of the environment is close to the rest when it comes to networking. In this scenario we can benefit from fast networks, both in terms of latency and throughput.

The main difference between a replication cluster and a Galera cluster is the approach to high availability. In MySQL replication we typically have a single source node and several replicas. If the source node becomes unavailable, we have to perform a failover – elect another source and make any required topology changes to connect the remaining replicas to the new source. As we mentioned above, this process is not supported natively by MySQL; if you are looking for automated failover, you have to use external tools, Orchestrator being one of them. If we want to scale out the traffic, we should consider using load balancers like ProxySQL, which are intended to split the traffic across multiple nodes – keeping in mind that with MySQL replication we have only one writer. While a single datacenter or availability zone means the networking should be pretty straightforward and issues are unlikely, MySQL replication may still be prone to partitioning and split brain scenarios.

Galera cluster, in comparison, is much easier to deal with when it comes to high availability. There is no need for failover as all nodes are equal, and load balancers are not a must-have because you can read from and write to any node. Not that ProxySQL can't be useful in a Galera Cluster – it sure can; the ability to send all write traffic to a single node is great, not to mention all the other features ProxySQL comes with. Network problems, should they occur, are dealt with by Galera's cluster-awareness and its ability to shut down the partitioned segment of the cluster.

As for scaling and performance, both solutions should behave similarly. Yes, Galera Cluster will always be slightly slower than regular MySQL due to the overhead of being a cluster, but as local networks are fast and reliable, the latency shouldn't have too big an impact. Still, as always, you should test whether Galera works well for your application – you always have to keep in mind that Galera is not a drop-in replacement for MySQL. There are limitations and differences, and you should verify whether they affect your environment or not. If you are interested in the details around this topic, you can check the article in Galera's documentation.

To sum up, as long as tests deem Galera feasible for your environment and query mix, we would recommend building a Galera cluster and using ProxySQL to scale out the traffic, making sure that all writes hit a single node while reads are distributed across the rest of the cluster.
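A rough sketch of such a read/write split, expressed in ProxySQL's admin interface (hostgroup numbers and addresses are just examples, and the application user's default_hostgroup is assumed to be the writer hostgroup 10):

```sql
-- Writer hostgroup 10 gets one Galera node, reader hostgroup 20 the rest.
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
  (10, '10.0.0.1', 3306),
  (20, '10.0.0.2', 3306),
  (20, '10.0.0.3', 3306);

-- Route plain SELECTs to the readers; locking reads and writes stay
-- on the writer via the user's default hostgroup.
INSERT INTO mysql_query_rules
  (rule_id, active, match_digest, destination_hostgroup, apply) VALUES
  (1, 1, '^SELECT.*FOR UPDATE', 10, 1),
  (2, 1, '^SELECT', 20, 1);

LOAD MYSQL SERVERS TO RUNTIME;  SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME;  SAVE MYSQL QUERY RULES TO DISK;
```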

Two Datacenters

The datacenter we mention here is not necessarily literally a datacenter; a single DC with two availability zones also falls under this scenario.

The main difference between a single DC with two availability zones and a two-DC setup is the network latency. In general, you can expect the latency between AZs to still be quite low, while WAN connections are typically significantly slower – partly because of the delay introduced by the networking hardware required to keep the WAN links running and, in some cases, because of the geographical distance between the datacenters. The speed of light, while quite fast, is not infinite.

High Availability

In this case we should consider two main factors that will determine the choice. First, high availability. For MySQL replication the main factor is how stable the WAN or inter-AZ link is. The issue we have to deal with is: how should the cluster behave if the link between datacenters goes down? As MySQL doesn't come with any kind of automated failover solution out of the box, most likely some external software will be managing the topology.

The biggest issue is that, based on the information provided by MySQL replication, there's no way to automatically determine when it is safe to perform the failover and when it will lead to a split brain, where both datacenters end up with a source node and both start accepting writes, leading to data inconsistency between the two DCs.

One can come up with solutions that will at least minimize the chances of a split brain happening. We can build logic based on the load balancers involved or on the options that the failover management software provides (and we will definitely explore this area in future blog posts), but it is not easy and definitely not something that will handle 100% of the cases. The only solution that handles 100% of the cases is not to failover automatically between datacenters and to wait for a human decision. Think of it like this: if the network is unreachable, nodes in DC B cannot check the status of nodes in DC A. They cannot execute scripts that would stop the old source instance because they cannot connect to DC A. A human, on the other hand, can pick up a phone, call the DC A staff, ask them to stop all of the databases and then safely promote one of the replicas in DC B to become the new source.

As for automated failover, we can usually configure the failover management software to perform failover only within one of the datacenters. This gives us the ability to automatically handle source node failures, but only as long as the failure happens within the DC where the source node is located.
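If Orchestrator happens to be your failover manager, it exposes a setting for exactly this; a fragment of its JSON configuration (to the best of our knowledge, this option keeps recoveries within a single datacenter):

```json
{
  "PreventCrossDataCenterMasterFailover": true
}
```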

When it comes to Galera Cluster, the situation is actually not too different. What you have to keep in mind is how Galera Cluster detects a split brain. In short (the longer version deserves a separate post), Galera requires 50% + 1 of the nodes to be part of the cluster to consider it safe to operate. If a partition contains fewer than 50% + 1 of the nodes of the full cluster, it will stop handling traffic. What does this mean for our two-DC setup? In short, there is no automated handling of all possible cases here either.

If we cut the network between the two datacenters, each part of the cluster will consist of 3 nodes. 3 out of 6 is 50%, not the required 50% + 1. This means that both parts will stop serving traffic and will require manual intervention to force one of them into the "Primary" (working) state. If we change the number of nodes or assign different weights to some of them, we end up in a situation where the failure of the "prioritized" datacenter turns the whole cluster into a non-Primary state.

Take, for example, five nodes split 3/2 between DC A and DC B. The failure of DC B leaves us with three working nodes in DC A (3 out of 5 is more than 50%), but if the nodes in DC A go offline, DC B turns into a non-Primary state (2 out of 5 is less than 50% + 1) and manual intervention is required to bring it back to the Primary state.
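Node weights are set through the Galera provider options; a minimal sketch with an illustrative value:

```sql
-- Give this node a weight of 2 in quorum calculations; the same option
-- can be set permanently via wsrep_provider_options in my.cnf.
SET GLOBAL wsrep_provider_options = 'pc.weight=2';
```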

As you can see, one way or the other, we end up in a situation where we have to rely on not-so-automated failover, no matter if we go with MySQL replication or Galera Cluster. In both cases we can hope to prioritize one DC and ensure that the source node is there, working (or that the Galera nodes in that DC are in Primary state) but there’s no way to handle all potential crashes.

Network Latency

The second important factor is network latency, which may limit the performance of a Galera cluster. As we have shown in one of our previous blogs, even slight network latency may reduce the performance of the cluster – even single milliseconds can affect the cluster's ability to execute queries and increase commit times for the application. Multi-AZ environments tend to work quite well with Galera, but for multi-DC setups we would strongly recommend running detailed benchmarks to see whether the performance of the cluster is acceptable for you.

On the other hand, MySQL asynchronous replication is almost unaffected by network latency. The only visible impact may be the delay with which data is applied on the replicas – read-after-write consistency is harder to achieve on replication clusters than on Galera. Galera has flow control, which slows down the cluster to keep the delay between nodes to a minimum, and it has options to enforce consistency across the nodes. For MySQL replication you would need to direct reads that require data consistency to the source node or, alternatively, use ProxySQL and its causal reads feature.
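With GTID-based replication you can also enforce read-after-write consistency by hand; a minimal sketch (the GTID set below is a placeholder captured from the source right after the write):

```sql
-- On the source, immediately after the write:
SELECT @@GLOBAL.gtid_executed;

-- On the replica, before the dependent read: wait up to 5 seconds
-- until the captured GTID set has been applied locally.
SELECT WAIT_FOR_EXECUTED_GTID_SET('3e11fa47-71ca-11e1-9e33-c80aa9429562:1-23', 5);
```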

To sum up, there are pros and cons that may push you towards one of these technologies. With high availability being problematic in both cases, it is mostly the network latency that will determine whether Galera is a fit or not. Please also remember that Galera comes with a couple of quality-of-life features, like node provisioning and the ability to write to any node, which make it easier to operate than MySQL replication when single nodes go down. Something that requires a failover performed by external tools in a MySQL replication cluster may be as simple as the application reaching out to another Galera node, should the "main" node that the app writes to become unavailable.

Three Datacenters

Here we have reached an interesting place: quite a complex scenario with nodes spanning three different datacenters.

For MySQL replication nothing really changes; depending on the tools used for failover and topology management, some additional options may arise to minimize the chances of a split brain. For a Galera cluster deployed across three datacenters, however, everything changes when it comes to HA. As we mentioned earlier, Galera requires the majority of the nodes to be up for the cluster to stay in the Primary state. With three datacenters and the same number of nodes in each of them, the failure of a single DC will never impact the availability of the cluster.

If our DC C is cut off from the rest of the cluster due to network issues, or if it just got destroyed by a hurricane or tsunami, everything is still fine. If it's just the network, the nodes in DC C will switch to a non-Primary state (3 out of 9 is clearly a minority), while the nodes in DC A and DC B will maintain the majority (6 out of 9) and stay in the Primary state, serving traffic as if nothing happened.

What's good is that the third location doesn't even have to be a full-blown datacenter with three production-grade database nodes. It is enough to have a single node there, or even a Galera arbitrator (garbd) – a simple process that takes part in intra-cluster communication without handling any data. A small VM in DC C is enough to give Galera the ability to survive the failure of one of the three datacenters.
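Starting the arbitrator is a one-liner; a sketch with placeholder addresses and cluster name:

```bash
# garbd joins the cluster and votes in quorum calculations,
# but stores and serves no data.
garbd --group=my_galera_cluster \
      --address="gcomm://10.0.1.10:4567,10.0.2.10:4567"
```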

As for latency, nothing really changes, other than that three DCs may mean higher latency, especially if they are spread geographically. If the latency is low, Galera may work just fine. Please keep in mind that "WAN" here really means three local networks interconnected somehow. This can be London, Los Angeles and Singapore, or it can be us-west-1a, us-west-1b and us-west-1c.

To sum up, as long as the network latency doesn't cause issues, this is the ideal deployment for a Galera cluster. The ability to survive the failure of a whole DC (or AZ) is a very nice feature.

Asynchronous Galera

A Galera Cluster as a whole can be treated as a single MySQL instance for the purpose of data consistency: all Galera nodes should have the same dataset, and it should make no difference which node you connect to. This allows you to set up an asynchronous replica that replicates off of the Galera cluster, using one of the Galera nodes as its source. This concept can be taken even further – if the Galera cluster, as a whole, can act as a source, then why not use a Galera cluster as a replica?

Such a setup is perfectly viable and, as with everything, comes with pros and cons. What is great is that using asynchronous replication eliminates the impact of network latency: we are not talking about one Galera cluster spanning multiple datacenters, but about two separate Galera clusters sharing the same dataset, kept up to date through asynchronous replication. The cons are exactly the same as those we discussed above for asynchronous replication across two datacenters – after all, this is precisely that scenario. If your application writes to both DCs, data inconsistency may happen if the network goes down. There is no embedded mechanism to deal with split brain and similar conditions, and no automated failure handling for the asynchronous link. If you are aware of these limitations and are OK with them, this setup makes Galera even more flexible and viable in even more scenarios.
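Wiring the two clusters together is plain asynchronous replication; a minimal sketch executed on one node of the replica cluster (hostnames and credentials are placeholders, and log_slave_updates should be enabled on the Galera nodes so incoming transactions propagate to the whole cluster):

```sql
-- Point this Galera node at a node of the other cluster (MySQL 8.0 syntax):
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'galera-a-node1',
  SOURCE_USER = 'repl_user',
  SOURCE_PASSWORD = 'repl_pass',
  SOURCE_AUTO_POSITION = 1;
START REPLICA;
```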

We hope that you found this blog post interesting and useful. Yes, it was a high-level overview and, as usual, the devil is in the details, but it's good to be aware of some of the possible setups for both Galera Cluster and MySQL replication, along with their strong and weak points – at least to be able to pick a solution that might work for you, and then proceed with extensive testing. If you have any questions or comments, we would be happy to hear from you in the comments below.