While working with customers, performing MySQL consulting and discussing architecture, we are quite frequently asked a very important question: which MySQL flavour should we use? What are the pros and cons of MySQL replication, Galera Cluster and InnoDB Cluster?
This is definitely a complex question but we thought we could prepare a couple of blog posts that would sum up the most important information regarding the available options. So, here we are, starting with the basics: MySQL replication.
MySQL replication – the basics
MySQL replication, historically, is the first solution that allowed MySQL to scale out horizontally. There are two ways in which replication can work: with Global Transaction IDs or without them. GTID is a relatively new addition that lets you identify every transaction with a unique identifier, which allows for quite some flexibility when making changes in the replication topology. You can take any replica, from any position in the topology, and reconnect it to any other location. The alternative is to rely on binary log positions, which pretty much limits you to fixed source – replica relationships. You can move a replica a little bit, but frequently you are required to stop the traffic on other nodes involved in the change. It is frustrating, limiting and outdated, so let’s just forget about non-GTID setups, let them rot in the dustbin of history, and focus on setups that involve GTID.
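As an illustration of that flexibility, with GTID enabled a replica can be repointed to a different source with a single statement. A sketch (hostnames and credentials below are placeholders, MySQL 8.0.23+ syntax; older versions use CHANGE MASTER TO ... MASTER_AUTO_POSITION=1):

```sql
-- On the replica: stop replication and point it at a different source.
-- SOURCE_AUTO_POSITION = 1 tells MySQL to negotiate the starting point
-- using GTIDs instead of a binary log file name and position.
STOP REPLICA;
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'new-source.example.com',   -- placeholder hostname
  SOURCE_USER = 'repl',                     -- placeholder replication user
  SOURCE_PASSWORD = 'repl_password',        -- placeholder
  SOURCE_AUTO_POSITION = 1;
START REPLICA;
```

With binary log positions you would instead have to figure out the exact file and offset on the new source that corresponds to the replica’s current state, which is exactly the part GTID makes unnecessary.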
How does the MySQL replication work?
We won’t be going into details; we’ll stick to the minimum needed to understand what is happening. If you are looking for in-depth coverage of MySQL replication, you can find it in the replication section of the MySQL documentation.
What is important to understand is that replication is intended to replicate only changes to the data set. We would say this is expected, but it’s better to make it clear. Only modifications like inserting new rows, deleting existing rows, updating or replacing data are going to be replicated. Changes to the schema of the database are replicated as well. In short, every DDL (Data Definition Language) and DML (Data Manipulation Language) statement, as well as DCL (Data Control Language – GRANT, REVOKE and other statements used to create users and grant access), is replicated from the source node and executed on all replica nodes. This is done through binary logs. Each transaction that has been executed on the source node is stored in the binary logs. Such an entry contains the GTID and all the information required to re-execute the change on the replicas. We will not go into details about the durability of binary logs or the different formats binary logs may have. The gist is, all changes are stored there in a way that allows replicas to replay the logs.
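For reference, a minimal configuration sketch that enables binary logging and GTID on the source node (values here are illustrative, not recommendations):

```ini
# my.cnf on the source node -- minimal settings for GTID-based replication
[mysqld]
server_id = 1                     # must be unique across all nodes
log_bin = binlog                  # enable binary logging
binlog_format = ROW               # row-based binary log format
gtid_mode = ON                    # tag every transaction with a GTID
enforce_gtid_consistency = ON     # reject statements unsafe for GTID
```

Replicas need a unique server_id as well, and in practice you also enable binary logging on them so they can act as a source after a failover.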
Each replica is configured to reach out to the source and ask for binary logs with transactions containing GTIDs that the replica has not yet executed. If such transactions are still available in the binary logs on the source node, they will be sent to the replicas and stored on them in the form of relay logs. Relay logs are, pretty much, binary logs that have been transferred to replicas. The binary/relay log transfer is done by the I/O thread (or threads) on the replicas. Relay logs, once stored on disk, will be read by the SQL thread and the transactions will be re-executed, bringing the replica in sync with the source.
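You can observe both threads at work on a replica. The exact output depends on the environment, but the fields of interest look roughly like this:

```sql
-- On a replica: inspect the state of the I/O and SQL threads
SHOW REPLICA STATUS\G
-- Key fields in the output (among many others):
--   Replica_IO_Running:  Yes     -- I/O thread pulling binary logs
--   Replica_SQL_Running: Yes     -- SQL thread replaying relay logs
--   Relay_Log_File / Relay_Log_Pos         -- position in the relay logs
--   Retrieved_Gtid_Set / Executed_Gtid_Set -- what arrived vs. what ran
```

The gap between Retrieved_Gtid_Set and Executed_Gtid_Set is precisely the set of transactions sitting in the relay logs, already transferred but not yet applied.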
MySQL native replication comes in two forms: asynchronous and semi-synchronous. The difference between them is that asynchronous replication does not care about anything. Once the transaction is committed, it is stored in the binary log and, eventually, retrieved by replicas and replicated. There is no guarantee that this will happen. The network may fail, servers may go down. There may be times when replicas are not able to retrieve the binary logs before the source node goes down; in that case a data loss scenario may happen. The second method, semi-synchronous replication, is quite similar to asynchronous replication, with only one difference: when a transaction is committed, the source node waits for a configurable number of replicas to confirm that they have persisted the binary log events on disk.
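Enabling semi-synchronous replication requires loading a plugin on both sides. A sketch using the MySQL 8.0.26+ naming (earlier versions use the rpl_semi_sync_master / rpl_semi_sync_slave names and semisync_master.so / semisync_slave.so plugin files):

```sql
-- On the source:
INSTALL PLUGIN rpl_semi_sync_source SONAME 'semisync_source.so';
SET GLOBAL rpl_semi_sync_source_enabled = 1;
-- How many replicas must acknowledge before COMMIT returns:
SET GLOBAL rpl_semi_sync_source_wait_for_replica_count = 1;
-- Fall back to asynchronous replication if no ack arrives in 10 s:
SET GLOBAL rpl_semi_sync_source_timeout = 10000;

-- On each replica:
INSTALL PLUGIN rpl_semi_sync_replica SONAME 'semisync_replica.so';
SET GLOBAL rpl_semi_sync_replica_enabled = 1;
```

Note the timeout: if replicas stop acknowledging, the source silently degrades to asynchronous mode, so semi-sync is a best-effort durability improvement, not a hard guarantee.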
Another important piece of information to keep in mind is that MySQL replication, in its basic form, is not aware of the topology. Sure, InnoDB Cluster or InnoDB ReplicaSet may come with more advanced features, but standard asynchronous or semi-synchronous replication created through the CHANGE REPLICATION SOURCE TO command knows nothing beyond whether a replica can connect to its source and whether the transactions that come from the source can be processed by the I/O thread on the replica. If GTID is used, topology changes can be quite easy to perform, but they will never be initiated by the database itself.
MySQL replication can be multi-sourced. You can design a topology in which a single replica can replicate data off multiple source nodes. This is quite useful for cases where you want to aggregate the data and, for example, run some analytical processes on it.
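Multi-source replication uses named replication channels, one per source. A sketch on the aggregating replica (hostnames and credentials are placeholders):

```sql
-- On the aggregating replica: configure one channel per source node
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'source-a.example.com', SOURCE_USER = 'repl',
  SOURCE_PASSWORD = 'repl_password', SOURCE_AUTO_POSITION = 1
  FOR CHANNEL 'source_a';

CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'source-b.example.com', SOURCE_USER = 'repl',
  SOURCE_PASSWORD = 'repl_password', SOURCE_AUTO_POSITION = 1
  FOR CHANNEL 'source_b';

-- Channels can be started and stopped independently:
START REPLICA FOR CHANNEL 'source_a';
START REPLICA FOR CHANNEL 'source_b';
```

One caveat worth remembering: the sources must not produce conflicting writes (for example, to the same rows or with clashing schemas), because the replica will simply apply whatever arrives on each channel.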
MySQL replication – implications
There are several very important implications that we have to keep in mind. First, there is no guarantee on when a transaction will be executed on the replicas. Yes, we can ensure that the transaction is stored on at least one replica by means of semi-synchronous replication, but even then the promise is only that the transaction has been stored on disk. There is no promise about when it will be executed by the SQL thread. This simply means that replication may lag on some of the replicas and there is no mechanism to keep that under control. This can lead to a situation in which the source crashes while all replicas are lagging behind, making failover unfeasible.
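Because there is no built-in mechanism to control lag, it has to be monitored from the outside. One way to get a rough per-worker view of applier delay on a MySQL 8.0 replica is through performance_schema:

```sql
-- Approximate apply delay per replication worker thread on a replica:
-- time between the original commit on the source and the moment the
-- transaction finished applying here.
SELECT
  WORKER_ID,
  LAST_APPLIED_TRANSACTION,
  TIMEDIFF(LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
           LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP)
    AS apply_delay
FROM performance_schema.replication_applier_status_by_worker;
```

This is a point-in-time snapshot, not a guarantee; a monitoring system would sample it regularly and alert when the delay crosses a threshold.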
Second, semi-synchronous replication is affected by network latency. The slower the network, the longer the wait on COMMIT. Asynchronous replication, on the other hand, doesn’t care about network latency at all. Each transaction executed on the source returns to the application immediately; there is no delay caused by the network. You can have the source node in London and replicas in Los Angeles and Singapore, and writes done in London will not be affected by any of the replicas.
Third, as there is no replication topology awareness, there is no internal way to perform failovers. Sure, you can do it from outside using external tools like Orchestrator, but the process has to be initiated from outside of the database.
MySQL replication – pros and cons
To sum this up a bit, MySQL replication fits well in some particular scenarios; in others, not so much. The great thing about replication is its ease of use, especially when coupled with GTID. You can quite easily deploy a robust replication stream that will let you scale out read-only traffic. MySQL asynchronous replication works great over large distances – there is no delay on COMMIT, everything happens quickly on the source host. Replicas may be lagging, but transactions on the source instance are executed almost instantaneously, without any delay caused by network latency.
The not-so-great thing about MySQL replication is its lack of situational awareness. There is no knowledge of the topology or the state of the nodes, and there is no quorum-based clustering mechanism. This leads to a significant risk of data loss when a split brain happens.
Split brain is a situation in which the network is segmented and some of the database nodes lose connection to each other. This can be very dangerous, especially when automated failover is configured through external tools.
A failover may happen that is neither needed nor desired, causing multiple writers to show up and impacting data consistency. Actually, if you think about it, there is no way to fully prevent data loss and split brain from impacting the cluster. Yes, you can add some precautions and sanity checks, but there is nothing that would prevent a split brain from inducing issues at all. The only option is to minimize the impact, but that’s just that.
MySQL replication doesn’t provide us with automated failover, and this is a really serious issue when it comes to production-grade deployments. Typically you cannot accept your production environment being down for an extended period of time while your operational staff performs the failover by hand. What is a must these days is a way to perform an automated failover and, natively, there’s nothing like that for MySQL replication. The only option is to either write some code that handles the failover process for your environment or to use one of the existing solutions, like Orchestrator, that are able to manage a MySQL topology and perform failovers and topology changes as required.
Finally, one of the significant disadvantages of using MySQL replication is dealing with read-after-write challenges. When you commit a transaction on the source node, the process of replicating the data is initiated. It does not happen instantly; there is a delay before the data is stored in the binary log, transferred to the replica and, finally, executed there. If you connect to a replica immediately after committing a transaction and attempt to read the data you have just modified, you are very likely to see an outdated state. The delay can be significant (if the replica is lagging behind its source) or minimal, but an instant read after a write can be a source of trouble. The solution is, obviously, to detect such cases and direct those SELECTs to the source node, which can always act as the source of truth, but by doing so we significantly reduce the ability to scale out reads. Replication lag, on its own, can also be a source of problems. If a replica cannot keep up with the source, it will lag behind and, as a result, may not be able to act as a source candidate if failover becomes imminent.
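If the application can carry the GTID set around between the write and the read, one common workaround is to make the replica wait until the write has been applied there before reading. A sketch (the GTID value shown is illustrative and the 1-second timeout is arbitrary):

```sql
-- On the source, right after COMMIT: capture the executed GTID set.
SELECT @@GLOBAL.gtid_executed;
-- returns something like '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-77'

-- On the replica, before the read: block until that GTID set has been
-- applied locally, or give up after 1 second. The function returns 0
-- on success and 1 on timeout; on timeout, fall back to reading from
-- the source instead.
SELECT WAIT_FOR_EXECUTED_GTID_SET(
  '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-77', 1);
```

This keeps most reads on the replicas while still giving read-your-own-writes semantics, at the cost of some extra latency on the reads that have to wait.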
As you can see, even if we discuss only the most basic and obvious implications of using MySQL asynchronous or semi-synchronous replication, there are quite a lot of things to consider. What is important, replication works just fine as long as you keep its limitations in mind and either accept them or work around them. Even today, especially when geo-distribution is involved, replication may be the technology of choice, either as a sole solution or as part of a clustered solution – the glue that binds together multiple Galera or InnoDB clusters.