Galera – Limitations of Galera Cluster compared to standalone MySQL

Galera is a useful piece of technology that allows us to build highly available database clusters, which is one reason we hear so many questions about it during our MySQL consulting work. If you are wondering what it is, we have an introductory post that explains the basic concepts and features of Galera Cluster.

Every solution has its limitations, and Galera Cluster is no exception. Because those limitations come up so often in our consulting engagements, we wrote this post to focus on the practical constraints they impose.

What are the limitations of Galera Cluster?

It doesn’t matter whether you are running Galera Cluster, Percona XtraDB Cluster or MariaDB Cluster: none of these solutions is a drop-in replacement for regular MySQL, Percona Server for MySQL or MariaDB. Assuming otherwise is a common misconception among MySQL users. Yes, a Galera node is very similar to a standalone server, and yes, every node holds the full dataset, but it is not exactly the same. Let’s take a look at the differences between Galera and standalone MySQL.

Galera supports only the InnoDB storage engine

MySQL is known for the variety of storage engines it supports: InnoDB, MyISAM, Spider, MyRocks and many more. Depending on the use case, the user can pick the storage engine that works best. Unfortunately, if you move to Galera Cluster, this flexibility is gone. The only supported storage engine for Galera is InnoDB. There is experimental support for MyISAM, which was intended to help with replication of the system database, but it has always been marked as experimental and, as such, it is not something a user should rely on in production.
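Before migrating, it helps to know which tables use another engine. The following is a minimal sketch, not a definitive migration tool: the query reads the standard information_schema.tables view, and alter_statements is a hypothetical helper that only prints candidate statements for review. Whether a given table can be converted safely (for example, one that relies on engine-specific features) is something you would have to check yourself.

```python
# Lists user tables that are not stored in InnoDB.
# Run it with whatever MySQL client or driver you already use.
NON_INNODB_TABLES_SQL = """
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND engine <> 'InnoDB'
  AND table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY table_schema, table_name
"""


def alter_statements(rows):
    """Build one ALTER TABLE statement per (schema, table, engine) row.

    Hypothetical helper: it only generates text for review and
    never executes anything against the database.
    """
    return [
        f"ALTER TABLE `{schema}`.`{table}` ENGINE=InnoDB;"
        for schema, table, _engine in rows
    ]


if __name__ == "__main__":
    # Example rows, shaped like the output of the query above.
    sample = [("shop", "orders_archive", "MyISAM")]
    for statement in alter_statements(sample):
        print(statement)
```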

All tables should have a Primary Key

This is a (very) good practice in MySQL in general, but with Galera Cluster it is even more important to ensure that every table has a Primary Key defined, for several reasons. First, as in regular MySQL, InnoDB tables without a PK suffer a performance hit. In addition, Galera does not support DELETE operations on tables without a Primary Key. Schema changes can be even more problematic. We are not going into details here, but when working with Galera you commonly use external tools to perform schema changes, mainly pt-online-schema-change. This tool requires either a primary key or a unique index on the table. Without one, it cannot perform the schema change and you are forced to run the schema change manually, which may or may not have an impact on the whole database.
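Tables without a Primary Key can be found through information_schema. The sketch below assumes you already have a DB-API style cursor from your MySQL driver of choice; tables_without_primary_key is a hypothetical helper name, and the list of excluded system schemas is an assumption you may need to adjust.

```python
# Finds base tables that have no PRIMARY KEY constraint.
TABLES_WITHOUT_PK_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY t.table_schema, t.table_name
"""


def tables_without_primary_key(cursor):
    """Return 'schema.table' names for tables lacking a Primary Key.

    `cursor` is assumed to be any DB-API cursor connected to MySQL.
    """
    cursor.execute(TABLES_WITHOUT_PK_SQL)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

Note that this only checks for a Primary Key. A table with a unique index but no Primary Key would still be listed, even though pt-online-schema-change could work with it.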

Transaction handling differences

Transactions work differently in Galera Cluster in several ways. First of all, Galera Cluster uses optimistic concurrency control, which may result in rollbacks at the commit stage. In real life this means that when you execute a transaction on one Galera node while a conflicting transaction is executed on another node at the same time, both can proceed locally on their respective nodes without seeing a conflict, but when they are certified cluster-wide at commit time, one of them will be rolled back. If you write to multiple nodes in the Galera cluster, you have to be ready for such rollbacks, and the application has to handle them properly, for example by re-executing the transaction.
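Re-executing a rolled-back transaction can be sketched as a small retry wrapper. Galera nodes report a certification conflict to the client as a deadlock (MySQL error 1213, ER_LOCK_DEADLOCK). Everything else here is an assumption for illustration: DatabaseError is a hypothetical stand-in for your driver's exception type, and run_with_retry, its defaults and the backoff policy are not part of any library.

```python
import random
import time

# MySQL error code for "Deadlock found when trying to get lock";
# Galera uses it to signal a certification conflict.
ER_LOCK_DEADLOCK = 1213


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, base_delay=0.05,
                   sleep=time.sleep):
    """Call `transaction()` and retry it when it fails with error 1213.

    `transaction` must contain the whole unit of work (BEGIN to COMMIT),
    because a rolled-back transaction has to be re-executed from the start.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a conflict, and give up
            # once the last attempt has failed.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            # Exponential backoff with jitter, so that conflicting
            # clients do not retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.5))
```

The wrapper retries only on error 1213 on purpose: other errors, such as a duplicate key, would fail again in exactly the same way.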

A standalone MySQL server supports distributed (XA) transactions. Galera does not.

Finally, even though Galera does not impose a physical limit on transaction size, very large transactions can hurt performance. For that reason Galera Cluster comes with options that limit a transaction both in the number of rows it modifies (wsrep_max_ws_rows) and in the size of its write set (wsrep_max_ws_size).
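A common way to stay under such limits is to split a large modification into many small transactions. The sketch below shows only the splitting step, assuming the table has a numeric Primary Key; id_batches is a hypothetical helper and the batch size is arbitrary.

```python
def id_batches(min_id, max_id, batch_size):
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


if __name__ == "__main__":
    # Each range would be processed in its own transaction, for example
    # with a DELETE ... WHERE id BETWEEN low AND high on your own table.
    for low, high in id_batches(1, 25, 10):
        print(low, high)
```

Each range can then be committed separately, so no single write set grows with the total number of rows being changed.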

There are also several smaller issues. Some character sets are not supported: using UTF-16, UTF-32 or UCS-2 may, under some circumstances, lead to a crash. Table locking is not supported either, so statements like LOCK TABLES or UNLOCK TABLES will not work properly. Table-based logging is not supported in Galera Cluster either: the slow query log and the general log have to be written to a file instead of a table.

Performance characteristics

The main difference between a standalone MySQL node and a Galera Cluster node is in the performance. The way Galera Cluster certifies the traffic results in a commit delay that depends on the network latency between Galera nodes. We are going to look in depth into that topic in a separate blog post, but it’s very important to understand that Galera Cluster’s performance is strictly dependent on the network speed and latency. We have tested this in the past and published the results in one of the previous blog posts. On a local network you can assume performance is quite similar to a standalone node, although even then there is a slight delay of a few milliseconds. If you run benchmarks against regular MySQL and Galera Cluster, you should expect some delay. In most cases this is not a big deal, as a couple of milliseconds added to COMMIT is rarely a deal-breaker. There are cases, though, in which it seriously impacts the application: hotspots in a table. If you have a row that is frequently updated, the rate at which you can update it in the Galera cluster is limited by the latency of a network round trip to Galera’s furthest node. If that round trip takes 10 ms, you can update the row at most about 100 times per second.
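The hotspot arithmetic above is simply the inverse of the round-trip time. As a rough sketch, assuming each update to the row has to wait for one full round trip to the furthest node before the next one can commit (the function name is made up for illustration):

```python
def max_hot_row_updates_per_second(round_trip_ms):
    """Upper bound on updates per second to a single hot row.

    Assumes every commit on that row waits for one network round trip
    to the furthest Galera node; local processing time is ignored.
    """
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    return 1000.0 / round_trip_ms


if __name__ == "__main__":
    for rtt in (1, 10, 50):
        print(f"{rtt} ms round trip: "
              f"{max_hot_row_updates_per_second(rtt):.0f} updates/s at most")
```

This is only an upper bound. Real throughput will be lower once query execution and certification time are added on top of the network latency.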

As you can see, even though Galera Cluster uses InnoDB as its storage engine and every Galera node contains the whole dataset, just like regular MySQL, there are several significant differences between MySQL and Galera. They typically originate from the fact that a Galera cluster spans the network, and they make it impossible to use Galera Cluster as a drop-in replacement for standalone MySQL or asynchronous replication. Galera remains a viable option, but you have to run detailed tests while building the proof of concept to ensure that your application will work properly with it.