State transfer – SST or IST
Let’s start by discussing what SST is and why we may want to avoid it if possible. We are not going into detail about the different state transfer methods in Galera; it should suffice to say that there are two of them – Incremental State Transfer (IST) and Snapshot State Transfer (SST). IST can be described as a delta: as long as the conditions allow it, only partial data is transferred instead of the whole dataset. The main requirement, as you can guess, is that the joining node already has data, so IST will not work for a brand new node (unless we get that data onto the new node somehow, more on that later). It is, however, a great mechanism for cases where a node drops out of the cluster for a short period of time – maintenance that requires downtime, or a similar short-lived interruption. In that case the node can ask only for the missing transactions, avoiding a transfer of the whole dataset. The amount of data transferred depends on how much was written during the downtime.
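How long a node can stay away and still qualify for IST is governed by the donor’s gcache, the on-disk ring buffer where writesets are kept; once the missing transactions have been rotated out of it, a full SST is required. The cache size is set through the provider options. A minimal sketch, assuming the usual configuration file location – the 2G value is purely illustrative and should be sized against your write volume and expected downtime:

# /etc/mysql/my.cnf (illustrative) – make gcache large enough to retain
# the writesets generated during a typical maintenance window
[mysqld]
wsrep_provider_options="gcache.size=2G"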
SST, on the other hand, is pretty much a full backup of the donor node transferred to the joiner node. Typically XtraBackup (or MariaBackup) is used for it. It is a great mechanism because it is executed automatically, but the huge downside is that the whole dataset has to be copied over the network. This creates significant overhead: on the I/O subsystem, as the data has to be read from disk, and on the network, as it has to be transferred. Some increase in CPU load is also to be expected, even if it is less significant than the other factors. Keep in mind that a Galera cluster can only go as fast as its slowest node, and even though a node acting as a donor will not send flow control messages (it won’t ask other nodes to slow down), the impact on the network or on storage (for example, a SAN shared across multiple nodes) may still cause a slowdown. Moreover, a donor node is allowed to lag freely, so it should not serve production queries, and taking one node out of the pool increases the load on the nodes that remain.
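Which method the cluster uses for SST is configured via wsrep_sst_method. The exact set of supported methods depends on the Galera distribution in use, but an XtraBackup-based one is a common choice; for illustration:

# /etc/mysql/my.cnf (illustrative) – the method name depends on the distribution,
# e.g. xtrabackup-v2 for Percona XtraDB Cluster, mariabackup for MariaDB
[mysqld]
wsrep_sst_method=xtrabackup-v2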
How to avoid SST?
As we have just established, SST is a resource- and time-consuming process, but we have also said that IST is not an option for new nodes with an empty data directory. Can we avoid SST in some way? Actually, yes. The solution is fairly simple: we have to provision the new node with fresh data before we attempt to join it to the cluster. How? Using backups. A backup is a must, no matter how you perform it. In some cases the backup method is exactly what SST uses – a full or incremental XtraBackup – but it doesn’t have to be; it can be a block storage snapshot, an LVM snapshot or anything else that produces a proper, consistent, recoverable backup of MySQL. Yes, taking a backup, however you do it, creates overhead and increases the load on the Galera cluster. The thing is, backups are a must-have and there is no way around making them. For a given environment we can come up with multiple backup scenarios based on the technology available in that particular setup, and we can try to minimize the impact on production, but one way or another the backups have to be taken. Adding a node via SST, on the other hand, is an additional process that takes additional time and resources. So why not take the backup we have to create anyway and use it to provision the data on the new node, avoiding an SST we might otherwise not need at all? The idea is to wait for a backup to finish, immediately restore it on the new Galera node we want to add, and configure that node so the rest of the cluster sees it as arriving with a full dataset and only some transactions missing. If all the conditions are met, we will see IST transferring just the data written since the backup was created, instead of SST copying the whole dataset from the donor node and piling its impact on top of the backup process, which runs no matter what.
The step-by-step process
We will start by copying the backup data to the node we want to add to the cluster.
root@vagrant:~/backups# scp -r back-1 192.168.10.160:/root/backups/
backup-full-2021-12-06_000007.xbstream.gz.aes256    100%  491MB 116.2MB/s   00:04
root@vagrant:~/backups# scp -r back-2 192.168.10.160:/root/backups/
backup-incr-2021-12-06_111732.xbstream.gz.aes256    100%  106MB  96.1MB/s   00:01
root@vagrant:~/backups# scp -r back-3 192.168.10.160:/root/backups/
backup-incr-2021-12-06_111822.xbstream.gz.aes256    100%   37MB 128.5MB/s   00:00
root@vagrant:~/backups# scp -r back-4 192.168.10.160:/root/backups/
backup-incr-2021-12-06_112110.xbstream.gz.aes256    100%   58MB 109.9MB/s   00:00
As you can see, the backups are compressed and encrypted with the AES-256 algorithm. If you do not encrypt your backups, you should probably start doing so, especially if you store them offsite. What we have is one full backup and three incremental backups.
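For illustration only, such a file could be produced at backup time along these lines (the paths, key file and naming scheme are assumptions; note that encrypting with -pbkdf2 means you must also pass -pbkdf2 when decrypting, which avoids the deprecation warning you will see below):

# hypothetical backup pipeline – stream, compress and encrypt in one go;
# --galera-info records the donor's cluster position alongside the backup
root@vagrant:~# xtrabackup --backup --galera-info --stream=xbstream --target-dir=/tmp \
    | gzip \
    | openssl enc -aes-256-cbc -pbkdf2 -pass file:/root/keyfile.key \
    > backup-full-$(date +%F_%H%M%S).xbstream.gz.aes256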
The next step will be to decrypt the files. We have the encryption key stored in /root/keyfile.key. We are going to proceed backup by backup and decrypt the files:
root@vagrant:~/backups/back-1# cat backup-full-2021-12-06_000007.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:../keyfile.key > backup_file.xbstream.gz
*** WARNING : deprecated key derivation used.
Using -iter or -pbkdf2 would be better.
root@vagrant:~/backups/back-1# cd ../back-2/
root@vagrant:~/backups/back-2# cat backup-incr-2021-12-06_111732.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:../keyfile.key > backup_file.xbstream1.gz
*** WARNING : deprecated key derivation used.
Using -iter or -pbkdf2 would be better.
root@vagrant:~/backups/back-2# ls
backup_file.xbstream1.gz  backup-incr-2021-12-06_111732.xbstream.gz.aes256  cmon_backup.log  cmon_backup.metadata
root@vagrant:~/backups/back-2# cd ../back-3/
root@vagrant:~/backups/back-3# cat backup-incr-2021-12-06_111822.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:../keyfile.key > backup_file.xbstream2.gz
*** WARNING : deprecated key derivation used.
Using -iter or -pbkdf2 would be better.
root@vagrant:~/backups/back-3# cd ../back-4/
root@vagrant:~/backups/back-4# cat backup-incr-2021-12-06_112110.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:../keyfile.key > backup_file.xbstream3.gz
*** WARNING : deprecated key derivation used.
Using -iter or -pbkdf2 would be better.
Now we should proceed with the installation of MySQL, as described in the previous blog post. We should also install the percona-xtrabackup-80 package, which we will use to restore the backup.
root@vagrant:~# apt install percona-xtrabackup-80
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libcurl4-openssl-dev
Suggested packages:
  libcurl4-doc libidn11-dev libkrb5-dev libldap2-dev librtmp-dev libssh2-1-dev libssl-dev pkg-config zlib1g-dev
The following NEW packages will be installed:
  libcurl4-openssl-dev percona-xtrabackup-80
0 upgraded, 2 newly installed, 0 to remove and 99 not upgraded.
Need to get 0 B/14.3 MB of archives.
After this operation, 73.5 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Selecting previously unselected package libcurl4-openssl-dev:amd64.
(Reading database ... 43060 files and directories currently installed.)
Preparing to unpack .../libcurl4-openssl-dev_7.68.0-1ubuntu2.7_amd64.deb ...
Unpacking libcurl4-openssl-dev:amd64 (7.68.0-1ubuntu2.7) ...
Selecting previously unselected package percona-xtrabackup-80.
Preparing to unpack .../percona-xtrabackup-80_8.0.26-18-1.focal_amd64.deb ...
Unpacking percona-xtrabackup-80 (8.0.26-18-1.focal) ...
Setting up libcurl4-openssl-dev:amd64 (7.68.0-1ubuntu2.7) ...
Setting up percona-xtrabackup-80 (8.0.26-18-1.focal) ...
Processing triggers for man-db (2.9.1-1) ...
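One detail worth spelling out before preparing anything: the decrypted files are still gzip-compressed xbstream archives, so they have to be decompressed and unpacked first. A sketch, using the xbstream tool shipped with the package we just installed, and assuming the directory layout referenced by the prepare commands below:

root@vagrant:~/backups# mkdir -p full inc1 inc2 inc3
root@vagrant:~/backups# zcat back-1/backup_file.xbstream.gz  | xbstream -x -C full/
root@vagrant:~/backups# zcat back-2/backup_file.xbstream1.gz | xbstream -x -C inc1/
root@vagrant:~/backups# zcat back-3/backup_file.xbstream2.gz | xbstream -x -C inc2/
root@vagrant:~/backups# zcat back-4/backup_file.xbstream3.gz | xbstream -x -C inc3/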
Now all we have to do is prepare the backups. We have to execute this step on each backup in the chain; only on the last one do we remove the --apply-log-only flag and do the full preparation.
First, the full backup:
root@vagrant:~/backups# xtrabackup --prepare --apply-log-only --target-dir=/root/backups/full/
xtrabackup: recognized server arguments: --innodb_checksum_algorithm=crc32 --innodb_log_checksums=1 --innodb_data_file_path=ibdata1:100M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=67108864 --innodb_page_size=16384 --innodb_undo_directory=./ --innodb_undo_tablespaces=2 --server-id=1000 --innodb_log_checksums=ON --innodb_redo_log_encrypt=0 --innodb_undo_log_encrypt=0
xtrabackup: recognized client arguments: --prepare=1 --apply-log-only=1 --target-dir=/root/backups/full/
xtrabackup version 8.0.26-18 based on MySQL server 8.0.26 Linux (x86_64) (revision id: 4aecf82)
xtrabackup: cd to /root/backups/full/
xtrabackup: This target seems to be not prepared yet.
Number of pools: 1
xtrabackup: xtrabackup_logfile detected: size=8388608, start_lsn=(4249326071)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = .
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 8388608
xtrabackup: inititialize_service_handles suceeded
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = .
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 8388608
.
.
.
Starting shutdown...
Log background threads are being closed...
Shutdown completed; log sequence number 4249326117
Number of pools: 1
211206 11:50:38 completed OK!
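Between steps, the xtrabackup_checkpoints file in the target directory can be used to sanity-check the LSN chain; each incremental's from_lsn must match the to_lsn recorded there. Output roughly like this – the to_lsn value matches the "incremental backup from 4249326097" line in the next step, while last_lsn is illustrative:

root@vagrant:~/backups# cat /root/backups/full/xtrabackup_checkpoints
backup_type = log-applied
from_lsn = 0
to_lsn = 4249326097
last_lsn = 4249326097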
Now, the first incremental backup. We use the full backup's directory as the target directory to which the incremental will be applied:
root@vagrant:~/backups# xtrabackup --prepare --apply-log-only --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc1
xtrabackup: recognized server arguments: --innodb_checksum_algorithm=crc32 --innodb_log_checksums=1 --innodb_data_file_path=ibdata1:100M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=67108864 --innodb_page_size=16384 --innodb_undo_directory=./ --innodb_undo_tablespaces=2 --server-id=1000 --innodb_log_checksums=ON --innodb_redo_log_encrypt=0 --innodb_undo_log_encrypt=0
xtrabackup: recognized client arguments: --prepare=1 --apply-log-only=1 --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc1
xtrabackup version 8.0.26-18 based on MySQL server 8.0.26 Linux (x86_64) (revision id: 4aecf82)
incremental backup from 4249326097 is enabled.
xtrabackup: cd to /root/backups/full/
xtrabackup: This target seems to be already prepared with --apply-log-only.
Number of pools: 1
xtrabackup: xtrabackup_logfile detected: size=8388608, start_lsn=(4261947755)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = /root/backups/inc1/
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 8388608
xtrabackup: inititialize_service_handles suceeded
xtrabackup: Generating a list of tablespaces
xtrabackup: Generating a list of tablespaces
Scanning './'
Completed space ID check of 2 files.
Allocated tablespace ID 20 for sbtest/sbtest14, old maximum was 0
Using undo tablespace './undo_001'.
Using undo tablespace './undo_002'.
Opened 2 existing undo tablespaces.
xtrabackup: page size for /root/backups/inc1//undo_002.delta is 16384 bytes
Applying /root/backups/inc1//undo_002.delta to ./undo_002...
xtrabackup: page size for /root/backups/inc1//undo_001.delta is 16384 bytes
Applying /root/backups/inc1//undo_001.delta to ./undo_001...
.
.
.
211206 11:51:55 [00] Copying /root/backups/inc1//xtrabackup_info to ./xtrabackup_info
211206 11:51:55 [00] ...done
211206 11:51:55 [00] Copying /root/backups/inc1//xtrabackup_tablespaces to ./xtrabackup_tablespaces
211206 11:51:55 [00] ...done
211206 11:51:55 [00] Copying /root/backups/inc1/binlog.000026 to ./binlog.000026
211206 11:51:55 [00] ...done
211206 11:51:55 [00] Copying /root/backups/inc1/binlog.index to ./binlog.index
211206 11:51:55 [00] ...done
211206 11:51:55 completed OK!
Second incremental backup:
root@vagrant:~/backups# xtrabackup --prepare --apply-log-only --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc2/
xtrabackup: recognized client arguments: --prepare=1 --apply-log-only=1 --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc2/
xtrabackup version 8.0.26-18 based on MySQL server 8.0.26 Linux (x86_64) (revision id: 4aecf82)
incremental backup from 4262451122 is enabled.
xtrabackup: cd to /root/backups/full/
xtrabackup: This target seems to be already prepared with --apply-log-only.
Number of pools: 1
xtrabackup: xtrabackup_logfile detected: size=8388608, start_lsn=(4263628253)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = /root/backups/inc2/
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 8388608
xtrabackup: inititialize_service_handles suceeded
xtrabackup: Generating a list of tablespaces
xtrabackup: Generating a list of tablespaces
Scanning './'
Completed space ID check of 2 files.
Allocated tablespace ID 20 for sbtest/sbtest14, old maximum was 0
Using undo tablespace './undo_001'.
Using undo tablespace './undo_002'.
Opened 2 existing undo tablespaces.
xtrabackup: page size for /root/backups/inc2//undo_002.delta is 16384 bytes
Applying /root/backups/inc2//undo_002.delta to ./undo_002...
xtrabackup: page size for /root/backups/inc2//undo_001.delta is 16384 bytes
Applying /root/backups/inc2//undo_001.delta to ./undo_001...
xtrabackup: page size for /root/backups/inc2//ibdata1.delta is 16384 bytes
Applying /root/backups/inc2//ibdata1.delta to ./ibdata1...
xtrabackup: page size for /root/backups/inc2//mysql/wsrep_streaming_log.ibd.delta is 16384 bytes
Applying /root/backups/inc2//mysql/wsrep_streaming_log.ibd.delta to ./mysql/wsrep_streaming_log.ibd...
xtrabackup: page size for /root/backups/inc2//mysql/wsrep_cluster_members.ibd.delta is 16384 bytes
Applying /root/backups/inc2//mysql/wsrep_cluster_members.ibd.delta to ./mysql/wsrep_cluster_members.ibd...
.
.
.
211206 11:52:27 [00] Copying /root/backups/inc2/binlog.000027 to ./binlog.000027
211206 11:52:27 [00] ...done
211206 11:52:27 [00] Copying /root/backups/inc2/binlog.index to ./binlog.index
211206 11:52:27 [00] ...done
211206 11:52:27 completed OK!
Finally, the third incremental. Here we have removed the --apply-log-only flag, as we want the whole directory to be fully prepared. A fully prepared directory can no longer accept incremental backups, so it is important to do the final prepare only after the last incremental has been applied.
root@vagrant:~/backups# xtrabackup --prepare --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc3/
xtrabackup: recognized server arguments: --innodb_checksum_algorithm=crc32 --innodb_log_checksums=1 --innodb_data_file_path=ibdata1:100M:autoextend --innodb_log_files_in_group=2 --innodb_log_file_size=67108864 --innodb_page_size=16384 --innodb_undo_directory=./ --innodb_undo_tablespaces=2 --server-id=1000 --innodb_log_checksums=ON --innodb_redo_log_encrypt=0 --innodb_undo_log_encrypt=0
xtrabackup: recognized client arguments: --prepare=1 --target-dir=/root/backups/full/ --incremental-dir=/root/backups/inc3/
xtrabackup version 8.0.26-18 based on MySQL server 8.0.26 Linux (x86_64) (revision id: 4aecf82)
incremental backup from 4263827342 is enabled.
xtrabackup: cd to /root/backups/full/
xtrabackup: This target seems to be already prepared with --apply-log-only.
Number of pools: 1
xtrabackup: xtrabackup_logfile detected: size=8388608, start_lsn=(4267465443)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = /root/backups/inc3/
xtrabackup: innodb_log_files_in_group = 1
xtrabackup: innodb_log_file_size = 8388608
xtrabackup: inititialize_service_handles suceeded
xtrabackup: Generating a list of tablespaces
xtrabackup: Generating a list of tablespaces
Scanning './'
Completed space ID check of 2 files.
Allocated tablespace ID 20 for sbtest/sbtest14, old maximum was 0
Using undo tablespace './undo_001'.
Using undo tablespace './undo_002'.
Opened 2 existing undo tablespaces.
xtrabackup: page size for /root/backups/inc3//undo_002.delta is 16384 bytes
Applying /root/backups/inc3//undo_002.delta to ./undo_002...
xtrabackup: page size for /root/backups/inc3//undo_001.delta is 16384 bytes
Applying /root/backups/inc3//undo_001.delta to ./undo_001...
xtrabackup: page size for /root/backups/inc3//ibdata1.delta is 16384 bytes
Applying /root/backups/inc3//ibdata1.delta to ./ibdata1...
xtrabackup: page size for /root/backups/inc3//mysql/wsrep_streaming_log.ibd.delta is 16384 bytes
Applying /root/backups/inc3//mysql/wsrep_streaming_log.ibd.delta to ./mysql/wsrep_streaming_log.ibd...
xtrabackup: page size for /root/backups/inc3//mysql/wsrep_cluster_members.ibd.delta is 16384 bytes
Applying /root/backups/inc3//mysql/wsrep_cluster_members.ibd.delta to ./mysql/wsrep_cluster_members.ibd...
xtrabackup: page size for /root/backups/inc3//mysql/wsrep_cluster.ibd.delta is 16384 bytes
Applying /root/backups/inc3//mysql/wsrep_cluster.ibd.delta to ./mysql/wsrep_cluster.ibd...
xtrabackup: page size for /root/backups/inc3//sys/sys_config.ibd.delta is 16384 bytes
Applying /root/backups/inc3//sys/sys_config.ibd.delta to ./sys/sys_config.ibd...
xtrabackup: page size for /root/backups/inc3//sbtest/sbtest25.ibd.delta is 16384 bytes
.
.
.
211206 11:53:00 [00] Copying /root/backups/inc3/binlog.index to ./binlog.index
211206 11:53:00 [00] ...done
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:100M:autoextend
xtrabackup: innodb_log_group_home_dir = .
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 67108864
PUNCH HOLE support available
Uses event mutexes
GCC builtin __atomic_thread_fence() is used for memory barrier
Compressed tables use zlib 1.2.11
Number of pools: 1
Using CPU crc32 instructions
Directories to scan './'
Scanning './'
Completed space ID check of 39 files.
Initializing buffer pool, total size = 128.000000M, instances = 1, chunk size =128.000000M
Completed initialization of buffer pool
page_cleaner coordinator priority: -20
page_cleaner worker priority: -20
Creating log file ./ib_logfile101
page_cleaner worker priority: -20
page_cleaner worker priority: -20
Creating log file ./ib_logfile1
Renaming log file ./ib_logfile101 to ./ib_logfile0
New log files created, LSN=4271069708
Starting to parse redo log at lsn = 4271069708, whereas checkpoint_lsn = 4271069708 and start_lsn = 4271069696
Log background threads are being started...
Applying a batch of 0 redo log records ...
Apply batch completed!
Using undo tablespace './undo_001'.
Using undo tablespace './undo_002'.
Opened 2 existing undo tablespaces.
GTID recovery trx_no: 6714952
Parallel initialization of rseg complete
Time taken to initialize rseg using 2 thread: 7339 ms.
Removed temporary tablespace data file: "ibtmp1"
Creating shared tablespace for temporary tables
Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
File './ibtmp1' size is now 12 MB.
Scanning temp tablespace dir:'./#innodb_temp/'
Created 128 and tracked 128 new rollback segment(s) in the temporary tablespace. 128 are now active.
8.0.26 started; log sequence number 4271069718
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
FTS optimize thread exiting.
Trying to access missing tablespace 4294967294
Starting shutdown...
Log background threads are being closed...
Shutdown completed; log sequence number 4271069718
211206 11:53:02 completed OK!
That's it, the backup is ready to use. It's time to remove the contents of the MySQL data directory, copy the backup into it and, finally, set the proper owner and group on the data directory:
root@vagrant:~/backups# rm -rf /var/lib/mysql/*
root@vagrant:~/backups# cp -r /root/backups/full/* /var/lib/mysql/
root@vagrant:~/backups# chown -R mysql.mysql /var/lib/mysql/
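One caveat before starting the server: for the cluster to treat this node as one that already has data, the data directory must contain a valid Galera state. Provided the backup was taken with the --galera-info option, XtraBackup saves the donor's position in xtrabackup_galera_info, and if the restore did not leave a grastate.dat behind, it can be recreated from that file. A sketch, using the cluster UUID and seqno visible in the logs below (verify the exact format expected by your Galera version):

root@vagrant:~/backups# cat /var/lib/mysql/xtrabackup_galera_info
625a34e6-4784-11ec-899c-27548f28aad3:3030513
root@vagrant:~/backups# cat > /var/lib/mysql/grastate.dat <<EOF
# GALERA saved state
version: 2.1
uuid:    625a34e6-4784-11ec-899c-27548f28aad3
seqno:   3030513
EOF
root@vagrant:~/backups# chown mysql.mysql /var/lib/mysql/grastate.dat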
Finally, start MySQL:
root@vagrant:~/backups# systemctl start mysql
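To follow the join in real time, we can tail the MySQL error log (the path below is an assumption; it depends on your distribution and configuration):

root@vagrant:~# tail -f /var/log/mysql/error.log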
MySQL should begin its startup process, and the log files will show what is going on. First, the logs from the joiner's standpoint:
2021-12-06T12:02:38.651983Z 11 [Note] [MY-000000] [WSREP] Starting applier thread 11
2021-12-06T12:02:38.653432Z 12 [Note] [MY-000000] [WSREP] Starting applier thread 12
2021-12-06T12:02:38.653689Z 13 [Note] [MY-000000] [WSREP] Starting applier thread 13
2021-12-06T12:02:38.659104Z 3 [Note] [MY-000000] [Galera] Recovered view from SST:
id: 625a34e6-4784-11ec-899c-27548f28aad3:3029175
status: primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 2
members(4):
0: 17a36464-5427-11ec-a8e2-bb253e38d35e, 192.168.10.121
1: 427b39ef-5427-11ec-973c-3a7f10cf6e5a, 192.168.10.123
2: 512b2db4-568c-11ec-8f5d-93305897c81e, 192.168.10.160
3: e4f38c9e-5426-11ec-8a79-4bed6a113cca, 192.168.10.122
2021-12-06T12:02:38.659133Z 3 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-12-06T12:02:38.659268Z 14 [Note] [MY-000000] [WSREP] Recovered cluster id 625a34e6-4784-11ec-899c-27548f28aad3
2021-12-06T12:02:38.661236Z 3 [Note] [MY-000000] [Galera] SST received: 625a34e6-4784-11ec-899c-27548f28aad3:3030513
2021-12-06T12:02:38.661303Z 1 [Note] [MY-000000] [Galera] str_proto_ver_: 3 sst_seqno_: 3030513 cc_seqno: 3029175 req->ist_len(): 74
2021-12-06T12:02:38.661327Z 1 [Note] [MY-000000] [Galera] Installed new state from SST: 625a34e6-4784-11ec-899c-27548f28aad3:3030513
2021-12-06T12:02:38.661676Z 3 [System] [MY-000000] [WSREP] SST completed
2021-12-06T12:02:38.662150Z 1 [Note] [MY-000000] [Galera] Cert. index preload up to 3030513
2021-12-06T12:02:38.662205Z 0 [Note] [MY-000000] [Galera] ####### IST applying starts with 3030514
2021-12-06T12:02:38.662667Z 0 [Note] [MY-000000] [Galera] ####### IST current seqno initialized to 3012737
2021-12-06T12:02:38.662934Z 0 [Note] [MY-000000] [Galera] Receiving IST... 0.0% ( 0/16439 events) complete.
2021-12-06T12:02:38.662956Z 0 [Note] [MY-000000] [Galera] IST preload starting at 3012737
2021-12-06T12:02:38.663855Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2021-12-06T12:02:38.663898Z 0 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:3012736, protocol version: 5
2021-12-06T12:02:38.892066Z 0 [Note] [MY-000000] [Galera] ####### Passing IST CC 3029175, must_apply: 0, preload: true
2021-12-06T12:02:38.892137Z 0 [Note] [MY-000000] [Galera] REPL Protocols: 10 (5)
2021-12-06T12:02:38.892152Z 0 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 3029174 -> 3029175
2021-12-06T12:02:38.892190Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2021-12-06T12:02:38.892240Z 0 [Note] [MY-000000] [Galera] Recording CC from preload: 3029175
2021-12-06T12:02:38.892253Z 0 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from preload: 3012737
2021-12-06T12:02:38.892261Z 0 [Note] [MY-000000] [Galera] Min available from gcache for CC from preload: 3012737
2021-12-06T12:02:38.892331Z 0 [Note] [MY-000000] [Galera] Receiving IST...100.0% (16439/16439 events) complete.
2021-12-06T12:02:38.892698Z 1 [Note] [MY-000000] [Galera] Cert. index preloaded up to 3029175
2021-12-06T12:02:38.893692Z 1 [Note] [MY-000000] [Galera] Recording CC from sst: 3030513
2021-12-06T12:02:38.893726Z 1 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from sst: 3012737
2021-12-06T12:02:38.893732Z 1 [Note] [MY-000000] [Galera] Min available from gcache for CC from sst: 3012737
2021-12-06T12:02:38.907603Z 0 [Note] [MY-000000] [Galera] 2.0 (192.168.10.160): State transfer from 0.0 (192.168.10.121) complete.
2021-12-06T12:02:38.907721Z 0 [Note] [MY-000000] [Galera] SST leaving flow control
2021-12-06T12:02:38.907734Z 0 [Note] [MY-000000] [Galera] Shifting JOINER -> JOINED (TO: 3031105)
2021-12-06T12:02:39.475243Z 0 [Note] [MY-000000] [Galera] Member 2.0 (192.168.10.160) synced with group.
2021-12-06T12:02:39.475270Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 3031138)
2021-12-06T12:02:39.743347Z 13 [Note] [MY-000000] [Galera] Server 192.168.10.160 synced with group
2021-12-06T12:02:39.743389Z 13 [Note] [MY-000000] [WSREP] Server status change joined -> synced
2021-12-06T12:02:39.743395Z 13 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections
As you can see, IST has been performed. Among the lines that indicate it are these:
2021-12-06T12:02:38.662205Z 0 [Note] [MY-000000] [Galera] ####### IST applying starts with 3030514
2021-12-06T12:02:38.662667Z 0 [Note] [MY-000000] [Galera] ####### IST current seqno initialized to 3012737
2021-12-06T12:02:38.662934Z 0 [Note] [MY-000000] [Galera] Receiving IST... 0.0% ( 0/16439 events) complete.
2021-12-06T12:02:38.662956Z 0 [Note] [MY-000000] [Galera] IST preload starting at 3012737
On the donor node you can also find traces showing that IST was served:
2021-12-06T12:02:35.013504Z 0 [Note] [MY-000000] [Galera] Shifting DONOR/DESYNCED -> JOINED (TO: 3030627)
2021-12-06T12:02:35.014861Z 0 [Note] [MY-000000] [Galera] Member 0.0 (192.168.10.121) synced with group.
2021-12-06T12:02:35.014907Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 3030627)
2021-12-06T12:02:35.014927Z 11 [Note] [MY-000000] [Galera] Server 192.168.10.121 synced with group
2021-12-06T12:02:35.014936Z 11 [Note] [MY-000000] [WSREP] Server status change joined -> synced
2021-12-06T12:02:35.014940Z 11 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections
2021-12-06T12:02:35.015453Z 11 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-12-06T12:02:43.557745Z 0 [Note] [MY-000000] [Galera] 2.0 (192.168.10.160): State transfer from 0.0 (192.168.10.121) complete.
2021-12-06T12:02:43.574714Z 0 [Note] [MY-000000] [Galera] async IST sender served
2021-12-06T12:02:44.126546Z 0 [Note] [MY-000000] [Galera] Member 2.0 (192.168.10.160) synced with group.
We can see that the asynchronous IST has been served and all nodes are now synchronized with the group.
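As a final sanity check, the wsrep status variables on the new node should report a synced state and the expected cluster size, something along these lines:

root@vagrant:~# mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 4     |
+--------------------+-------+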
As we have shown, it is possible to avoid transferring the whole dataset when adding a new Galera node to the cluster. If backups are already scheduled, we can use them to pre-provision the new node with data. Then, hopefully, depending on the volume of writes hitting the Galera cluster in the meantime, IST will suffice and we avoid copying the whole dataset from donor to joiner all over again, saving disk I/O, network throughput and CPU cycles.