In the previous blog post we split one shard into several smaller ones to accommodate a significant increase in load on one of the accounts. In this post we are going to pretend that this condition no longer exists and bring the cluster back to its original shape.
If you would like to follow our steps, you can clone this repository: vitesstests
Currently we have five shards, each with two pods, a primary and a replica:
root@k8smaster:~/vitesstests# vtctlclient listalltablets | grep newsbtest
zone1-0349226440 newsbtest 55-aa primary 10.244.2.3:15000 10.244.2.3:3306 [] 2021-09-25T17:20:38Z
zone1-0778238830 newsbtest b29f900000000000-b29f900000000001 primary 10.244.3.39:15000 10.244.3.39:3306 [] 2021-09-30T13:43:05Z
zone1-1504968304 newsbtest -55 primary 10.244.2.254:15000 10.244.2.254:3306 [] 2021-09-25T08:01:41Z
zone1-1572740937 newsbtest b29f900000000001- primary 10.244.2.5:15000 10.244.2.5:3306 [] 2021-09-30T13:35:45Z
zone1-1676955594 newsbtest -55 replica 10.244.1.49:15000 10.244.1.49:3306 [] <null>
zone1-2150384904 newsbtest b29f900000000000-b29f900000000001 replica 10.244.1.51:15000 10.244.1.51:3306 [] <null>
zone1-2214114162 newsbtest aa-b29f900000000000 primary 10.244.2.6:15000 10.244.2.6:3306 [] 2021-09-30T13:35:43Z
zone1-3429635106 newsbtest aa-b29f900000000000 replica 10.244.3.38:15000 10.244.3.38:3306 [] <null>
zone1-3839184014 newsbtest b29f900000000001- replica 10.244.3.37:15000 10.244.3.37:3306 [] <null>
zone1-4162850680 newsbtest 55-aa replica 10.244.1.48:15000 10.244.1.48:3306 [] <null>
What we did was to create a shard that contains a single problematic ID: b29f900000000000-b29f900000000001. Now we want to merge this shard back into the others, returning to a three-shard setup: -55, 55-aa and aa-. We can approach this in two ways. We can use one of the old YAML files to create equally partitioned shards, or we can stick to custom shards and just create one more shard covering the range aa-. Either way, we then repartition the data and, eventually, wipe out the three shards that we recently created.
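For the merge to be valid, the three source shards must line up exactly to cover the target range aa-: each shard's end is the next shard's start, and the last shard is open-ended. A quick shell sanity check illustrates this; the ranges come from the listing above, but the check itself is just an illustration, not part of the Vitess tooling:

```shell
# Verify the source shards form a contiguous cover of the target range aa-.
# Each range is "start-end"; an empty end means "up to the maximum keyspace ID".
ranges="aa-b29f900000000000 b29f900000000000-b29f900000000001 b29f900000000001-"
prev_end="aa"   # the target range starts at aa
ok=yes
for r in $ranges ; do
  start="${r%-*}"   # strip the "-end" suffix -> range start
  end="${r#*-}"     # strip the "start-" prefix -> range end
  [ "$start" = "$prev_end" ] || ok=no   # gap or overlap detected
  prev_end="$end"
done
# The cover is valid if there were no gaps and the last range is open-ended.
if [ "$ok" = yes ] && [ -z "$prev_end" ] ; then
  echo "source shards cover aa- contiguously"
fi
```

If the ranges had a gap or an overlap, Vitess would refuse the reshard; this little loop just makes the invariant explicit.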
Let’s stick to the custom sharding scheme to maintain flexibility. Technically, the outcome will be exactly the same as with equally partitioned shards, but we will have all the YAML ready to add more custom shards should we need them in the future.
What has to be done is to define a shard whose range starts at aa:
- databaseInitScriptSecret:
    name: example-cluster-config
    key: init_db.sql
  keyRange:
    start: "aa"
    end: ""
  tabletPools:
  - cell: zone1
    type: replica
    replicas: 2
    vttablet:
      extraFlags:
        db_charset: utf8mb4
        backup_storage_implementation: file
        backup_engine_implementation: xtrabackup
        xtrabackup_root_path: /usr/bin
        xtrabackup_user: root
        xtrabackup_stripes: '8'
        restore_from_backup: 'true'
        file_backup_storage_root: /mnt/backup
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
    mysqld:
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
    dataVolumeClaimTemplate:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi
    extraVolumes:
    - name: backupvol
      persistentVolumeClaim:
        claimName: "backupvol"
        accessModes: ["ReadWriteMany"]
        resources:
          requests:
            storage: 100Gi
        volumeName: backup
    extraVolumeMounts:
    - name: backupvol
      mountPath: /mnt
This definition is located in the file 110_merge_newsbtest.yaml, which we can apply. Then, once the new pods are created, we can start the merge reshard process:
root@k8smaster:~/vitesstests# vtctlclient Reshard -source_shards 'aa-b29f900000000000,b29f900000000000-b29f900000000001,b29f900000000001-' -target_shards 'aa-' Create newsbtest.reshard
As before, we should wait until it completes. We can check the progress of the reshard process:
root@k8smaster:~/vitesstests# vtctlclient Reshard Progress newsbtest.reshard
Copy Progress (approx):
sbtest2: rows copied 61731/1620865 (3%), size copied 14172160/370606080 (3%)
sbtest1: rows copied 410084/1621541 (25%), size copied 146489344/365363200 (40%)

Following vreplication streams are running for workflow newsbtest.reshard:
id=1 on aa-/zone1-3262256522: Status: Copying. VStream Lag: 0s.
id=2 on aa-/zone1-3262256522: Status: Running. VStream Lag: 0s.
id=3 on aa-/zone1-3262256522: Status: Copying. VStream Lag: 0s.
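The percentages in that output are simply the copied count over the estimated total. For example, the sbtest1 row figure can be recomputed with integer shell arithmetic (shown only to decode the numbers):

```shell
# sbtest1 from the progress output: 410084 of an estimated 1621541 rows copied.
copied=410084
total=1621541
echo "$(( copied * 100 / total ))%"   # integer percentage, as Reshard Progress reports it
```

The row and size percentages can differ (25% vs 40% above) because the totals are estimates and row sizes vary.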
When the process completes, as usual, we switch the traffic:
root@k8smaster:~/vitesstests# vtctlclient Reshard Progress newsbtest.reshard
Copy Completed.

Following vreplication streams are running for workflow newsbtest.reshard:
id=1 on aa-/zone1-3262256522: Status: Running. VStream Lag: 0s.
id=2 on aa-/zone1-3262256522: Status: Running. VStream Lag: 0s.
id=3 on aa-/zone1-3262256522: Status: Running. VStream Lag: 0s.

root@k8smaster:~/vitesstests# vtctlclient Reshard SwitchTraffic newsbtest.reshard
.
.
.
SwitchTraffic was successful for workflow newsbtest.reshard
Start State: Reads Not Switched. Writes Not Switched
Current State: All Reads Switched. Writes Switched
Then we can drain the unused tablets:
root@k8smaster:~/vitesstests# for vt in $(vtctlclient ListAllTablets | grep b29f90000000000 | awk '{print $1}') ; do pod="$(kubectl get pod | grep ${vt} | awk '{print $1}')" ; echo ${pod} ; kubectl annotate pod ${pod} drain.planetscale.com/started="Cleanup of shard merge" ; done
example-vttablet-zone1-0778238830-282e9c13
pod/example-vttablet-zone1-0778238830-282e9c13 annotated
example-vttablet-zone1-1572740937-a6805ba1
pod/example-vttablet-zone1-1572740937-a6805ba1 annotated
example-vttablet-zone1-2150384904-7dbc6918
pod/example-vttablet-zone1-2150384904-7dbc6918 annotated
example-vttablet-zone1-2214114162-a3c19b79
pod/example-vttablet-zone1-2214114162-a3c19b79 annotated
example-vttablet-zone1-3429635106-554af44a
pod/example-vttablet-zone1-3429635106-554af44a annotated
example-vttablet-zone1-3839184014-2e383f9b
pod/example-vttablet-zone1-3839184014-2e383f9b annotated
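The filtering in that loop can be sketched in isolation: grep picks every tablet whose shard range contains the b29f90000000000 prefix (which matches all three source shards), and awk keeps only the tablet alias. Here a few abbreviated sample lines from the earlier listing stand in for the live vtctlclient call:

```shell
# Sample ListAllTablets-style output (abbreviated from the listing above).
tablets='zone1-0349226440 newsbtest 55-aa primary
zone1-0778238830 newsbtest b29f900000000000-b29f900000000001 primary
zone1-1572740937 newsbtest b29f900000000001- primary
zone1-1676955594 newsbtest -55 replica'

# Same pipeline as in the drain loop: match the merged-away shards, keep aliases.
echo "$tablets" | grep 'b29f90000000000' | awk '{print $1}'
```

Only the tablets in the soon-to-be-removed shards come out, which is exactly the set the loop then annotates for draining.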
Finally, we can apply 111_cleanup_after_merge.yaml and, if needed, bounce the vttablet containers to clean up unneeded pods. In our case, this time, it was not needed:
root@k8smaster:~/vitesstests# kubectl get pod
NAME                                              READY  STATUS       RESTARTS  AGE
pod/example-etcd-faf13de3-1                        1/1    Running      0         19d
pod/example-etcd-faf13de3-2                        1/1    Running      0         19d
pod/example-etcd-faf13de3-3                        1/1    Running      0         19d
pod/example-vttablet-zone1-0349226440-35dab1bc     3/3    Running      227       10d
pod/example-vttablet-zone1-0778238830-282e9c13     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-1463074389-a4c6b61f     3/3    Running      1         2d22h
pod/example-vttablet-zone1-1504968304-96f9a1bf     3/3    Running      1         10d
pod/example-vttablet-zone1-1572740937-a6805ba1     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-1676955594-dc39347b     3/3    Running      1         10d
pod/example-vttablet-zone1-2150384904-7dbc6918     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-2179083526-f3060bc1     3/3    Running      1         18d
pod/example-vttablet-zone1-2214114162-a3c19b79     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-2344898534-e9abaf0e     3/3    Running      1         19d
pod/example-vttablet-zone1-2646235096-9ba85582     3/3    Running      1         19d
pod/example-vttablet-zone1-3262256522-ea0a10a7     3/3    Running      1         2d22h
pod/example-vttablet-zone1-3429635106-554af44a     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-3839184014-2e383f9b     3/3    Terminating  1         5d6h
pod/example-vttablet-zone1-4162850680-b78f527c     3/3    Running      227       10d
pod/example-zone1-vtctld-1d4dcad0-64668cccc8-swmj4 1/1    Running      1         19d
pod/example-zone1-vtgate-bc6cde92-8665cd4df-kwgcn  1/1    Running      1         19d
pod/vitess-operator-f44545df8-l5kk9                1/1    Running      0         19d
Everything terminated nicely and gracefully. The last step we had to perform was to recycle the used PVs (remember, we are still working on a poor man’s Kubernetes cluster):
root@k8smaster:~/vitesstests# for pv in $(kubectl get pv | grep Failed | awk '{print $1}' | cut -d / -f 2) ; do echo ${pv} ; rm -rf /storage/${pv}/ ; done
pv1
pv14
pv15
pv16
pv3
pv5
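The PV cleanup loop works the same way as the drain loop: filter the `kubectl get pv` output down to volumes in the Failed state and take the volume name. A standalone sketch, with sample data standing in for the live call (the column layout is assumed to follow the usual `kubectl get pv` format, with STATUS as the fifth column):

```shell
# Sample `kubectl get pv`-style lines: NAME CAPACITY ACCESS RECLAIM STATUS CLAIM
pvs='pv1   100Gi  RWO  Retain  Failed  newsbtest-data
pv2   100Gi  RWO  Retain  Bound   newsbtest-data
pv14  100Gi  RWO  Retain  Failed  newsbtest-data'

# Keep only Failed volumes; their names are what the loop then removes on disk.
echo "$pvs" | awk '$5 == "Failed" {print $1}'
```

Volumes left in Failed state after their claims are deleted are exactly the ones whose backing directories we wipe by hand on this test cluster.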
As you can see, the merge process, pretty much like every other operation on shards, is very straightforward and easy to perform on a Vitess cluster deployed on Kubernetes using the operator. If ease of scalability is what you are after, Vitess is definitely something you should consider.
This blog concludes this part of the Vitess series. We have gone through adding replicas, adding shards, applying custom shard ranges and, finally, scaling down by merging shards. We will continue looking at Vitess; next time we will try to understand how to deal with traffic and see what options are available to optimize access and queries.