Let’s just assume we have a clean Kubernetes cluster waiting for pods to show up. Nothing has been deployed so far:
root@k8smaster:/vagrant/sbtest_cluster# kubectl get pods,pv,pvc,svc
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   21d
root@k8smaster:/vagrant/sbtest_cluster# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
k8smaster   Ready    control-plane,master   21d   v1.20.9
k8snode1    Ready    <none>                 21d   v1.20.9
k8snode2    Ready    <none>                 21d   v1.20.9
k8snode3    Ready    <none>                 21d   v1.20.9
If you would like to follow our steps, you can clone this repository: vitesstests
It contains all of the YAML (and other) files that we’ll use in the next couple of blog posts.
Deploying Vitess operator
The first step is to deploy the Vitess operator. We can get it from GitHub: https://github.com/vitessio/vitess.git
Once we have it cloned, we can spin up the operator:
root@k8smaster:~# cd /root/vitess/examples/operator/
root@k8smaster:~/vitess/examples/operator# kubectl apply -f operator.yaml
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/etcdlockservers.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessbackups.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessbackupstorages.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitesscells.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessclusters.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitesskeyspaces.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessshards.planetscale.com created
serviceaccount/vitess-operator created
role.rbac.authorization.k8s.io/vitess-operator created
rolebinding.rbac.authorization.k8s.io/vitess-operator created
Warning: scheduling.k8s.io/v1beta1 PriorityClass is deprecated in v1.14+, unavailable in v1.22+; use scheduling.k8s.io/v1 PriorityClass
priorityclass.scheduling.k8s.io/vitess created
priorityclass.scheduling.k8s.io/vitess-operator-control-plane created
deployment.apps/vitess-operator created
We can now see that the operator pod has shown up:
root@k8smaster:/vagrant/sbtest_cluster# kubectl get pods,pv,pvc,svc
NAME                                  READY   STATUS    RESTARTS   AGE
pod/vitess-operator-f44545df8-mf2f8   1/1     Running   0          32s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   21d
Create persistent volumes
Before we start deploying the Vitess cluster, we need to have persistent volumes ready. As we are using a “poor man’s” Kubernetes cluster, what we have is a shared NFS volume attached to all cluster nodes. Of course, everything is running on multiple VMs on a local server, but that’s a separate story. What we have to do is apply this YAML:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv1
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv1"
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv2
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv2"
We are not going to paste the rest of it; in total we are going to create 20 persistent volumes in the NFS share, /storage.
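Rather than writing 20 near-identical manifests by hand, you could generate them with a small loop; this is a hypothetical helper, not part of the repository, but it produces manifests matching the snippet above:

```shell
# Generate 20 PersistentVolume manifests, one per NFS subdirectory,
# and write them all into pv.yaml as a multi-document YAML file.
for i in $(seq 1 20); do
cat <<EOF
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv${i}
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv${i}"
EOF
echo "---"
done > pv.yaml
```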
root@k8smaster:~# kubectl apply -f pv.yaml
persistentvolume/pv1 created
persistentvolume/pv2 created
persistentvolume/pv3 created
persistentvolume/pv4 created
persistentvolume/pv5 created
persistentvolume/pv6 created
persistentvolume/pv7 created
persistentvolume/pv8 created
persistentvolume/pv9 created
persistentvolume/pv10 created
persistentvolume/pv11 created
persistentvolume/pv12 created
persistentvolume/pv13 created
persistentvolume/pv14 created
persistentvolume/pv15 created
persistentvolume/pv16 created
persistentvolume/pv17 created
persistentvolume/pv18 created
persistentvolume/pv19 created
persistentvolume/pv20 created
The next step, which is not included in the “official” guide, is to create yet another persistent volume and, along with it, a persistent volume claim. While going through the official guide we noticed that for larger datasets, spinning up new replicas becomes quite time-consuming: the operator will just start a new pod and let it provision itself through replication, which quickly becomes quite slow. What we can use instead is the backup mechanism available in Vitess – new replicas can be provisioned from a backup, and replication then has to apply only the differential changes, not all of the data.
root@k8smaster:~/vitesstests# cat pv_backup.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: backup
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/storage/backup"
What we have here is a PV that will be mounted to all pods (again, a change from the original setup). We want the backups to be shared across all pods. Luckily, the directory structure that is created by the “native” backup method in Vitess allows us to share it across multiple pods – there are no conflicts.
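To illustrate why the sharing works, the file backup storage in Vitess keys each backup by keyspace, shard, and a name that embeds the backup time and the tablet alias, so two tablets never write into the same directory. A rough sketch of the naming scheme (the exact format is an assumption based on what we observed on disk, not a documented contract):

```python
from datetime import datetime

def backup_dir(root: str, keyspace: str, shard: str,
               taken_at: datetime, tablet_alias: str) -> str:
    """Approximate layout of Vitess' file backup storage:
    <root>/<keyspace>/<shard>/<date>.<time>.<tablet alias>"""
    name = taken_at.strftime("%Y-%m-%d.%H%M%S") + "." + tablet_alias
    return f"{root}/{keyspace}/{shard}/{name}"

# Backups from different tablets land in sibling directories,
# so a shared mount does not cause conflicts:
print(backup_dir("/mnt/backup", "sbtest", "-",
                 datetime(2021, 9, 16, 7, 29, 11), "zone1-2469782763"))
# → /mnt/backup/sbtest/-/2021-09-16.072911.zone1-2469782763
```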
Finally, we are going to create a PVC on this volume, we will use it later while spinning up pods.
root@k8smaster:~/vitesstests# cat pvc_backup.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backupvol
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 200Gi
Let’s quickly apply those and see what the resources look like right now:
root@k8smaster:~/vitesstests# kubectl apply -f pv_backup.yaml
persistentvolume/backup created
root@k8smaster:~/vitesstests# kubectl apply -f pvc_backup.yaml
persistentvolumeclaim/backupvol created
root@k8smaster:~/vitesstests# kubectl get pods,pv,pvc
NAME                                  READY   STATUS    RESTARTS   AGE
pod/vitess-operator-f44545df8-l5kk9   1/1     Running   0          31m

NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM               STORAGECLASS   REASON   AGE
persistentvolume/backup   200Gi      RWX            Recycle          Bound       default/backupvol                           12s
persistentvolume/pv1      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv10     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv11     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv12     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv13     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv14     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv15     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv16     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv17     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv18     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv19     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv2      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv20     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv3      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv4      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv5      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv6      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv7      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv8      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv9      200Gi      RWO            Recycle          Available                                               5m21s

NAME                              STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/backupvol   Bound    backup   200Gi      RWX                           8s
Deploy the initial cluster
We are going to deploy the first bit of our cluster using the 101_initial_cluster.yaml file. It is a modified version of the file from the Vitess guide. The main modifications, apart from the resources section (we want to use a bit more data than would fit into the originally-sized pods), are these:
vttablet:
  extraFlags:
    db_charset: utf8mb4
    backup_storage_implementation: file
    backup_engine_implementation: builtin
    restore_from_backup: 'true'
    file_backup_storage_root: /mnt/backup
Here we are passing additional flags to vttablet to configure the backup implementation (we went with builtin for now). We also configured the backup storage and ensured that a newly created vttablet will attempt to restore from a backup (if one is available) to provision itself with data.
Then we made the following changes:
extraVolumes:
  - name: backupvol
    persistentVolumeClaim:
      claimName: "backupvol"
    accessModes: ["ReadWriteMany"]
    resources:
      requests:
        storage: 100Gi
    volumeName: backup
extraVolumeMounts:
  - name: backupvol
    mountPath: /mnt
This ensures that an extra volume, created from the “backupvol” PVC, will be mounted in every vttablet. This will allow us to spin up new replicas that use backups to get the data.
Of course, in a “cloudy” environment you will probably prefer to use an S3 bucket or GCS to keep the backups; in that case the backups are easily accessible to all tablets. We did not want to use cloud services, hence the extra mount as a solution.
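For reference, switching to S3 would mostly be a matter of swapping the backup flags shown earlier. A sketch of what that could look like – the region, bucket name, and root prefix are placeholders, and we have not tested this variant ourselves:

```yaml
vttablet:
  extraFlags:
    backup_storage_implementation: s3
    s3_backup_aws_region: us-east-1
    s3_backup_storage_bucket: my-vitess-backups
    s3_backup_storage_root: backups
```

With S3 as the backup storage, the extra volume and volume mount would no longer be needed.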
We also added a new user that we’ll use for our Sysbench later on:
---
apiVersion: v1
kind: Secret
metadata:
  name: example-cluster-config
type: Opaque
stringData:
  users.json: |
    {
      "user": [{
        "UserData": "user",
        "Password": ""
      }],
      "sbtest": [{
        "UserData": "sbtest",
        "Password": "sbtest"
      }]
    }
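Since a malformed users.json will prevent clients from authenticating, a quick sanity check of the payload before applying the secret doesn’t hurt. A minimal sketch, with the expected structure taken from the snippet above:

```python
import json

# The users.json payload from the Secret above.
users_json = """
{
  "user": [{"UserData": "user", "Password": ""}],
  "sbtest": [{"UserData": "sbtest", "Password": "sbtest"}]
}
"""

users = json.loads(users_json)  # raises ValueError if the JSON is malformed
for name, entries in users.items():
    for entry in entries:
        # every entry needs UserData; Password may be empty
        assert "UserData" in entry, f"missing UserData for {name}"
print(sorted(users))  # → ['sbtest', 'user']
```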
As a result of applying this file we will get a set of pods, persistent volumes and persistent volume claims in our Kubernetes cluster:
root@k8smaster:~/vitesstests# kubectl get pods,pv,pvc
NAME                                                 READY   STATUS      RESTARTS   AGE
pod/example-etcd-faf13de3-1                          1/1     Running     0          7h8m
pod/example-etcd-faf13de3-2                          1/1     Running     0          7h8m
pod/example-etcd-faf13de3-3                          1/1     Running     0          7h8m
pod/example-vttablet-zone1-2344898534-e9abaf0e       3/3     Running     1          7h8m
pod/example-vttablet-zone1-2646235096-9ba85582       3/3     Running     1          7h8m
pod/example-zone1-vtctld-1d4dcad0-64668cccc8-swmj4   1/1     Running     1          7h8m
pod/example-zone1-vtgate-bc6cde92-8665cd4df-kwgcn    1/1     Running     1          7h8m
pod/recycler-for-pv2                                 0/1     Completed   0          8h
pod/recycler-for-pv3                                 0/1     Completed   0          8h
pod/recycler-for-pv6                                 0/1     Completed   0          8h
pod/recycler-for-pv7                                 0/1     Error       0          8h
pod/vitess-operator-f44545df8-l5kk9                  1/1     Running     0          8h

NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                                STORAGECLASS   REASON   AGE
persistentvolume/backup   200Gi      RWX            Recycle          Bound       default/backupvol                                                           7h56m
persistentvolume/pv1      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv10     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv11     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv12     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv13     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv14     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv15     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv16     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv17     200Gi      RWO            Recycle          Bound       default/example-vttablet-zone1-2646235096-9ba85582                          8h
persistentvolume/pv18     200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-1                                             8h
persistentvolume/pv19     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv2      200Gi      RWO            Recycle          Bound       default/example-vttablet-zone1-2344898534-e9abaf0e                          8h
persistentvolume/pv20     200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv3      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv4      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv5      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv6      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv7      200Gi      RWO            Recycle          Available                                                                               8h
persistentvolume/pv8      200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-3                                             8h
persistentvolume/pv9      200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-2                                             8h

NAME                                                               STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/backupvol                                    Bound    backup   200Gi      RWX                           7h56m
persistentvolumeclaim/example-etcd-faf13de3-1                      Bound    pv18     200Gi      RWO                           7h8m
persistentvolumeclaim/example-etcd-faf13de3-2                      Bound    pv9      200Gi      RWO                           7h8m
persistentvolumeclaim/example-etcd-faf13de3-3                      Bound    pv8      200Gi      RWO                           7h8m
persistentvolumeclaim/example-vttablet-zone1-2344898534-e9abaf0e   Bound    pv2      200Gi      RWO                           7h8m
persistentvolumeclaim/example-vttablet-zone1-2646235096-9ba85582   Bound    pv17     200Gi      RWO                           7h8m
Populating cluster with data
We are going to use Sysbench to populate our database with data. This is an Ansible role that we used to install it on our Kubernetes master node:
---
- name: Install packages
  apt:
    name:
      - libmysqlclient-dev
      - git
      - automake
      - make
      - libtool
      - pkg-config
      - libssl-dev
    state: latest
    install_recommends: no
    update_cache: no
  register: dependencies

- name: Checkout sysbench
  git:
    repo: https://github.com/akopytov/sysbench.git
    dest: /root/sysbench
  register: git_checkout

- name: Run autogen.sh
  command: ./autogen.sh
  args:
    chdir: /root/sysbench
  when: dependencies is changed
  register: autogen

- name: Run autoreconf
  command: autoreconf -f
  args:
    chdir: /root/sysbench
  when: autogen is changed
  register: autoreconf

- name: ./configure
  command: ./configure
  args:
    chdir: /root/sysbench
  when: autoreconf is changed
  register: configure

- name: Make
  command: make
  args:
    chdir: /root/sysbench
  when: configure is changed
  register: make

- name: Make install
  command: make install
  args:
    chdir: /root/sysbench
  when: make is changed
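To apply the role, a minimal playbook could look like the following – the role name and target host are assumptions, so adjust them to your inventory:

```yaml
---
- hosts: k8smaster
  become: yes
  roles:
    - sysbench    # the role containing the tasks above
```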
Before we can connect to Vitess, we have to set up the port forwarding and aliases:
root@k8smaster:~/vitesstests# ./pf.sh
Forwarding from 127.0.0.1:15000 -> 15000
Forwarding from [::1]:15000 -> 15000
Forwarding from 127.0.0.1:15999 -> 15999
Forwarding from [::1]:15999 -> 15999
Forwarding from 127.0.0.1:15306 -> 3306
Forwarding from [::1]:15306 -> 3306
You may point your browser to http://localhost:15000, use the following aliases as shortcuts:
alias vtctlclient="/root/go/bin/vtctlclient -server=localhost:15999 -logtostderr"
alias mysql="mysql -h 127.0.0.1 -P 15306 -u user"
Hit Ctrl-C to stop the port forwards
root@k8smaster:~/vitesstests# alias vtctlclient="/root/go/bin/vtctlclient -server=localhost:15999 -logtostderr"
root@k8smaster:~/vitesstests# alias mysql="mysql -h 127.0.0.1 -P 15306 -u user"
Now we can connect to VTGate proxy and run some queries to verify that the schema has been created and then to populate it.
root@k8smaster:~/vitesstests# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.7.9-vitess-12.0.0-SNAPSHOT Version: 12.0.0-SNAPSHOT (Git revision eb17117ffd branch 'main') built on Thu Sep 16 07:29:11 UTC 2021 by vitess@buildkitsandbox using go1.17 linux/amd64

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| sbtest             |
| information_schema |
| mysql              |
| sys                |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
The schema is in place, so let’s run sysbench. We’ll create four tables with one million rows each.
root@k8smaster:~/vitesstests# sysbench /root/sysbench/src/lua/oltp_read_write.lua --threads=4 --events=0 --time=3600 --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=15306 --tables=4 --report-interval=1 --skip-trx=on --table-size=1000000 --db-ps-mode=disable prepare
sysbench 1.1.0-ead2689 (using bundled LuaJIT 2.1.0-beta3)
Initializing worker threads…
Creating table 'sbtest2'…
Creating table 'sbtest4'…
Creating table 'sbtest1'…
Creating table 'sbtest3'…
Inserting 1000000 records into 'sbtest2'
Inserting 1000000 records into 'sbtest4'
Inserting 1000000 records into 'sbtest1'
Inserting 1000000 records into 'sbtest3'
Depending on the size of the cluster nodes, this may take some time. Let’s stop here for now; we’ll return shortly in another blog post, where we will discuss scaling the cluster up and down.