Vitess – deploying a cluster

Let’s just assume we have a clean Kubernetes cluster waiting for pods to show up. Nothing has been deployed so far:

root@k8smaster:/vagrant/sbtest_cluster# kubectl get pods,pv,pvc,svc
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   21d
root@k8smaster:/vagrant/sbtest_cluster# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
k8smaster   Ready    control-plane,master   21d   v1.20.9
k8snode1    Ready    <none>                 21d   v1.20.9
k8snode2    Ready    <none>                 21d   v1.20.9
k8snode3    Ready    <none>                 21d   v1.20.9

If you would like to follow our steps, you can clone this repository: vitesstests

It contains all of the yaml (and other) files that we’ll use in the next couple of blog posts.

Deploying Vitess operator

The first step will be to deploy the Vitess operator. We can get it from GitHub: https://github.com/vitessio/vitess.git

Once we have it cloned, we can spin up the operator:

root@k8smaster:~# cd /root/vitess/examples/operator/
root@k8smaster:~/vitess/examples/operator# kubectl apply -f operator.yaml
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/etcdlockservers.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessbackups.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessbackupstorages.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitesscells.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessclusters.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitesskeyspaces.planetscale.com created
customresourcedefinition.apiextensions.k8s.io/vitessshards.planetscale.com created
serviceaccount/vitess-operator created
role.rbac.authorization.k8s.io/vitess-operator created
rolebinding.rbac.authorization.k8s.io/vitess-operator created
Warning: scheduling.k8s.io/v1beta1 PriorityClass is deprecated in v1.14+, unavailable in v1.22+; use scheduling.k8s.io/v1 PriorityClass
priorityclass.scheduling.k8s.io/vitess created
priorityclass.scheduling.k8s.io/vitess-operator-control-plane created
deployment.apps/vitess-operator created

We can now see that the operator pod has shown up:

root@k8smaster:/vagrant/sbtest_cluster# kubectl get pods,pv,pvc,svc
NAME                                  READY   STATUS    RESTARTS   AGE
pod/vitess-operator-f44545df8-mf2f8   1/1     Running   0          32s
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   21d

Create persistent volumes

Before we start deploying the Vitess cluster, we need to have persistent volumes ready. As we are using a “poor man’s” Kubernetes cluster, what we have is a shared NFS volume attached to all cluster nodes. Of course, everything is running on multiple VMs on a local server, but that’s a separate story. What we have to do is apply this yaml:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv1
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv1"
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv2
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv2"

We are not going to paste the rest of it. In total we are going to create 20 persistent volumes, all backed by directories in the NFS share, /storage.
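Since the twenty PV definitions differ only in the name and path, a small shell loop can generate the whole manifest. This is just a convenience sketch, assuming the same /storage mount, size and reclaim policy as above:

```shell
# Generate pv.yaml with twenty hostPath PersistentVolumes (pv1..pv20),
# each 200Gi, backed by directories on the shared NFS mount.
for i in $(seq 1 20); do
  cat <<EOF
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv${i}
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/storage/pv${i}"
EOF
done > pv.yaml
```

The generated file is equivalent to the hand-written one and can be applied in one go with kubectl apply -f pv.yaml: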

root@k8smaster:~# kubectl apply -f pv.yaml
persistentvolume/pv1 created
persistentvolume/pv2 created
persistentvolume/pv3 created
persistentvolume/pv4 created
persistentvolume/pv5 created
persistentvolume/pv6 created
persistentvolume/pv7 created
persistentvolume/pv8 created
persistentvolume/pv9 created
persistentvolume/pv10 created
persistentvolume/pv11 created
persistentvolume/pv12 created
persistentvolume/pv13 created
persistentvolume/pv14 created
persistentvolume/pv15 created
persistentvolume/pv16 created
persistentvolume/pv17 created
persistentvolume/pv18 created
persistentvolume/pv19 created
persistentvolume/pv20 created

The next step, which is not included in the “official” guide, is to create one more persistent volume and, along with it, a persistent volume claim. While going through the official guide we noticed that for larger datasets spinning up new replicas becomes quite time-consuming: the operator will just start a new pod and let it provision itself via replication, which quickly becomes slow. What we can use instead is the backup mechanism available in Vitess – new replicas can be provisioned from a backup, and replication then only has to apply the differential change, not all of the data.

root@k8smaster:~/vitesstests# cat pv_backup.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: backup
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Recycle
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/storage/backup"

What we have here is a PV that will be mounted into all pods (again, a change from the original setup). We want the backups to be shared across all pods. Luckily, the directory structure created by the “native” backup method in Vitess allows us to share it across multiple pods – there are no conflicts.

Finally, we are going to create a PVC on this volume; we will use it later when spinning up pods.

root@k8smaster:~/vitesstests# cat pvc_backup.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backupvol
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 200Gi

Let’s quickly apply those and see how the resources look right now:

root@k8smaster:~/vitesstests# kubectl apply -f pv_backup.yaml
persistentvolume/backup created
root@k8smaster:~/vitesstests# kubectl apply -f pvc_backup.yaml
persistentvolumeclaim/backupvol created
root@k8smaster:~/vitesstests# kubectl get pods,pv,pvc
NAME                                  READY   STATUS      RESTARTS   AGE
pod/vitess-operator-f44545df8-l5kk9   1/1     Running     0          31m

NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM               STORAGECLASS   REASON   AGE
persistentvolume/backup   200Gi      RWX            Recycle          Bound       default/backupvol                           12s
persistentvolume/pv1      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv10     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv11     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv12     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv13     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv14     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv15     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv16     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv17     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv18     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv19     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv2      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv20     200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv3      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv4      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv5      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv6      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv7      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv8      200Gi      RWO            Recycle          Available                                               5m21s
persistentvolume/pv9      200Gi      RWO            Recycle          Available                                               5m21s

NAME                              STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/backupvol   Bound    backup   200Gi      RWX                           8s

Deploy the initial cluster

We are going to deploy the first bit of our cluster using the 101_initial_cluster.yaml file. It is a modified version of the file from the Vitess guide. The main modifications, aside from the resources section (we want to store a bit more data than would fit into the originally-sized pods), are the following:

            vttablet:
              extraFlags:
                db_charset: utf8mb4
                backup_storage_implementation: file
                backup_engine_implementation: builtin
                restore_from_backup: 'true'
                file_backup_storage_root: /mnt/backup

Here we are passing additional flags to vttablet to configure the backup implementation (we went with builtin for now); we also configured the backup storage location and ensured that a newly created vttablet will attempt to restore from a backup (if one is available) to provision itself with data.
Then we made the following changes:

            extraVolumes:
            - name: backupvol
              persistentVolumeClaim:
                claimName: backupvol
            extraVolumeMounts:
            - name: backupvol
              mountPath: /mnt

This ensures that an extra volume, backed by the “backupvol” PVC, will be mounted into the vttablet pods. This will allow us to spin up new replicas that use backups to get the data.
Of course, in a “cloudy” environment you will probably prefer to keep the backups in an S3 bucket or GCS. In that case backups are easily accessible to all tablets. We did not want to use cloud services, hence the extra mount as a solution.
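For reference, in such a cloud setup the same extraFlags section could point at an S3 bucket instead of a shared volume. A sketch, where the region and bucket name are placeholders and AWS credentials still have to be made available to the pods (e.g. via an instance profile):

```yaml
            vttablet:
              extraFlags:
                backup_storage_implementation: s3
                s3_backup_aws_region: us-east-1
                s3_backup_storage_bucket: my-vitess-backups
                s3_backup_storage_root: backups
                restore_from_backup: 'true'
```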

We also added a new user that we’ll use for Sysbench later on:

---
apiVersion: v1
kind: Secret
metadata:
  name: example-cluster-config
type: Opaque
stringData:
  users.json: |
    {
      "user": [{
        "UserData": "user",
        "Password": ""
      }],
      "sbtest":
      [{
        "UserData": "sbtest",
        "Password": "sbtest"
      }]
    }
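The users.json format is easy to get subtly wrong (each top-level key is a username mapping to a list of credential entries), so it may be worth validating the JSON before applying the Secret. A quick sketch:

```shell
# Write out the same users.json as in the Secret above...
cat > users.json <<'EOF'
{
  "user": [{
    "UserData": "user",
    "Password": ""
  }],
  "sbtest":
  [{
    "UserData": "sbtest",
    "Password": "sbtest"
  }]
}
EOF
# ...and fail loudly if it is not well-formed JSON.
python3 -m json.tool users.json > /dev/null && echo "users.json: valid JSON"
```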

As a result of applying this file we will get a set of pods, persistent volumes and persistent volume claims in our Kubernetes cluster:

root@k8smaster:~/vitesstests# kubectl get pods,pv,pvc
NAME                                                 READY   STATUS      RESTARTS   AGE
pod/example-etcd-faf13de3-1                          1/1     Running     0          7h8m
pod/example-etcd-faf13de3-2                          1/1     Running     0          7h8m
pod/example-etcd-faf13de3-3                          1/1     Running     0          7h8m
pod/example-vttablet-zone1-2344898534-e9abaf0e       3/3     Running     1          7h8m
pod/example-vttablet-zone1-2646235096-9ba85582       3/3     Running     1          7h8m
pod/example-zone1-vtctld-1d4dcad0-64668cccc8-swmj4   1/1     Running     1          7h8m
pod/example-zone1-vtgate-bc6cde92-8665cd4df-kwgcn    1/1     Running     1          7h8m
pod/recycler-for-pv2                                 0/1     Completed   0          8h
pod/recycler-for-pv3                                 0/1     Completed   0          8h
pod/recycler-for-pv6                                 0/1     Completed   0          8h
pod/recycler-for-pv7                                 0/1     Error       0          8h
pod/vitess-operator-f44545df8-l5kk9                  1/1     Running     0          8h

NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                                STORAGECLASS   REASON   AGE
persistentvolume/backup   200Gi      RWX            Recycle          Bound       default/backupvol                                                            7h56m
persistentvolume/pv1      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv10     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv11     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv12     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv13     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv14     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv15     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv16     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv17     200Gi      RWO            Recycle          Bound       default/example-vttablet-zone1-2646235096-9ba85582                           8h
persistentvolume/pv18     200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-1                                              8h
persistentvolume/pv19     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv2      200Gi      RWO            Recycle          Bound       default/example-vttablet-zone1-2344898534-e9abaf0e                           8h
persistentvolume/pv20     200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv3      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv4      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv5      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv6      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv7      200Gi      RWO            Recycle          Available                                                                                8h
persistentvolume/pv8      200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-3                                              8h
persistentvolume/pv9      200Gi      RWO            Recycle          Bound       default/example-etcd-faf13de3-2                                              8h

NAME                                                               STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/backupvol                                    Bound    backup   200Gi      RWX                           7h56m
persistentvolumeclaim/example-etcd-faf13de3-1                      Bound    pv18     200Gi      RWO                           7h8m
persistentvolumeclaim/example-etcd-faf13de3-2                      Bound    pv9      200Gi      RWO                           7h8m
persistentvolumeclaim/example-etcd-faf13de3-3                      Bound    pv8      200Gi      RWO                           7h8m
persistentvolumeclaim/example-vttablet-zone1-2344898534-e9abaf0e   Bound    pv2      200Gi      RWO                           7h8m
persistentvolumeclaim/example-vttablet-zone1-2646235096-9ba85582   Bound    pv17     200Gi      RWO                           7h8m

Populating cluster with data

We are going to use Sysbench to populate our database with data. Below is the Ansible role that we used to install it on our Kubernetes master node:

---
-   name: Install packages
    apt:
        name:
        -   libmysqlclient-dev
        -   git
        -   automake
        -   make
        -   libtool
        -   pkg-config
        -   libssl-dev
        state: latest
        install_recommends: no
        update_cache: no
    register: dependences

-   name: Checkout sysbench
    git:
        repo: https://github.com/akopytov/sysbench.git
        dest: /root/sysbench
    register: git_checkout

-   name: Run autogen.sh
    command: ./autogen.sh
    args:
        chdir: /root/sysbench
    when: dependences is changed
    register: autogen

-   name: Run autoreconf
    command: autoreconf -f
    args:
        chdir: /root/sysbench
    when: autogen is changed
    register: autoreconf

-   name: ./configure
    command: ./configure
    args:
        chdir: /root/sysbench
    when: autoreconf is changed
    register: configure

-   name: Make
    command: make
    args:
        chdir: /root/sysbench
    when: configure is changed
    register: make

-   name: Make install
    command: make install
    args:
        chdir: /root/sysbench
    when: make is changed

Before we can connect to Vitess, we have to set up the port forwarding and aliases:

root@k8smaster:~/vitesstests# ./pf.sh
Forwarding from 127.0.0.1:15000 -> 15000
Forwarding from [::1]:15000 -> 15000
Forwarding from 127.0.0.1:15999 -> 15999
Forwarding from [::1]:15999 -> 15999
Forwarding from 127.0.0.1:15306 -> 3306
Forwarding from [::1]:15306 -> 3306
You may point your browser to http://localhost:15000, use the following aliases as shortcuts:
alias vtctlclient="/root/go/bin/vtctlclient -server=localhost:15999 -logtostderr"
alias mysql="mysql -h 127.0.0.1 -P 15306 -u user"
Hit Ctrl-C to stop the port forwards

root@k8smaster:~/vitesstests# alias vtctlclient="/root/go/bin/vtctlclient -server=localhost:15999 -logtostderr"
root@k8smaster:~/vitesstests# alias mysql="mysql -h 127.0.0.1 -P 15306 -u user"

Now we can connect to the VTGate proxy and run some queries to verify that the schema has been created, and then populate it.

root@k8smaster:~/vitesstests# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.7.9-vitess-12.0.0-SNAPSHOT Version: 12.0.0-SNAPSHOT (Git revision eb17117ffd branch 'main') built on Thu Sep 16 07:29:11 UTC 2021 by vitess@buildkitsandbox using go1.17 linux/amd64

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| sbtest             |
| information_schema |
| mysql              |
| sys                |
| performance_schema |
+--------------------+
5 rows in set (0.00 sec)
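Besides the regular schemas, vtgate also exposes some Vitess-specific state through the SQL interface; these statements make for a quick sanity check (the exact output depends on your deployment):

```sql
-- List all tablets known to vtgate, with their keyspace, shard and type
show vitess_tablets;
-- List the shards of each keyspace
show vitess_shards;
```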

The schema is in place, so let’s run sysbench. We’ll create four tables, one million rows each.

root@k8smaster:~/vitesstests# sysbench /root/sysbench/src/lua/oltp_read_write.lua --threads=4 --events=0 --time=3600 --mysql-host=127.0.0.1 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=15306 --tables=4 --report-interval=1 --skip-trx=on --table-size=1000000 --db-ps-mode=disable prepare
sysbench 1.1.0-ead2689 (using bundled LuaJIT 2.1.0-beta3)

Initializing worker threads…

Creating table 'sbtest2'…
Creating table 'sbtest4'…
Creating table 'sbtest1'…
Creating table 'sbtest3'…
Inserting 1000000 records into 'sbtest2'
Inserting 1000000 records into 'sbtest4'
Inserting 1000000 records into 'sbtest1'
Inserting 1000000 records into 'sbtest3'

Depending on the size of the cluster nodes, this may take some time. Let’s stop here for now; we’ll return shortly in another blog post, where we will discuss scaling the cluster up and down.