In previous parts of this series, we walked you through StorageClass, one of the Kubernetes objects for data persistence. Let’s now look at a related object for running stateful workloads: the StatefulSet. We’ll cover the topics below, with some hands-on practice to show you this object’s functionality.
- What is a StatefulSet?
- How do StatefulSets differ from Deployments?
- How to specify Pods inside StatefulSets?
- How to manage Volumes in a Pod?
- Why use a Service for StatefulSets?
What Is a StatefulSet?
A StatefulSet is a Kubernetes object used to deploy and manage stateful applications. Stateless applications, you may recall from a prior part of this series, are deployed using Kubernetes objects like Deployments and Pods; we then covered data persistence for managing stateful applications. So, what are stateful and stateless applications?

Stateful applications are applications that are mindful of their past and present state. In a nutshell, they keep track of their state. They store data in persistent storage and read it back later, which lets them survive service breakdowns and restarts. Database applications like MySQL and MongoDB are examples of stateful applications.

Stateless applications, on the other hand, do not track any of their state. They neither store nor read data from any storage; each interaction is basically a one-off request-and-response. When a stateless application’s current session goes down, is interrupted or is deleted, the new session starts with a clean slate, without referring to past events or processes. Examples include the Nginx web server and the Tomcat application server.
How do StatefulSets differ from Deployments?
StatefulSets vs Deployments:
| StatefulSet | Deployment |
|---|---|
| Used to deploy stateful applications. | Used to deploy stateless applications. |
| Pods created by StatefulSets have unique names that remain constant across application rescheduling. | Pods created by Deployments have dynamic, random names that change across application rescheduling. |
| Its Pods are created in sequential order and deleted in reverse sequential order. | Its Pods are created and deleted in no guaranteed order. |
| Its Pods are not interchangeable and maintain their identities after restarts. | Its Pods are interchangeable and do not maintain their identities after restarts. |
| It does not allow a shared volume: each Pod replica has its own sticky Volume and PersistentVolumeClaim. | It allows a volume to be shared via a Volume and PersistentVolumeClaim across all of the Pod replicas. |
| Replication is complex. | Replication is easier. |
So, if you need:

- unique and ordered deployment and scaling,
- distinct and stable network identities,
- steady and persistent storage across application scheduling and rescheduling,

then use a StatefulSet instead of a Deployment.
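To make the naming difference concrete, here is the kind of Pod names each controller produces for three replicas of an application named "web" (the Deployment suffixes below are illustrative, not real output; they are a random ReplicaSet hash plus a random Pod suffix):

```
StatefulSet: web-0, web-1, web-2
Deployment:  web-7c9d6b5dd-xk2lp, web-7c9d6b5dd-q8v4z, web-7c9d6b5dd-m9t7c
```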
How to Specify Pods Inside StatefulSets
Pods in a StatefulSet have a sticky and unique network identity. They are specified inside a StatefulSet by declaring a “replicas” field as a child of the “spec” property in the StatefulSet YAML manifest. The desired number of Pods is determined by the “replicas” value specified in the manifest file. The configuration will look like this:
```yaml
spec:
  selector:
    matchLabels:
      app: service-label ## This must be the same as the Pod template and Service labels
  replicas: 3 ## 1 by default. The value here determines the number of replicated Pods the StatefulSet will create; in this case, 3 Pods.
```
How to Manage Volumes in a StatefulSet’s Pods
In the last part of this series, we created a Pod that consumes storage as a volume using a PVC: a “persistentVolumeClaim” field declared in the manifest YAML file gave the Pod access to the PersistentVolume the PVC was bound to. A StatefulSet uses a different property, “volumeClaimTemplates”, under which the template name, accessModes, storageClassName and storage request fields are declared. The Pods claim their storage through this section and mount it into their containers using the “volumeMounts” field. The claim and mount configuration in a StatefulSet manifest YAML file will look like this:
```yaml
volumeMounts: ## declared inside the Pod template's container spec
- name: my-volume
  mountPath: /data/path
volumeClaimTemplates: ## declared at the StatefulSet spec level
- metadata:
    name: my-volume
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: "my-stg-class" ## Dynamic storage provisioning
    resources:
      requests:
        storage: 1Gi ## Storage request
```
You’ll see more about this when we get to the “How to Create a StatefulSet” section later in this blog post.
What Is the Role of a Service in StatefulSets?
A Service is needed in a StatefulSet for communication across Pods. It links all of the Pods in the StatefulSet and controls their network domain. So, what type of Service is suitable for a StatefulSet? A StatefulSet needs a Headless Service for Pod discovery and to maintain the Pods’ sticky network identity, one of the defining characteristics of StatefulSets. You will also need another Service type if you want to expose the application to the outside world or get an external IP. You can read more on Services in an earlier part of this series.
The Headless Service is referenced in the StatefulSet manifest YAML file by declaring a “serviceName” field as a child of a “spec” property. The value of this field must be the same as the Headless Service name.
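Putting the two pieces together, a minimal sketch of the link between the StatefulSet and its headless Service looks like this (using the names from this post’s examples; both manifests are abbreviated):

```yaml
# Headless Service: clusterIP is set to None
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  clusterIP: None
  selector:
    app: service-label
---
# The StatefulSet references the Service by name
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-set
spec:
  serviceName: "my-service" # must match the headless Service's metadata.name
  # ... rest of the StatefulSet spec
```

With this pairing, each Pod gets a stable DNS record of the form pod-name.service-name.namespace.svc.cluster.local, for example my-stateful-set-0.my-service.default.svc.cluster.local.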
How to Create a StatefulSet
A StatefulSet is created by declaring a manifest YAML file just like a Deployment, but with a different “kind” value; in this case, StatefulSet. The components highlighted above are also needed: a Headless Service, plus a PersistentVolume (PV) and PersistentVolumeClaim (PVC) if you are using static storage provisioning in a local cluster. If, however, you are using cloud provider storage from AWS, Azure or GCP, which allows for dynamic storage provisioning, creating a StorageClass object is the suitable option. You can read more on static (PV & PVC) and dynamic (StorageClass) storage provisioning methods in a previous part of this series.
Before we begin, it is advised to have a basic knowledge of Kubernetes objects like Pod, Deployment, Service, volumes & volumeMount, PV, PVC and storageClass, to follow this exercise. We will go through a hands-on practice on a running Kubernetes cluster, so it’s imperative to have one with the kubectl
command-line tool already configured to talk to the cluster. KubeOne allows you to create a Kubernetes cluster in any environment easily. Check out our documentation on this to get started. Alternatively, you can just use the Kubernetes playground to practice.
The following steps will guide you on how to create a StatefulSet and other necessary components.
Step 1: Create a manifest file for the headless Service, setting the clusterIP field to “None”:

```shell
$ vim headless-service.yaml
```
Copy the below configuration into the above file.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: service-label
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None ## Headless service
  selector:
    app: service-label
```
Step 2: Create the headless service using the kubectl create
command.
```shell
$ kubectl create -f headless-service.yaml
service/my-service created
```
Step 3: Use kubectl get
command to check the details of the service.
```shell
$ kubectl get service my-service
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
my-service   ClusterIP   None         <none>        80/TCP    37s
```
Step 4: Create a StorageClass with the below YAML manifest file. A StorageClass is used in this case because a cloud provider hosts the cluster. If you are using a local cluster, you will need to create a PV and PVC instead. Check our previous post on how to provision storage using PV and claim it using PVC.
```shell
$ vim s-class.yaml
```

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-stg-class ## Name of the SC that will be referenced in the StatefulSet manifest file
provisioner: kubernetes.io/aws-ebs ## Provisioner for the AWSElasticBlockStore plugin
volumeBindingMode: WaitForFirstConsumer ## Binding waits until a Pod that uses the claim is scheduled
```
Use kubectl create
command to create the storageClass.
```shell
$ kubectl create -f s-class.yaml
storageclass.storage.k8s.io/my-stg-class created
```
Step 5: Check the details of the StorageClass using the kubectl get
command:

```shell
$ kubectl get storageclass my-stg-class
NAME           PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
my-stg-class   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  3m37s
```
Step 6: Now copy and paste the below configuration into a YAML file with a name of your choice to create a StatefulSet object.

```shell
$ vim stateful-set.yaml
```
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-set
spec:
  selector:
    matchLabels:
      app: service-label # has to match .spec.template.metadata.labels
  serviceName: "my-service" # has to match the created Service name
  replicas: 3 # Number of Pod replicas to be created. It is 1 by default
  template:
    metadata:
      labels:
        app: service-label # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: my-volume
          mountPath: /data/path
  volumeClaimTemplates:
  - metadata:
      name: my-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-stg-class" # must match the created StorageClass name
      resources:
        requests:
          storage: 1Gi # storage request
```
Create the StatefulSet using the kubectl create
command:
```shell
$ kubectl create -f stateful-set.yaml
statefulset.apps/my-stateful-set created
```
You can watch the Pods being created using the kubectl get pods -l app=service-label -w
command, where app=service-label matches the label used in the manifest file. A terminal multiplexer such as tmux lets you split your terminal in two, so you can run the commands in one pane and watch the process in the other. Run the watch command in the first pane before running the command to create the StatefulSet in the second.
If everything works well, the viewed terminal output should look like this:
```shell
NAME                READY   STATUS              RESTARTS   AGE
my-stateful-set-0   0/1     Pending             0          0s
my-stateful-set-0   0/1     Pending             0          6s
my-stateful-set-0   0/1     ContainerCreating   0          6s
my-stateful-set-0   0/1     ContainerCreating   0          24s
my-stateful-set-0   1/1     Running             0          36s
my-stateful-set-1   0/1     Pending             0          0s
my-stateful-set-1   0/1     Pending             0          6s
my-stateful-set-1   0/1     ContainerCreating   0          6s
my-stateful-set-1   0/1     ContainerCreating   0          12s
my-stateful-set-1   1/1     Running             0          15s
my-stateful-set-2   0/1     Pending             0          0s
my-stateful-set-2   0/1     Pending             0          6s
my-stateful-set-2   0/1     ContainerCreating   0          6s
my-stateful-set-2   0/1     ContainerCreating   0          23s
my-stateful-set-2   1/1     Running             0          26s
```
The above output shows the order in which the Pods are created. Use kubectl get
command to check the Pod status to see if they are ready.
Step 7: Check the details of the StatefulSet, Pods, PV and PVCs using kubectl get
command:
Check the details of the StatefulSet:
```shell
$ kubectl get statefulset my-stateful-set
NAME              READY   AGE
my-stateful-set   3/3     5m25s
```
The output shows that the 3 Pod replicas have been created and are ready.
Check the status of the Pods:
```shell
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          7m6s
my-stateful-set-1   1/1     Running   0          6m44s
my-stateful-set-2   1/1     Running   0          6m26s
```
Check the status of the PersistentVolume:
```shell
$ kubectl get pv
NAME           CAPACITY   ACCESS MODES   STATUS   CLAIM                                 STORAGECLASS   AGE
pvc-1dcfd012   1Gi        RWO            Bound    default/my-volume-my-stateful-set-1   my-stg-class   5m8s
pvc-2ceee9d3   1Gi        RWO            Bound    default/my-volume-my-stateful-set-2   my-stg-class   4m49s
pvc-8cfe94f5   1Gi        RWO            Bound    default/my-volume-my-stateful-set-0   my-stg-class   5m30s
```
Check the status of the PersistentVolumeClaim:
```shell
$ kubectl get pvc
NAME                          STATUS   VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-volume-my-stateful-set-0   Bound    pvc-8cfe94f5   1Gi        RWO            my-stg-class   34m
my-volume-my-stateful-set-1   Bound    pvc-1dcfd012   1Gi        RWO            my-stg-class   33m
my-volume-my-stateful-set-2   Bound    pvc-2ceee9d3   1Gi        RWO            my-stg-class   33m
```
In the above output, the Pod names have specific ordinals attached to them, from 0 to 2, unlike a Deployment, where random letters and numbers are attached to each Pod name. Moreover, the Pods are created sequentially: “my-stateful-set-0” first, followed by “my-stateful-set-1” and then “my-stateful-set-2”. The ordinal names themselves are fixed, but if you don’t need the Pods started and terminated in that sequential order, you can include a “podManagementPolicy” field in the StatefulSet spec with its value set to “Parallel”.
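As a minimal sketch, the field sits directly under the StatefulSet spec (note the value is the string Parallel; the default is OrderedReady):

```yaml
spec:
  podManagementPolicy: Parallel # launch and terminate Pods in parallel; default is OrderedReady
  serviceName: "my-service"
  replicas: 3
```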
StatefulSet Use Case
We will test the use case of StatefulSet using the following steps: getting into one of the Pods, creating a file and deleting the Pod. The Pod will be recreated using replicas. We will then check if it has the same name as the one that was deleted and if the data created earlier still exists in the Pod.
Step 1: Exec into one of the pods using the kubectl exec
command. Pod “my-stateful-set-1” will be used in this case.
```shell
$ kubectl exec -it my-stateful-set-1 -- bin/bash
root@my-stateful-set-1:/#
```
Step 2: Change to the directory where the volume is mounted, create and save a file into the directory:
```shell
root@my-stateful-set-1:/# cd data/path
root@my-stateful-set-1:/data/path# echo This is a StatefulSet Message > stset.txt
root@my-stateful-set-1:/data/path# cat stset.txt
This is a StatefulSet Message
```
Step 3: Exit from the Pod, delete the Pod and let it recreate:
```shell
$ kubectl delete pod my-stateful-set-1
pod "my-stateful-set-1" deleted
```
Check the Pod status to see if they are up again:
```shell
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          33m
my-stateful-set-1   1/1     Running   0          6s
my-stateful-set-2   1/1     Running   0          33m
```
The output shows that the Pod is up again with the same name; the difference in the “AGE” column confirms that my-stateful-set-1 was just recreated.
Step 4: Exec into the Pod once again and check the data created earlier before the Pod was deleted:
```shell
$ kubectl exec -it my-stateful-set-1 -- bin/bash
root@my-stateful-set-1:/# cd data/path
root@my-stateful-set-1:/data/path# ls
stset.txt
root@my-stateful-set-1:/data/path# cat stset.txt
This is a StatefulSet Message
root@my-stateful-set-1:/data/path# exit
```
Scaling a StatefulSet Up or Down
You can scale a StatefulSet up or down by running the below commands on the terminal. The value depends on how many Pod replicas you need.
To scale down:
We will scale the previous StatefulSet down from 3 replicas to 1 using the kubectl scale
command, watching the process to see the scaling order: the Pods are removed in reverse ordinal order, my-stateful-set-2 first and then my-stateful-set-1.
```shell
$ kubectl scale statefulset my-stateful-set --replicas=1
```
Check the StatefulSet and Pod status with kubectl get
command:
```shell
$ kubectl get statefulset
NAME              READY   AGE
my-stateful-set   1/1     6m

$ kubectl get pod
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          15m
```
To scale up:
The StatefulSet will be scaled up from 1 to 6 replicas.
```shell
$ kubectl scale statefulset my-stateful-set --replicas=6
```
Check the StatefulSet status:
```shell
$ kubectl get statefulset
NAME              READY   AGE
my-stateful-set   6/6     102m
```
Use kubectl get
command to check the status of the Pods:
```shell
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          59m
my-stateful-set-1   1/1     Running   0          59m
my-stateful-set-2   1/1     Running   0          59m
my-stateful-set-3   1/1     Running   0          2m47s
my-stateful-set-4   1/1     Running   0          2m32s
my-stateful-set-5   1/1     Running   0          2m5s
```
The above output shows that the new Pods are created in order, from my-stateful-set-3 up to my-stateful-set-5, continuing from the existing ordinals, as the AGE column confirms.
Clean up:
Delete the StatefulSet, Service, StorageClass and PersistentVolumeClaims using kubectl delete
commands in that order. You can view the process while deleting the StatefulSet to see the order in which the Pods terminate.
```shell
$ kubectl delete statefulset my-stateful-set
```

```shell
NAME                READY   STATUS        RESTARTS   AGE
my-stateful-set-0   1/1     Running       0          4h59m
my-stateful-set-1   1/1     Running       0          23m
my-stateful-set-2   1/1     Running       0          23m
my-stateful-set-3   1/1     Running       0          23m
my-stateful-set-4   1/1     Running       0          15m
my-stateful-set-5   1/1     Running       0          15m
my-stateful-set-5   1/1     Terminating   0          16m
my-stateful-set-2   1/1     Terminating   0          25m
my-stateful-set-0   1/1     Terminating   0          5h1m
my-stateful-set-3   1/1     Terminating   0          24m
my-stateful-set-1   1/1     Terminating   0          25m
my-stateful-set-4   1/1     Terminating   0          17m
my-stateful-set-2   0/1     Terminating   0          25m
my-stateful-set-1   0/1     Terminating   0          25m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-3   0/1     Terminating   0          25m
my-stateful-set-0   0/1     Terminating   0          5h1m
my-stateful-set-4   0/1     Terminating   0          17m
my-stateful-set-4   0/1     Terminating   0          17m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-3   0/1     Terminating   0          25m
my-stateful-set-3   0/1     Terminating   0          25m
```
The output shows that the Pods are terminated concurrently without waiting for any others to complete the process.
Check the Pods status to see if they have been deleted:
```shell
$ kubectl get pods
No resources found in default namespace.
```
Summary
StatefulSets have some limitations compared to Deployments. Deleting a StatefulSet does not guarantee orderly Pod termination, and Pods may be left running, so it is best to scale the StatefulSet down to 0 replicas first and then delete it. Also, the StatefulSet’s volumes remain intact to preserve data until its PersistentVolumeClaims are deleted; the PersistentVolumeClaims, like the other components, must be deleted manually.
Learn More
- Read more on StatefulSets
- Learn more about how to deploy a StatefulSet