How to deploy Postgres on Kubernetes with Skaffold
Hello again! In this part of the series, we'll finally get our hands dirty using Skaffold to build, push and deploy applications on Kubernetes. We'll deploy a Postgres database on our local Minikube cluster. Along the way, we'll learn Kubernetes concepts such as ConfigMaps, Secrets, Persistent Volumes and Persistent Volume Claims, StatefulSets, and Services.
Update 2023: This post is old. I recommend using Tilt instead of Skaffold. Proceed with caution.
At this point in the series, it's worth recalling why we're using Skaffold in the first place. The main reason is that we want to get rid of Docker Compose, commonly used for spinning up local development environments. At the end of the day, we want to deploy our app to Kubernetes using Kubernetes manifests instead of a Docker Compose file, and we want to use the same Kubernetes manifests both for local development and for deploying to production. This is where Skaffold enters the game. It takes care of, for example, hot-reloading Docker images when developing locally: whenever your code changes, Skaffold rebuilds the images and redeploys them to your local Kubernetes cluster. This is how we can build truly Kubernetes-native applications. Of course, Skaffold can do much more than hot reloading for local development.
In Part 1, we installed all dependencies required for this tutorial. Most notably, we'll need a Skaffold installation and a Kubernetes cluster. I assume you're using Minikube, but you could also use other local clusters such as Docker Desktop or even a remote cluster.
You can find all code accompanying this series in this GitHub repository.
First, we'll need to configure Skaffold by creating a skaffold.yaml file in our repository. I suggest you take a quick glance at the skaffold.yaml documentation to get an overview of Skaffold's configuration.
Like any resource definition in Kubernetes, skaffold.yaml contains apiVersion, kind, and metadata fields. We therefore start skaffold.yaml with this header.
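A minimal sketch of what the header can look like; the exact apiVersion depends on the Skaffold version you installed (check the schema reference for your installation), and the metadata name here is my own assumption:

apiVersion: skaffold/v2beta12
kind: Config
metadata:
  name: postgres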
The meat of Skaffold is in the build and deploy fields. For now, we assume there are no artifacts to build and add the following below the definition above:
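A sketch of an empty build section, assuming there is nothing to build yet:

build:
  artifacts: []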
In deploy, we tell Skaffold where to find the Kubernetes manifests and how to process them. We'll tell Skaffold to deploy with kubectl, to look for manifests in a folder called k8s/, and to use the minikube context for deployment with the following configuration:
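This is roughly what the deploy section can look like; the field names follow the Skaffold schema, and the glob pattern is my assumption for picking up everything in k8s/:

deploy:
  kubeContext: minikube
  kubectl:
    manifests:
      - k8s/*.yaml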
Both manifests and kubeContext are set to the above values by default, but I think it's always better to be explicit with such things. Instead of deploying bare Kubernetes manifests with kubectl, you could also tell Skaffold to process your manifests with kustomize, or use Helm charts.
Instead of writing skaffold.yaml by hand, you can also use the skaffold init command to auto-generate a working configuration.
We'll put our Kubernetes manifests for the Postgres deployment in k8s/postgres.yaml. We'll first include non-confidential Postgres configuration data in a ConfigMap. By storing configuration in a ConfigMap, we can easily re-use the configuration in other services using our Postgres cluster.
For Postgres, we'll need to define the Postgres user name and the database to use. We add these as the configuration variables POSTGRES_USER and POSTGRES_DB.
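A sketch of the ConfigMap; the name postgres-configuration and the two keys are referenced later by the StatefulSet, while the values are placeholders you can pick yourself:

# k8s/postgres.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-configuration
  labels:
    app: postgres
data:
  POSTGRES_DB: database
  POSTGRES_USER: postgres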
We'll see later how to use this configuration.
Postgres also wants us to configure the password used to access the database. Such confidential data should be stored as Kubernetes Secrets. Kubernetes wants us to base64-encode our secrets, so we'll have to do that first. Assuming we choose "super-secret" as our password, here's how to base64-encode it:
For the purposes of this tutorial, I'll simply add the base64-encoded secret to k8s/postgres.yaml.
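A sketch of the Secret; the name postgres-credentials and the key password must match the secretKeyRef used in the StatefulSet below:

apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
  labels:
    app: postgres
type: Opaque
data:
  password: c3VwZXItc2VjcmV0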
In any real-world use, you probably wouldn't add secrets like this to version control. You would put the secrets in their own secrets.yaml file and keep that out of version control, or read secrets at deployment time from a service such as Vault.
At this point, we can check if our Skaffold deployment works by running skaffold dev. Skaffold should discover postgres.yaml and deploy our ConfigMap and Secret. If you then open the Minikube dashboard with minikube dashboard, you should find the ConfigMap and Secret among the created resources. If you modify your deployment manifests, Skaffold automatically takes care of updating the deployment, so from now on you can keep skaffold dev running in the background.
Persistent volumes and volume claims
Next, we'll set up volumes for our Postgres installation. This part is not required if you only use Postgres for local development with small data volumes. If you're deploying Postgres to production in Kubernetes, you need to think carefully about where and how to persist data.
The persistent volume (PV) documentation describes a PV as "a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes." Storage classes can be used to provision storage on-demand from, say, Amazon Elastic Block Store or Google Cloud Persistent Disk. In this tutorial, we'll skip storage classes and provision local storage upfront, using a hostPath persistent volume to serve storage from the host. As the docs explain, "In a production cluster, you would instead provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume."
Here's how to use hostPath to provision storage from the single-node cluster's host:
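A sketch of the PersistentVolume; the name, capacity, path, and access mode follow the description below, while the labels are my own addition:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
  labels:
    app: postgres
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt1/postgres-data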
We specify that a persistent volume called postgres-pv with 1 GB capacity should be provisioned at /mnt1/postgres-data on the cluster's node. The access mode is defined as ReadWriteOnce, meaning that the volume can be mounted by a single node for reading and writing.
Once we get Postgres running, you can ssh into the Minikube node with minikube ssh and execute ls /mnt1/postgres-data to browse the data stored by Postgres.
Now that we have provisioned storage, we need to create a persistent volume claim to request access to it. Volume claims are similar to how pods request resources from nodes: instead of requesting CPU or memory, volume claims request a specific storage size ("5 GB") and a specific access mode. Let's create a volume claim to request 500 MB from postgres-pv.
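A sketch of the claim; note that the claim doesn't name postgres-pv directly, it binds to any volume satisfying the requested capacity and access mode:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  labels:
    app: postgres
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi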
Now we can use this claim named postgres-pvc to mount the claimed storage to our Postgres pod.
We're ready to declare how to deploy Postgres itself. Just for fun, we'll deploy Postgres using a StatefulSet. A StatefulSet manages not only the deployment and scaling of pods but also "provides guarantees about the ordering and uniqueness of these pods". In particular, a StatefulSet provides unique network identifiers, persistent storage, graceful deployment and scaling, and automated rolling updates. For example, if you were deploying Apache Cassandra, an open-source distributed database, on Kubernetes, a StatefulSet would guarantee that your pods have stable identifiers such as cassandra-0, cassandra-1, cassandra-2, etc. that you could use to communicate between pods even after rescheduling. See here for an example of how to deploy Cassandra on Kubernetes.
In our case of Postgres, we are dealing with a centralized, non-distributed database, so we'll only have one pod in our StatefulSet. However, StatefulSet is still a very useful and powerful concept for any stateful application.
We'll declare the StatefulSet as follows.
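A sketch of the top of the StatefulSet, matching the fields discussed below; the metadata name is my assumption:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres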
As expected, we use kind: StatefulSet. In spec, we declare serviceName to be postgres. You can read here how serviceName controls the DNS for pods in the StatefulSet. Do not confuse this with how you would use a Service to expose Postgres to other applications in the cluster or to the outside world; in the next section, we'll see how to use a Service to expose the StatefulSet as a network service.
After serviceName, we set replicas equal to one, as we're dealing with a non-replicated, centralized database. The .spec.selector field defines which pods belong to the StatefulSet.
The spec for pods in the StatefulSet is defined in .spec.template as follows:
# k8s/postgres.yaml
...
spec:
  ...
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:12
          envFrom:
            - configMapRef:
                name: postgres-configuration
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
          ports:
            - containerPort: 5432
              name: postgresdb
          volumeMounts:
            - name: postgres-volume-mount
              mountPath: /var/lib/postgresql/data
          readinessProbe:
            exec:
              command:
                - bash
                - "-c"
                - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
            initialDelaySeconds: 15
            timeoutSeconds: 2
          livenessProbe:
            exec:
              command:
                - bash
                - "-c"
                - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
            initialDelaySeconds: 15
            timeoutSeconds: 2
      volumes:
        - name: postgres-volume-mount
          persistentVolumeClaim:
            claimName: postgres-pvc
We first set the pod label app: postgres to match the selector in the StatefulSet. In .spec.template.spec, we define our pods to have a single container running the postgres:12 image from Docker Hub. The pod uses environment variables from the postgres-configuration ConfigMap and the postgres-credentials Secret, both of which we defined above. We then set the pod to expose port 5432. In volumeMounts, we mount a volume named postgres-volume-mount at /var/lib/postgresql/data, the folder used by Postgres for storing data. This volume is defined in .spec.template.spec.volumes, where we use the postgres-pvc volume claim defined above.
Finally, we also set readinessProbe as well as livenessProbe to let Kubernetes monitor the pod's health. You can read more about probes here.
Having declared our StatefulSet, we finally need to define a Service to expose the StatefulSet as a network service to other applications. Compared to the above, this is simple enough.
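A sketch of the Service; I assume its name matches the serviceName declared in the StatefulSet, and the type and port follow the discussion below:

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
  selector:
    app: postgres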
In our Service, we expose port 5432 from the pod as a NodePort. You can read more about the different Service types here. Choosing NodePort lets us contact the service from outside the cluster. In production, you of course want to consider very carefully whether you want your database to be exposed outside the cluster.
Did it work?
If you had skaffold dev --port-forward running throughout this tutorial, you should now have Postgres running in your Minikube cluster. You can verify this by, for example, running minikube dashboard to browse the resources in Minikube and hopefully see everything glowing green. If anything went wrong, please let me know in the comments!
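To go one step further, you can try connecting to the database from your host. A sketch, assuming you have psql installed locally and that you used the Service name, user, database, and password chosen above:

kubectl port-forward service/postgres 5432:5432
# in another terminal:
PGPASSWORD=super-secret psql -h localhost -p 5432 -U postgres -d database -c 'SELECT 1'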
That concludes the tutorial on how to get Postgres running on Kubernetes. In the next and final part, we'll finally deploy the Django application we built in Part 2, backed by Postgres. See you then!