April 13, 2021

Using ReadWriteMany Volumes on TKG Clusters

A quick blog to show how to use an open source project to mount NFS sub-volumes as ReadWriteMany PersistentVolumes in Kubernetes.

Introduction

I've had a number of customers ask me how they can use ReadWriteMany (RWM) volumes on vSphere with Tanzu. As of writing, there is no RWM capability exposed out of the box in vSphere with Tanzu - however, that is not to say you can't get RWM volumes working on it!

ReadWriteMany volumes, for the uninitiated, are volumes that can be mounted in a Read/Write fashion simultaneously into a number of pods. This is particularly useful for web and app servers that serve the same files - but also for CI systems like Jenkins which can use a shared volume for artifact storage rather than unnecessarily duplicating data and impacting CI performance.
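To make that concrete, the snippet below is a minimal sketch of a PersistentVolumeClaim requesting the ReadWriteMany access mode - the claim and StorageClass names are just placeholders; a real claim against the StorageClass we create later in this post is shown in the Verification section.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Placeholder name for illustration only
  name: shared-data
spec:
  # The key part: ReadWriteMany lets many pods mount the volume read/write at once
  accessModes:
    - ReadWriteMany
  # Placeholder class - it must be backed by storage that supports RWM (e.g. NFS)
  storageClassName: some-rwm-class
  resources:
    requests:
      storage: 5Gi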

There are a number of ways you could achieve this today, and we're going to have a look at a community project affectionately named nfs-subdir-external-provisioner (https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner). This project is an interesting little dynamic provisioner you can install into your K8s cluster to provision volumes as subfolders of an existing NFS share - for every new PersistentVolume (PV) created, it creates a new folder on the share and mounts that subdirectory into the pod(s). Pretty simple.

Also, at this point I should mention that it is perfectly fine to have multiple storage provisioners (CSI or otherwise) installed in a Kubernetes cluster - each can expose its own StorageClasses (SCs) and allow apps to request storage from different backends.
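For example, once the provisioner from this post is installed alongside the vSphere CSI driver, listing the StorageClasses on the cluster shows both side by side, and apps can pick either by name. The output below is illustrative and trimmed - the vSphere class name depends on the storage policy assigned to your namespace, and the NFS provisioner name shown is the chart's default:

$ kubectl get storageclass
NAME                          PROVISIONER                                     RECLAIMPOLICY   AGE
nfs-external                  cluster.local/nfs-subdir-external-provisioner   Delete          5m
vsan-default-storage-policy   csi.vsphere.vmware.com                          Delete          30d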

As with anything open source that you install on your cluster, you are responsible for supporting that component - VMware supports the built-in vSphere CSI driver, but any workloads you deploy on top of K8s are supported by you, the customer, directly.

Setup

The setup of the external provisioner is two steps. First, you need an NFS share - for this we are going to use vSAN File Services, simply because I already have vSAN in my lab and it's a bit of a no-brainer to use what is already present in the environment.

vSAN File Share Creation

So in our vSAN cluster (after you've enabled vSAN FS of course), navigate to Cluster -> Configure -> vSAN -> File Shares -> Add and fill in the details as below - simply give the file share a name and choose whether you want to add a quota or not (I've chosen not to for this example).

Add vSAN File Service Share

On the next screen I've chosen to allow access from all IP addresses - of course, you will probably want to restrict this to just the subnet of the TKG nodes that you set up as part of vSphere with Tanzu, but for demonstration purposes this works just fine.

vSAN File Service ACL

Click Finish and vSAN FS will create your NFS share.

Note: K8s of all flavours, upstream or otherwise, has no support for Kerberos authentication or over-the-wire encryption for mounted NFS shares - this is inherent in Kubernetes today, so the only way to restrict access is via IP ACLs.

Now we can retrieve the NFS endpoint for the share from the UI; we will use it in the next step to set up the external provisioner. To grab the NFS share path, open the details box on the file share we just created and copy the NFS 4.1 export path value, as below.

vSAN File Service NFS path
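Before moving on, it can be worth sanity-checking that the export is reachable and writable from the network your TKG nodes sit on. The below is a quick sketch using the server and path from my environment (vsan-fs01.shed.net:/vsanfs/K8s) - substitute your own export path and run it from a Linux box on an allowed subnet with the NFS client tools installed:

# Create a temporary mount point and mount the vSAN FS export over NFS 4.1
sudo mkdir -p /mnt/vsanfs-test
sudo mount -t nfs -o vers=4.1 vsan-fs01.shed.net:/vsanfs/K8s /mnt/vsanfs-test

# Check that we can write to the share, then clean up
sudo touch /mnt/vsanfs-test/hello && ls /mnt/vsanfs-test
sudo rm /mnt/vsanfs-test/hello
sudo umount /mnt/vsanfs-test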

Setting up the nfs-subdir-external-provisioner

There are a few ways to set up the nfs-subdir-external-provisioner - the easiest is probably to use Helm, and that is the path I chose, mainly because I use Helm throughout my ArgoCD-based CD pipeline for my Tanzu cluster (https://argocd.tanzu.blah.cloud/applications/nfs-subdir-external-provisioner).

I have created a very simple Helm values.yaml file which contains all the configuration needed to spin up the provisioner app - you can find the full set of valid values for the values.yaml file on GitHub here.

The configuration I'm using on my vSphere with Tanzu cluster can be seen below - we'll walk through each field. Additionally, you can find my full ArgoCD setup including this example on GitHub here.

nfs:
  # The NFS server endpoint that we got from vSAN FS
  server: vsan-fs01.shed.net
  # The NFS share mount path from vSAN FS
  path: /vsanfs/K8s
storageClass:
  # The name of the StorageClass to be created on the K8s cluster to allow provisioning of RWM volumes from the share
  name: nfs-external
  # The accessMode for volumes provisioned from this StorageClass (ReadWriteMany allows multiple pods to mount a volume at once)
  accessModes: ReadWriteMany
podSecurityPolicy:
  # Enabling the pod security policy allows it to run on TKGS clusters out of the box
  enabled: true

So I'll assume you've added the above to a file called values.yaml - once we've logged into our TKG cluster, we can then deploy the provisioner with a few simple Helm CLI commands:

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
helm repo update
kubectl create namespace infra
helm install nfs-subdir-external-provisioner --namespace infra nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -f values.yaml

The above commands add the Helm chart repo to your local machine's list of repositories, then refresh the index of charts available from it (like doing an apt update on Ubuntu). We then create a Namespace in the TKG cluster called infra and, finally, install the Helm chart with the values.yaml file we created above for our environment.

At this point Helm will deploy the application, and you can monitor its progress with a combination of helm list -n infra and kubectl get all -n infra - these commands will show you the current status of the provisioner deployment. Once everything is in a Running state, we're good to deploy some RWM volumes!
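For reference, once the chart has finished deploying, kubectl get all -n infra should return something roughly like the below - the pod suffix hashes and ages will of course differ in your environment:

$ kubectl get all -n infra
NAME                                                   READY   STATUS    RESTARTS   AGE
pod/nfs-subdir-external-provisioner-7fc866d687-xsggg   1/1     Running   0          2m6s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nfs-subdir-external-provisioner   1/1     1            1           2m6s

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/nfs-subdir-external-provisioner-7fc866d687   1         1         1       2m6s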

Verification

I've created some example YAML manifests to spin up a single RWM volume from the newly created nfs-external StorageClass, and then mount that volume into two different pods at the same time - you can inspect them here and here.
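For reference, the pods in those manifests are simple nginx pods that mount the shared claim at nginx's web root - a rough sketch of what one of them looks like is below (an approximation of the linked manifest, not a verbatim copy):

apiVersion: v1
kind: Pod
metadata:
  name: external-nfs-pod
  namespace: infra
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        # Mount the shared RWM volume at nginx's web root
        - name: nfs-vol
          mountPath: /usr/share/nginx/html
  volumes:
    # Reference the RWM claim created from the nfs-external StorageClass
    - name: nfs-vol
      persistentVolumeClaim:
        claimName: external-nfs-pvc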

Let's create the PersistentVolumeClaim (PVC) on the cluster with the following:

kubectl apply -f https://raw.githubusercontent.com/mylesagray/tanzu-cluster-gitops/master/manifests/nfs-subdir-external-provisioner/templates/example-pvc.yaml

And check that it was created (the status should show Bound):

$ kubectl get pv,pvc -n infra
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                                                     STORAGECLASS                  REASON   AGE
persistentvolume/pvc-ef412a4f-5750-456c-8333-bd095980e9a2   10Gi       RWX            Delete           Bound    infra/external-nfs-pvc                                                                                    nfs-external                           7m37s

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/external-nfs-pvc   Bound    pvc-ef412a4f-5750-456c-8333-bd095980e9a2   10Gi       RWX            nfs-external   7m37s

Note: this volume will not show up in the Cloud Native Storage (CNS) UI, as it was provisioned by the newly installed NFS provisioner - not the vSphere CSI driver, which has the CNS integration.
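If you ever want to double-check which provisioner created a given PV, the pv.kubernetes.io/provisioned-by annotation on the PersistentVolume records it. A quick sketch below - substitute your own PV name, and note the provisioner name shown is the chart's default, which will differ if you've overridden it in values.yaml:

$ kubectl get pv pvc-ef412a4f-5750-456c-8333-bd095980e9a2 -o yaml | grep provisioned-by
    pv.kubernetes.io/provisioned-by: cluster.local/nfs-subdir-external-provisioner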

Now let's create the pods that will simultaneously mount the volume:

kubectl apply -f https://raw.githubusercontent.com/mylesagray/tanzu-cluster-gitops/master/manifests/nfs-subdir-external-provisioner/templates/example-pod.yaml

Again, we can verify they are running and mounted successfully by checking both pods are in a state of Running with:

$ kubectl get po -n infra
NAME                                               READY   STATUS    RESTARTS   AGE
external-nfs-pod                                   1/1     Running   0          34s
external-nfs-pod-2                                 1/1     Running   0          36s
nfs-subdir-external-provisioner-7fc866d687-xsggg   1/1     Running   0          12m

To go a little further, we can actually get a terminal in those pods and add some content to the PVC and see that it is reflected in the other pod as they share the volume.

On the first pod:

$ kubectl exec -n infra -it external-nfs-pod -- /bin/bash
root@external-nfs-pod:/# cd /usr/share/nginx/html/
root@external-nfs-pod:/usr/share/nginx/html# touch index.html
root@external-nfs-pod:/usr/share/nginx/html# echo 'Hi' > index.html
root@external-nfs-pod:/usr/share/nginx/html# apt update && apt install curl
root@external-nfs-pod:/usr/share/nginx/html# curl localhost
Hi
root@external-nfs-pod:/usr/share/nginx/html# exit

On the second pod:

$ kubectl exec -n infra -it external-nfs-pod-2 -- /bin/bash
root@external-nfs-pod-2:/# apt update && apt install curl
root@external-nfs-pod-2:/# curl localhost
Hi
root@external-nfs-pod-2:/# echo 'Really!' > /usr/share/nginx/html/index.html
root@external-nfs-pod-2:/# curl localhost
Really!
root@external-nfs-pod-2:/# exit

And back on the first pod:

$ kubectl exec -n infra -it external-nfs-pod -- /bin/bash
root@external-nfs-pod:/# curl localhost
Really!
root@external-nfs-pod:/# exit

So we can see that files we create and update on one pod are available on the other, and changes we make in either pod are reflected in both - this makes sense, as they're sharing the same volume, and it proves that everything is working as planned!

It's also helpful to note that because all of the persistent volumes are created as sub-directories of that single vSAN FS share, this approach scales very well - instead of creating a share per volume, the provisioner simply creates a folder inside the share for each volume, which results in very fast volume creation and works great for deployments using many small PVs.
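As a hedged illustration of what this looks like on the share itself, mounting the export directly (as in the sanity check earlier) should show one folder per provisioned volume. By default the provisioner names each folder after the namespace, PVC and PV it was created for, though the exact pattern is configurable in the chart, so treat the listing below as an approximation:

$ ls /mnt/vsanfs-test
infra-external-nfs-pvc-pvc-ef412a4f-5750-456c-8333-bd095980e9a2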
