By Michael West, Technical Product Manager, VMware
DevOps automation is critical to the implementation of modern applications, and the Kubernetes ecosystem is rapidly becoming the baseline infrastructure supporting this automation. vSphere 8 exposes a Kubernetes API that can be consumed directly and that also enables TKG and the VMService. In Part 1 of this blog series I demonstrated how to automate the lifecycle of VM instances using VMware-curated base images, with cloud-init for instance configuration. Developers require more choice, so in vSphere 8 VMware has extended the VMService to enable the consumption of custom images in your VM deployments. This blog will show you how to Bring Your Own Image (BYOI) and enable instance configuration through a native cloud-init datasource. The VMService is about enabling Developer automation, so we will deploy a VM with a TensorFlow Jupyter Notebook enabled. For those of you who wish to jump straight to a video demonstration, here you go.
As seen in Part 1 of this blog series, deployment of VMs was done using a specific image curated by VMware and available in the Marketplace. Instance configuration at initial boot was done through Cloud-Init and used OvfEnv as the transport protocol, while Guest OS Customization (GOSC) ran to handle the VM network config. As you saw, OvfEnv was designed primarily to support a more static methodology: deploying OVA images configured with bootstrap metadata stored as properties in the VM's vApp Options. We were able to piggyback cloud-init onto this by storing that metadata as User-Data in those properties. To more fully support the automation required by modern Developers, we have implemented a native approach to instance bootstrap and configuration through the CloudInit Transport.
Let's see what this looks like by starting with the Virtual Machine YAML specification used by the VMService to deploy the VM. The only real changes from the OvfEnv configuration are that we specify CloudInit as the Transport in the vmMetadata section, and that the image is one built from a download of Ubuntu 22.04 (Jammy Jellyfish) instead of the curated VMware-built image. Note: For this demo I manually installed Jammy into a VM, exported it to OVF, and then added it to a Content Library. In Part 3 of this blog series I will go through an automated build process with Packer/Ansible.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachine
metadata:
  labels:
    vm-selector: vmware-tanzu-jumpbox3
  name: vmware-tanzu-tensor
  namespace: tkg
spec:
  imageName: ubuntu-jammy-vm2
  className: best-effort-xsmall
  powerState: poweredOn
  storageClass: k8s-shared
  advancedOptions:
    defaultVolumeProvisioningOptions:
      thinProvisioned: true
  volumes:
  - name: tensor
    persistentVolumeClaim:
      claimName: tensor-pvc3
  networkInterfaces:
  - networkType: vsphere-distributed
    networkName: k8s-workload
  readinessProbe:
    tcpSocket:
      port: 22
  vmMetadata:
    configMapName: tensor-configmap
    transport: CloudInit
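The readinessProbe in the spec above uses a tcpSocket check on port 22, meaning the VM is only considered ready once SSH accepts TCP connections. The following is a minimal Python sketch of the same idea for checking a VM from your own workstation. It is not how the VMService implements the probe; the function name `tcp_ready` and its timeout default are assumptions made for illustration.

```python
import socket


def tcp_ready(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: mirrors the idea of a tcpSocket readiness
    probe, but is not the VMService implementation.
    """
    try:
        # create_connection completes the TCP handshake; the context
        # manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `tcp_ready("VMIP")` with the address of the deployed VM should return True once sshd is listening, which is roughly the point at which the probe in the spec would report the VM as ready.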
What are the requirements for an image to be supported by the VMService? First of all, when CloudInit is the transport, OVF Properties are disabled on the VM to remove any possibility of the race condition described in Part 1 of this blog. The cloud-init data is stored in the GuestInfo properties of the VM's extraConfig. If interested, you can see this by logging into the Managed Object Browser (MOB) at https://yourVC/mob, finding your VM and then clicking on config. You will see the extraConfig name and a list of values. The cloud-init data is stored in extraConfig["guestinfo.metadata"] and extraConfig["guestinfo.userdata"]. Metadata contains the networking and DNS info populated by the VMService, while userdata contains the cloud-config defined in the Developer's VM specification. You can also see this data in gzip+base64 encoding in the .vmx file of the VM.
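If you copy one of those gzip+base64 values out of the MOB or the .vmx file, it can be turned back into readable text with a few lines of Python. This is a minimal sketch that assumes the value is the base64 encoding of gzip-compressed UTF-8 text; the helper names are hypothetical, and the sample value is generated locally rather than taken from a real VM.

```python
import base64
import gzip


def decode_guestinfo(value: str) -> str:
    """Decode a gzip+base64 string: base64-decode first, then gunzip."""
    compressed = base64.b64decode(value)
    return gzip.decompress(compressed).decode("utf-8")


def encode_guestinfo(text: str) -> str:
    """Inverse helper, used here only to build a sample value."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


# Illustrative sample, not data read from a real VM.
sample = encode_guestinfo("#cloud-config\nssh_pwauth: true\n")
print(decode_guestinfo(sample))
```

Decoding guestinfo.userdata this way is a quick check that the cloud-config reaching the guest matches what you put in the VM specification.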
VMware Tools is required in the Guest OS to read the cloud-init data from the .vmx file. The image can contain either VMware Tools or Open VM Tools to accomplish this task. Cloud-init also requires a datasource to understand where this data resides and how to apply it. Cloud-init datasources are platform-specific and return a JSON spec by querying and parsing the data from the extraConfig fields. Prior to cloud-init 21.3, the VMware datasource had not been merged into core cloud-init, so the Cloud-Init DataSource for VMware GuestInfo needed to be added to your image. The DataSource for VMware is now standard within cloud-init. So, for an image to be deployable by the VMService, it needs to contain the following:
1) VMware Tools or Open VM Tools
2) One of the following:
- Cloud-init 21.3 or later
- Cloud-init 17.9 or later with Cloud-Init DataSource for VMware GuestInfo
Note that DataSource for VMware GuestInfo is deprecated, so upgrading Cloud-init is the preferred option.
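One way to check an image against the cloud-init requirement is to compare the version string reported inside the guest with the 21.3 threshold. The sketch below assumes you have captured the output of `cloud-init --version` as a string; the function names and the sample strings in the usage note are illustrative assumptions, not output from the demo VM.

```python
import re

# First release with the VMware datasource in core cloud-init,
# per the requirement list above.
MIN_NATIVE_DATASOURCE = (21, 3)


def parse_cloud_init_version(output: str) -> tuple[int, int]:
    """Extract (major, minor) from `cloud-init --version` style output."""
    match = re.search(r"(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return int(match.group(1)), int(match.group(2))


def has_native_vmware_datasource(output: str) -> bool:
    """True if the reported version is 21.3 or later."""
    return parse_cloud_init_version(output) >= MIN_NATIVE_DATASOURCE
```

For example, a string such as "/usr/bin/cloud-init 22.4.2" passes the check, while "cloud-init 20.4" does not and would need the separate GuestInfo datasource.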
If using the manual installation method for creating the image, run the following commands prior to shutting down the VM for export in order to make sure that you get a clean cloud-init run when the image is deployed.
- rm -f /etc/cloud/cloud.cfg.d/99-installer.cfg
- rm -f /etc/cloud/cloud.cfg.d/subiquity-disable-cloudinit-networking.cfg
- /usr/bin/cloud-init clean --logs
See this blog if you would like more information on manually creating the VM and uploading the image.
Enabling TensorFlow Jupyter Notebook through Cloud-Init
Jupyter Notebook is a user-friendly way to run TensorFlow for Machine Learning on a dataset and graphically see the results. The setup through cloud-config is pretty straightforward. I create a vmware user in the VM and add it to the docker group. The packages module lists the packages to be installed. As in the previous blog demo, I create a Persistent Volume and do the disk and filesystem setup. That disk is mounted on /data, and in the runcmd module I bind-mount /data/docker onto /var/lib/docker so that Docker writes its images to this larger disk. After the Docker install, we run the TensorFlow app in a Docker container and publish container port 8888 on host port 8888. Very clean setup.
ssh_pwauth: true
users:
  - name: vmware
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: docker
    lock_passwd: false
    # Password set to Admin!23
    passwd: '$1$salt$SOC33fVbA/ZxeIwD5yw1u1'
    shell: /bin/bash
apt:
  sources:
    docker.list:
      source: deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable
      keyid: 9DC858229FC7DD38854AE2D88D81803C0EBFCD88
packages:
  - docker-ce
  - docker-ce-cli
  - containerd.io
disk_setup:
  /dev/sdb:
    table_type: gpt
    layout: True
    overwrite: True
fs_setup:
  - device: /dev/sdb
    filesystem: ext4
    partition: 1
mounts:
  - [ /dev/sdb1, /data, "auto", "defaults,noexec,nofail" ]
runcmd:
  - mkdir /data/docker
  - mount --bind /data/docker /var/lib/docker
  - apt install docker.io
  - systemctl restart docker
  - docker run --publish 8888:8888 harbor-repo.vmware.com/dockerhub-proxy-cache/tensorflow/tensorflow:latest-gpu-jupyter &
New Developer Console Access
New to the VMService is the capability to access the console of any VM without needing to have vCenter access. This enables developers to troubleshoot VMs that may not have booted successfully because of issues with their cloud-config schema. The process is enabled through the kubectl vsphere plugin. Simply execute kubectl vsphere vm web-console "vm name" and receive a link to the web-console.
tensor-vm: kubectl vsphere vm web-console vmware-tanzu-tensor
Successfully created a new WebConsoleRequest 'vmware-tanzu-tensor-5k5lk' in namespace 'tkg'
Waiting for the above WebConsoleRequest to be processed...
Web-Console URL: https://192.168.220.130/vm/web-console?host=192.168.110.51&namespace=tkg&port=443&ticket=a642b86b556ebf17&uuid=822b0aae-c106-45bd-827d-14d79828bf59
This URL is for one-time use and will expire at 2023-03-09T05:56:24-07:00 (in about 2m0s)
tensor-vm:
Depending on your setup, clicking on the link or pasting it into a browser will open the web console to the VM. Two things to note in the console: you can see the layers of the TensorFlow Docker image downloading, and you can see links with an access token that allows ingress to the TensorFlow Notebook.
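The Web-Console URL printed by the plugin carries several query parameters, which can be handy to inspect when a link fails or has expired. The sketch below splits the sample URL from the output above using Python's standard library. The parameter names come straight from that sample; the helper name `web_console_params` is hypothetical, and what each parameter means is not covered in this post.

```python
from urllib.parse import parse_qs, urlsplit

# Sample URL copied from the plugin output shown earlier.
SAMPLE_URL = (
    "https://192.168.220.130/vm/web-console?host=192.168.110.51"
    "&namespace=tkg&port=443&ticket=a642b86b556ebf17"
    "&uuid=822b0aae-c106-45bd-827d-14d79828bf59"
)


def web_console_params(url: str) -> dict[str, str]:
    """Return the query parameters of a web-console URL as a flat dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = web_console_params(SAMPLE_URL)
print(params["namespace"], params["ticket"])
```

Because the ticket is for one-time use and short-lived, a fresh `kubectl vsphere vm web-console` request is needed each time rather than reusing a saved link.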
Verify TensorFlow Notebook has been successfully enabled
We ssh into the VM using the credentials created through the cloud-config. Running docker ps shows that the TensorFlow Notebook container is running, but just to prove it, we docker exec into the container and see the TensorFlow banner.
vmware@vmware-tanzu-tensor:~$ docker ps
CONTAINER ID   IMAGE                                                                                   COMMAND                  CREATED        STATUS        PORTS                                       NAMES
e5df4f9618cc   harbor-repo.vmware.com/dockerhub-proxy-cache/tensorflow/tensorflow:latest-gpu-jupyter   "bash -c 'source /et…"   22 hours ago   Up 22 hours   0.0.0.0:8888->8888/tcp, :::8888->8888/tcp   modest_bose
vmware@vmware-tanzu-tensor:~$ docker exec -it e5df4f9618cc bash

________                               _______________
___  __/__________________________________  ____/__  /________      __
__  /  _  _ \_  __ \_  ___/  __ \_  ___/_  /_   __  /_  __ \_ | /| / /
_  /   /  __/  / / /(__  )/ /_/ /  /   _  __/   _  / / /_/ /_ |/ |/ /
/_/    \___//_/ /_//____/ \____//_/    /_/      /_/  \____/____/|__/

root@e5df4f9618cc:/tf#
Finally, we use VMIP:8888 with the token shown in the web-console above to access the app from our browser: https://VMIP:8888/?token="Token"
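If you script this last step, the notebook address can be assembled from the VM's IP and the token rather than edited by hand. This is a small sketch under the assumption that the URL follows the pattern shown above; the function name, the placeholder address, and the placeholder token are all illustrative.

```python
from urllib.parse import urlencode


def notebook_url(vm_ip: str, token: str, port: int = 8888,
                 scheme: str = "https") -> str:
    """Build the notebook URL in the form scheme://VMIP:port/?token=..."""
    # urlencode escapes any characters in the token that are not URL-safe.
    query = urlencode({"token": token})
    return f"{scheme}://{vm_ip}:{port}/?{query}"


# 192.0.2.10 and "abc123" are placeholders, not values from the demo.
print(notebook_url("192.0.2.10", "abc123"))
```

The scheme is left as a parameter so the same helper can be pointed at whichever protocol your notebook container actually serves.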
For more detail on configuring your VM through the process above, check out this video:
Part 1 of this Blog Series is available here