VMware Data Services Manager - Proof of Concept (PoC) Guide

1.4

Purpose

The purpose of this guide is to capture the considerations and requirements for a successful proof of concept of VMware Data Services Manager (DSM) at a customer site, and to make it easy for non-DSM experts to deploy and demonstrate DSM functionality to customers.

Objectives

  1. Successfully Deploy and Configure a Provider Appliance

  2. Successfully Deploy and Configure an Agent Appliance, and complete DSM configurations

  3. Successfully Create a Standalone PostgreSQL database

  4. Successfully Create a Standalone MySQL database

  5. Successfully Cluster a PostgreSQL database

  6. Successfully Cluster a MySQL database

Requirements

A list of requirements that need to be in-place before commencing a proof of concept.

vSphere

  • A vSphere 6.7 or vSphere 7.x environment is required; the VMware DSM v1.4 Release Notes list the currently supported vSphere platforms.
  • There should be at least one vSphere Cluster configured and managed by vCenter. With DSM v1.4, you can optionally build a vSphere Resource Pool and use those resources for database deployments.

DSM will need 2 vCenter SSO (single sign-on) users configured, each with different roles & privileges. One user is used for management; the other is used for monitoring, and is therefore configured as a read-only user. For details on the privileges required for each role, see the official documentation. These users must be created in vCenter Server before commencing the DSM agent on-boarding process, since it is during on-boarding that they are required. On-boarding is described in detail later in this PoC guide.

Networking Requirements

A total of 3 virtual networks are required to successfully deploy DSM.

Management network - This is where the vCenter Server and ESXi hosts reside. The DSM Provider will also be connected to this network. The management network is expected to facilitate the establishment of a secure connection to the Tanzu Network portal.

Control Plane / RabbitMQ network - This network is where the DSM Provider, DSM Agents and database VMs communicate. The DSM Provider is multi-homed, with a connection to this network as well as to the management network. The Agents have a single network connection, to this network only. The database VMs are also multi-homed, with connections to the control plane network and the application network.

Application network - Database VMs have a connection to this network as well as to the control plane network. This is the network on which clients/end-users connect to the database.

Network Routing - Each of the networks must be routed. The control plane network needs a route to the management network: the Agent must be able to connect to vCenter to discover environments, and since the Agent VM is not on the management network, it needs to reach vCenter via the control plane network.

DNS Requirements for PostgreSQL clustering

If setting up HA/clustering of databases, the Provider/Agents need to resolve the FQDNs of the database primary and replicas (and the PG Monitor in the case of PostgreSQL) via the application network. For DSM, the current recommendation is to add a conditional forwarder to your central DNS for the provider/org domains. This forwards DNS queries back to the DNS server on the Provider, which resolves the names of any databases deployed in DSM. Therefore, if your central DNS is on the management network, a route is needed between the application network and the management network. If only deploying standalone databases, DNS is not needed on the application network.

DNS

DNS is only required when you wish to make highly available / clustered databases. If you only wish to deploy standalone databases, DNS is not a requirement.

DNS must run on the application network. As highlighted in the previous networking section, the recommendation is to add a conditional forwarder for the provider and org domains to your central DNS, which will forward queries for database names back to the Provider. If the central DNS is not on the application network, then the application network needs a route to the network where the central DNS resides, e.g., the management network.

Do not use .local as part of the fully qualified domain name. Photon uses systemd-resolved as its DNS resolver, which does not send a DNS query to the DNS server for any domain name ending in .local. This is the default behaviour, so if setting up clustered databases, DNS resolution will not work with .local.
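A proposed DB FQDN suffix can be sanity-checked with a small script before on-boarding. This is an illustrative sketch only; the function name and example domains are ours, not part of DSM:

```shell
# Illustrative check: reject DSM DB FQDN suffixes ending in .local,
# since systemd-resolved will not forward such queries to a DNS server.
check_dsm_suffix() {
  case "$1" in
    *.local) echo "invalid" ;;
    *)       echo "ok" ;;
  esac
}

check_dsm_suffix newdevs.internal   # prints: ok
check_dsm_suffix newdevs.local      # prints: invalid
```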

If customers are not willing to add a conditional forwarder for the DSM FQDN suffixes to their central DNS for the PoC, consider configuring a Linux VM with dnsmasq installed to provide DNS functionality. dnsmasq is a lightweight, easy-to-configure DNS forwarder designed to provide DNS (and optionally DHCP and TFTP) services to a small-scale network. An online tutorial on how to set up DNS via dnsmasq is available here.
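As a sketch of the conditional-forwarding approach with dnsmasq, the central DNS (or a dnsmasq VM on the application network) could carry an entry along these lines; the domain and the XX.XX.XX.XX address are placeholders for your environment:

```
# /etc/dnsmasq.conf (sketch)
# Forward lookups for the org DB FQDN suffix to the DNS server on the Provider
server=/newdevs.internal/XX.XX.XX.XX
```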

DHCP

DHCP is required on the control plane network and on the application network. Whilst the Provider and Agents are assigned static IP addresses, the database VMs that are deployed on these networks require DHCP. The database VMs pick up their IP addresses on the control plane and application networks via DHCP.

Since clustered DB VMs need to perform tasks such as resolving FQDNs, it may be necessary to add default gateway, NTP and DNS server information to the DHCP server if these do not already exist.

A DHCP server need not be located inside the layer 2 network used; a DHCP relay agent on the router into the network can allow a remote DHCP server to provide DHCP services. This can be done with software routers such as NSX-T as well as with physical switches.

If customers are not willing to add the DSM environment to their central DHCP, consider once again configuring a Linux VM with dnsmasq installed to provide such functionality. An online tutorial on how to setup DHCP via dnsmasq is available here. The same server/VM can be configured to run both DNS and DHCP.
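If dnsmasq is used for DHCP, a sketch of the relevant configuration for the application network might look as follows; all ranges and addresses here are placeholders:

```
# /etc/dnsmasq.conf (sketch) for the application network
dhcp-range=192.168.102.100,192.168.102.200,12h
# Hand out gateway, DNS and NTP details alongside the lease
dhcp-option=option:router,192.168.102.1
dhcp-option=option:dns-server,192.168.102.53
dhcp-option=option:ntp-server,192.168.102.1
```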

NTP

NTP, the Network Time Protocol, should be available on all networks. There are many public NTP servers (e.g. pool.ntp.org), as well as servers internal to an organization. Check with the customer on what they want to use. Time synchronization is critical for clustered databases.

TLS Certificates

TLS is a new requirement in DSM v1.4 to secure Provider to S3 Object Store communication.

TLS is required to enable HTTPS communication between DSM and the S3 Object Store that provides the S3 buckets used extensively in DSM v1.4. This may necessitate creating a trusted certificate on the S3 Object Store if one does not already exist. Configuration of the certificate/key is discussed in the S3 storage section of this guide (for when a new S3 Object Store needs to be set up at the customer site to stand up a PoC). When configuring the S3 buckets in the DSM UI, the provider admin is prompted to accept the certificate before the S3 buckets are added to the DSM configuration.

Networking and Object Storage Requirements

Tanzu Network Portal

In order to be able to download DSM updates and database templates from VMware, trust needs to be established between DSM and the Tanzu Network Portal. The Tanzu Network Portal is available at https://network.pivotal.io/. An account will be needed on the portal. After successfully signing into the portal, from the username drop-down menu, select 'Edit Profile' and select the option to Request New Refresh Token. Save the token somewhere safe. This token can later be inserted into the DSM UI to enable secure updates from VMware for DSM.

Static IP addresses requirements (IPv4)

Static IP addresses are required for the following DSM components. Make a note of the static IP addresses before starting the deployment.

  • The Provider appliance requires a static IP address on the control plane network.
  • All Agent appliances require a static IP address on the control plane network. You will need at least one Agent appliance deployed.
  • Any HA/Clustered MySQL database requires a static (virtual) IP address on the application network. This is not a requirement for PostgreSQL database clustering, since it does not use the virtual IP technique for high availability. If deploying standalone databases, static IP addresses are not required since the IP addresses are allocated via DHCP. Although early access is provided to MS SQL Server, it does not support database clustering at this time.

S3 Storage

There is a significant dependency on S3 storage in DSM v1.4. The following are the buckets that are needed.

  1. Bucket for Provider Repository, where database templates are stored.
  2. Bucket for Provider Backups.
  3. Bucket for Provider Logs.
  4. Bucket for database backups (local copy)
  5. Bucket for database backups (cloud copy) - for a PoC, it is not necessary to have an actual cloud based S3. A local S3 bucket can be used for this configuration item.
  6. Bucket for the agent to store a local copy of the database templates. Note that this requires one unique bucket per agent, so if the PoC involves multiple agents, additional S3 buckets will be required.
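If the S3 Object Store is MinIO, the buckets above can be pre-created from the command line with the MinIO client (mc). The alias, endpoint, credentials and bucket names below are placeholders for the PoC environment:

```
$ mc alias set poc https://XX.XX.XX.XX:9000 ACCESS_KEY SECRET_KEY
$ mc mb poc/provider-repo poc/provider-backup poc/provider-logs
$ mc mb poc/db-backup-local poc/db-backup-cloud
```

Do not pre-create the per-agent template bucket; as described later in the agent on-boarding section, that bucket must not already exist and is created automatically.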

If the customer does not have an on-premises/local S3 object store, the next section will show how to provision a MinIO S3 Object Store in a Linux VM.

Deploying a MinIO S3 Object Store

As highlighted in the requirements, DSM v1.4 has a dependency on an S3 Object Store. This is storage that many vSphere customers may not have on-premises in their datacenter, in which case the PoC team will need to set one up locally at the customer site.

Quickstart for MinIO on Linux

MinIO provides a Quickstart guide at https://min.io/docs/minio/linux/index.html#quickstart-for-linux. This can be used to create a simple set of S3 buckets for the purposes of DSM testing during a PoC. You will need a VM running Linux to be available in the customer's environment in order to install MinIO.

Create a Public Certificate and Private Key for MinIO

Caution: Avoid using the certgen tool for creating the public cert/private key (https://min.io/docs/minio/linux/operations/network-encryption.html). This caused issues that resulted in ERROR 'NoneType' object has no attribute 'decode' when trying to send backups to the TLS-enabled S3 Object Store, because the MinIO certgen tool builds a certificate that does not include the Common Name (CN) entry. Instead, PoC staff should create an openssl.cnf file on the MinIO server as follows, where XX.XX.XX.XX is the IP address of the Linux VM where MinIO is installed. Adjust the other distinguished name entries as required.

[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no

 

[req_distinguished_name]
C = IE
ST = Cork
L = Cork
O = VMware
OU = CIBG
CN = XX.XX.XX.XX

 

[v3_req]
subjectAltName = @alt_names

 

[alt_names]
IP.1 = XX.XX.XX.XX
DNS.1 = XX.XX.XX.XX

Use the openssl command to create a private key and a public certificate for MinIO, then copy them to ~/.minio/certs on the MinIO server. This directory is under the home directory of the user which is launching the MinIO service.

$ openssl genrsa -out private.key 2048
$ openssl req -new -x509 -days 3650 -key private.key -out public.crt -config openssl.cnf

 

$ cp public.crt ~/.minio/certs/public.crt
$ cp private.key ~/.minio/certs/private.key

The MinIO S3 Object Store is now ready to be used with VMware Data Services Manager. Note that the MinIO web interface/console uses port 9001 by default, whereas the port used for bucket access is 9000.
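Before pointing DSM at the Object Store, it is worth verifying the TLS endpoint from another VM, for example with curl against MinIO's unauthenticated health endpoint (the IP address is a placeholder):

```
$ curl --cacert public.crt -I https://XX.XX.XX.XX:9000/minio/health/live
```

A 200 response indicates that the server is up and that the certificate validates against public.crt.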

Deploying Kubernetes and Ceph to provide S3 Object Store

If your customer is already familiar with Kubernetes, and is in agreement to provide a K8s cluster for the PoC, then another option is to use Ceph for the S3 Object Store.

Network Diagram

If the requirements have been followed, we can envision a starting infrastructure as something similar to the following.

The following tasks now need to be completed to successfully deploy the Provider. The Provider appliance, shipped as an OVA, will connect to both the management and control plane networks.

Deployment

Deploy Provider OVA

 

The Provider Appliance is available for download on the VMware Customer Connect site. In this example, the OVA has been downloaded locally and will be selected as a local file during the deployment.

In this example, the following networks are chosen to deploy the provider:

  1. Management - VM-51-DPG - 192.168.200.0/24
  2. Control Plane / RabbitMQ - VM-32-DPG - 192.168.100.0/24

You will need two static IP addresses for the Provider configuration - one on the management network and the other on the control plane network.

Start the process from the vSphere client by selecting the option to Deploy OVF Template from a cluster object in the vCenter inventory.

Select Template

 

Select a name and folder

Unless previously created, the only available folder will be Discovered virtual machines. You may wish to make the Provider VM name something simpler than the default shown below, particularly if you wish to add the Provider to DNS and have the VM name match. The 1.4.0 part of the default name is not ideal for DNS, which will likely treat those dots as part of the FQDN (fully qualified domain name).

Select a compute resource

If selecting a Cluster as a compute resource, VMware vSphere DRS must be enabled on the cluster for initial placement. If not, an ESXi host will need to be selected.

Review details

You may ignore the warning about the advanced configuration options - this warning is expected.

Select storage

Select the vSphere datastore (storage) on which to provision the Provider. The Encrypt option is usually disabled. If you select an appropriate VM Storage Policy, it will organize the available datastores into compatible and incompatible (with respect to the storage policy). This can be useful for vSAN and vVol datastores to leverage particular features of the back-end storage; however, any of the displayed datastores may be chosen. In the example below, the vSAN Default Storage Policy was selected, and the vSAN datastore (shown as compatible with the policy) is selected. If you do not want Storage DRS to move the virtual machine disk to other datastores, the 'Disable Storage DRS' option may be checked.

 

Select networks

At this point, 2 different networks need to be chosen. The Provider will have a virtual NIC configured onto both networks. The Management Network will be the network on which the Provider User Interface (UI) will appear. The Control Plane Network, also known as the RabbitMQ network, is the network on which the Provider and Agents communicate.

 

Customize template

This is where the specific details of this provider are added. As shown, the first fields relate to the Provider Admin. This is the persona who will login to the Provider portal to begin the rest of the configuration. You should provide the appropriate email for the provider admin in your setup. The remaining settings in this wizard relate to passwords.


As the configuration continues, we reach the networking configuration. Remember that there are 2 networks on the provider, one for the management network and the other for the control plane network. Populate the relevant details about the networks, including gateways, DNS, NTP and of course the static IP addresses and netmasks for each of the network interfaces on the provider.

The final few entries in the customization section relate to the control plane network.

Ready to complete

If everything looks like it has been populated correctly, you can click on the Finish button and begin the deployment of the Provider.

 


Once the provider has been successfully deployed, power it on via the vSphere Client.

Appliance Initialization

Wait approximately 5-10 minutes for the Provider appliance to power up and configure itself. To confirm that the appliance is ready, you can open the VM console in the vSphere Client and monitor the progress. Once you observe the 'Started Appliance Initialization script' message, you should be able to point a browser to the IP address of the Provider using https:// and see the Provider login page.

 

Configure Dashboard

 When the login screen appears, login using the email address of the Provider admin that you added during the Customization template step of the Provider OVA deployment. You also added the password during this step. This will drop you into the Provider UI dashboard.


There is already one Organization created: the Provider Organization, built using the details added during the Customize template step of the Provider OVA deployment. The one user which exists is the Provider Administrator - the user that you just logged in as, and part of the Provider Organization. We can now begin configuring VMware Data Services Manager.

Create Tenant Org

At this point, whilst still logged in as the Provider Admin, you can proceed with creating one or more organizations (or Orgs). Select Organization in the menu on the left-hand side, followed by Create Organization in the Provider UI. You will need to provide the Org with a name, and decide who the owner is by adding an email address (this user is not automatically added as an Org User or Org Admin - that is an additional step that needs to be done next). You also have to provide a DB FQDN Suffix. This is the domain name used for the database VMs when they are deployed in the organization. Lastly, you need to decide on a VM Configuration Mode. The default is 'Plans Mode', meaning that you will only be able to build database VMs that are listed in the VM Plans; you will not be able to create VMs with resources that are not listed in the plans. Selecting 'Free Mode' allows end users to specify DB VM resources at deployment time.

In this example, I have created a tenant org called newdevs. The owner is marc@corlab.ie, and the domain name is newdevs.internal. Therefore, if a database virtual machine called db1 is deployed in this organization, it will have an FQDN of db1.newdevs.internal. Lastly, any DB VMs deployed in this organization can only have CPU & memory resources which have been defined in a VM Plan.


Create Tenant Org Admin & User

As mentioned, the user is not added as part of creating an organization; this should be done next. From the dashboard, select Users followed by Create User in the Provider UI. The user role can be set as either an Org Admin or an Org User. The password which is supplied here is only valid for the first login; on first login, the user will need to change it.


Create VM Plans

At this point, the Provider Admin can also build VM Plans. This prevents Org Users/developers from creating monster VMs for their databases. Below is a selection of values I chose for demonstration purposes (Gold, Silver, Bronze). It would be worthwhile to discuss right-sizing of database VM resources with the customer when creating the VM Plans.


Configure Tanzu Net Token

Navigate to Settings. Select the Information tab. Note that the Tanzu Net Token is not configured. In order to be able to download updates from VMware, trust needs to be established between DSM and the Tanzu Network Portal. As part of the prerequisite steps, a UAA Token was created and stored safely. Click on Actions to add the token.


If the token is valid, the Tanzu Net Token should change to Configured, as shown below.


Configure S3 Storage

Navigate to Settings and select the Storage Settings tab. This is where we inform DSM about the various buckets that are available. As highlighted earlier, we deployed a MinIO Object Store on an Ubuntu Linux VM and added a certificate to support TLS; as of v1.4 of DSM, all bucket communication requires TLS. Note that there are 3 Provider buckets that need to be configured. There is also a requirement to configure 2 database backup buckets, one local and one cloud. However, there is no validation of where the database backup buckets are located, so both the local and cloud buckets can be placed on the same on-prem MinIO S3 Object Store for the purposes of the PoC. Click on the three vertical dots in the actions column to configure each of the external storages for the Provider. To add database backup storages, click on the +Create link. You will be prompted to accept the certificate from the MinIO server on the first bucket addition. Once accepted, this certificate is added to the list of Provider certs, establishing secure communication between the Provider and the S3 Object Store server. After adding the appropriate S3 entries, the Storage Settings should look similar to the following.


Publish Database Templates

Once the S3 Provider buckets have been configured and the Tanzu Net Token has been added, the database templates should begin downloading to the Provider Repo bucket. Once downloaded, they appear in the Database Templates view. You can now show customers how the Provider Admin can control database versions and vendors, as only Published databases become available for deployments. To begin with, database templates are Available; click on the three vertical dots in the actions column for each template to Publish it. For the purposes of the proof of concept, publish all of the databases.


Environments

Environments do not become available until at least one agent has been deployed/onboarded. Since namespaces are built using Environments, it will also not be possible to create a namespace until an agent is deployed. When you check an Environment before an agent is onboarded, the following status of 'Manifest Not Processed' will be observed.


 

Deploy Agent OVA

If the Provider OVA deployment has been successful, we can envision a starting infrastructure as something similar to the following.

The following tasks now need to be completed to successfully deploy the Agent.

Download Agent Appliance

The Agent Appliance is available for download on the VMware Customer Connect site. In this example, the OVA has been downloaded locally and will be selected as a local file during the deployment. You will notice that the deployment is very similar to the Provider appliance. In this example, the following network is chosen to deploy the agent. The agent only needs to communicate to the Provider on the control plane network, but this network needs a route to the management network so that it can communicate to the vSphere infrastructure / vCenter server:

  1. Control Plane / RabbitMQ - VM-32-DPG - 192.168.100.0/24

You will need a static IP address available on the control plane network for the Agent configuration.

Start the process from the vSphere client by selecting the option to Deploy OVF Template from a cluster object in the vCenter inventory.

Select Template

 

Select a name and a folder

Provide a unique name for the agent appliance/virtual machine. If a folder is not selected, it is placed in the 'Discovered Virtual Machines' folder by default.

 

Select a Compute Resource

If selecting a Cluster as a compute resource, VMware vSphere DRS must be enabled on the cluster for initial placement. If not, an ESXi host will need to be selected.

Review details

You may ignore the warning about the advanced configuration options - this warning is expected.

 

Select storage

Select the vSphere datastore (storage) on which to provision the Agent. If you select an appropriate VM Storage Policy, it will organize the available datastores into compatible and incompatible (with respect to the storage policy). This can be useful for vSAN and vVol datastores to leverage particular features of the back-end storage; however, any of the displayed datastores may be chosen. In the example below, the vSAN Default Storage Policy was selected, and the vSAN datastore (shown as compatible with the policy) is selected.

 

Select networks

This is probably the most significant difference from the Provider deployment: an Agent only requires a single network to be chosen. This is the Control Plane network, also known as the RabbitMQ network, the network on which the Provider and Agents communicate. It should match the control plane network chosen during the Provider deployment.

Customize template

This is where the specific details of this agent are added. Here, details about the Provider are supplied.

Once the Provider details are added, you must add specific details about the Agent itself, including its static IP address on the control plane network.

Ready to complete

If everything looks like it has been populated correctly, you can click on the Finish button.

Once the agent has been successfully deployed, power it on via the vSphere Client.

Appliance Initialization

Wait approximately 5-10 minutes for the agent to power up and configure itself. To confirm that the appliance is ready, you can open the VM console in the vSphere Client and monitor the progress. Once you observe the 'Started Appliance Initialization script' message, you should be able to point a browser to the IP address of the agent using https:// and begin the agent onboarding process.

 

Agent configuration

 

To initiate the onboarding process, point a browser to the IP address or FQDN of the agent VM. Login as user root using the password that you provided during the Agent OVA customization step. Click Connect.

Add Provider Authentication

The first step when onboarding the agent is to authenticate with the Provider. To do this, supply the FQDN/IP address of the provider along with the Provider Admin details.


If a thumbprint pops up, click Continue to accept it, and copy the certificate to the agent.

Once authentication succeeds, you can select the Provider Organization (there is only one) and then select whether you are creating a new Environment, or trying to recover an existing one. We are creating a new one in this PoC, so that is what is selected. Now click continue.

 

vCenter and Namespace Configuration

 

Now you add the vCenter / vSphere details.

In the requirements section, it was mentioned that two SSO users needed to be configured in vCenter. One user is used for management, whilst the other is used for monitoring.


If a thumbprint pops up, click Continue to accept it, and copy the certificate to the agent.


After adding the management user, you will be prompted to add the SSO user with the read-only credentials.


Add Environment Details - Placement Configuration, Datastores and Network Configuration

In step 3, you are prompted to select the datacenter, cluster, resource pool (optional) and the virtual machine folder. You are also prompted to select which datastores and which networks to use for the application and control plane. Note that if the SSO users have not been configured correctly, you will receive errors when trying to select the various vSphere objects after the Connect button is clicked. Also note that this configuration has selected a vSphere Resource Pool called Silver-RP. This is optional; if no Resource Pool is chosen, all of the vSphere resources in the cluster are made available in the Environment. (Resource Pools need to be created on the vSphere infrastructure in advance of onboarding the agent and creating the environment.)

The datastore and network configuration section allows us to select multiple datastores and application networks which can then be selected as Namespaces are built in the next step. However, for the purposes of the proof-of-concept, we are only choosing a single datastore and a single application network.


Add S3 bucket for Database Template Storage

Step 4 involves choosing an S3 bucket for the agent's copy of the database templates. Note that this bucket must not already exist - this step automatically creates it. The endpoint will need to be modified to suit your specific S3 Object Store settings. The template storage name and the bucket name do not need to match, but for simplicity and future reference, it is easier to give them the same name. Note also that the endpoint URL uses https - as of DSM v1.4, TLS is a requirement for S3 Object Stores. Each agent has its own S3 database template bucket. Click Connect to validate that the S3 bucket configuration is correct.


Summary

You are now presented with an Agent Environment Summary. If everything looks correct, you can proceed with clicking on the Save button located at the bottom-right of the page. This will restart the Agent appliance, and will require you to log back into the agent. It should now show that on-boarding was successful, with all RabbitMQ shovels up and running. A RabbitMQ shovel is a core plugin that moves messages from a source to a destination.


Provider Deployment Part 2

If the Agent deployment has been successful, we can proceed with completing the Provider configuration, namely verifying that the Environment is now visible and that a Namespace can be created. Log back into the Provider portal as the Provider Admin.

Check Environment onboarding

The agent has now been deployed, so the environment should appear in the Environments view of the DSM UI. Navigate to Environments, and in the lower part of the window, Onboarded Environments should be visible. Click on the environment to see more details. The environment which we have created currently has 0 namespaces and 0 databases, and also highlights the vCenter Server, the vSphere Cluster and the Resource Pool (if any) that back the environment's resources.


Create Namespace

Now that the Environment has been onboarded and discovered, we can proceed with building a namespace. Building a Namespace allows us to select a subset of the vSphere resources that were selected during the creation of the Environment. Since this proof of concept chose only a single datastore and a single application network during the agent onboarding process, we will be using the same resources when we build the namespace. Note, however, that it is possible to build multiple namespaces on the same environment, selecting different datastores and different application networks for each namespace.

Navigate to Namespaces and click on the Create Namespace button.

Name and Description

Provide a name and optional description for the namespace.


Choose Environment

Select the environment on which to create the namespace.


Choose the database backup storages / S3 buckets

These are the S3 buckets that were created during the Provider setup, and point to our local, on-prem MinIO S3 object store running in an Ubuntu VM. As part of the PoC, both of these buckets were placed on local storage, even though the product calls for a cloud based S3 bucket as well. Select a different bucket for Local and Cloud, as shown below.


Select the datastore

This is the datastore on which the database VMs are provisioned. Since there is only one datastore chosen as part of creating the environment for the PoC, then only one datastore is available for selection here too.


Select the application network

This is the network on which the database VMs are connected. Since there is only one application network chosen as part of creating the environment for the PoC, then only one network is available for selection here too.


Associate Organizations (optional)

The final step is an optional step, and allows you to associate existing organizations with the namespace. This is how you control who can provision databases on these vSphere resources. There should be at least 2 organizations available; the original Provider Org created with the Provider deployment, and the Tenant Org that we created as part of the initial provider setup. Both of these can be selected, which will allow both the Provider Admin and the Tenant Org Admin to provision databases. Once the Orgs have been selected, click on the Create button.

Check Namespace Basic Information

Once the namespace is created, it becomes visible in the DSM UI. Click on the name of the namespace to see more details about it.


All necessary infrastructure is now in place. We can turn our attention to database deployments.

Deploy Databases

 

If both the Provider and Agent deployments have been successful, the infrastructure is now in place to allow us to proceed with the deployment of a standalone database. At present, there are 2 fully supported databases and one with early access support. These are as follows:

PostgreSQL

  • Open-source relational database management system (RDBMS)
  • Support for v10.x – v13.x
  • Templates available from VMware
  • Ships with ~55 PostgreSQL extensions which developers can install as needed

MySQL

  • Open-source relational database management system (RDBMS)
  • Support for v8.0.30
  • Templates available from VMware

MS SQL Server (Early Access)

  • Relational database management system (RDBMS) developed by Microsoft
  • Early Access Model for DSM - build your own template
  • SQL Server container images available on the Microsoft Container Registry (MCR)
  • Support for Developer, Standard and Enterprise Editions

Let's proceed with the deployment of some standalone PostgreSQL and MySQL databases. To do this, we will switch contexts: we will log out as the Provider Admin, and log in as the Org Admin (Marc) that we created during the initial Provider Configuration.

Switch Context to Org Admin

This is the first login as user marc@corlab.ie. Therefore the password will need to be changed from the one which was provided by the Provider Admin during the initial configuration.


After logging in with the original password, you will be prompted to change it.


After successfully changing the password, you will be prompted to log in once more. This time you should be taken to the Org Admin landing page.


Notice that there are some significant differences compared to the Provider Admin landing page. Menu items which are Provider Admin specific are no longer visible. These include VM Plans, Settings, Update Manager and Data Templates. Also note that in the top right of the UI, the Organization field displays the name of the organization (newdevs in this case) for which marc@corlab.ie is the Org Admin.

Deploy Standalone PostgreSQL Database

Now that we have successfully logged in as an Org Admin, we can proceed with the creation of a standalone database.

Let's begin with the creation of a PostgreSQL database, and then follow up with the creation of a MySQL database.

Database Configuration

To provision a standalone PostgreSQL database, navigate to Databases, and click on the "Create DB" button. This will open the Create Database wizard. The namespace should be automatically set to "my-poc-namespace" as this is the only namespace for which user marc is the Org Admin. In the Database Engine section, select PostgreSQL. Note that this automatically populates a number of fields in the wizard, such as VM Name, Database Name, etc. From the Database Version drop-down list, the four published PostgreSQL databases should be listed. We published all four of these database templates when we configured the Provider earlier in the PoC. Select 13.9.0 as this is the latest version supported in DSM v1.4. You may also wish to change the VM Name to something simpler. I changed my VM Name to pg-1 (the name needs to be a minimum of 4 characters).


At this point, it is possible to click on the button to 'Create DB with Default Configuration'. However, if you click 'Next', you can fine-tune the configuration. Click 'Next' to review the other configuration options.

Resource Configuration

Under Resource Configuration, you can choose one of the VM Plans that were created as part of the Provider Configuration. Back when the Organization was created, we specified that the VM configuration should be set to plans mode, and not free mode. Thus, when deploying databases in this namespace, it is only possible to select one of the existing VM Plans for VM resources; it is not possible to deviate from the plans. In this example, I will select the 'Silver' plan with 4 vCPUs and 16GB memory from the available plans. Click Next to proceed once the plan is chosen.


Management Configuration

In the Management Configuration section, it is possible to fine-tune the following:

  1. Monitoring Type - Normal or Enhanced (default)
  2. Backup Configuration, including if backups are automated (they are by default), and what the backup window is
  3. Maintenance Configuration, including if Auto Minor Version Updates is enabled (it is by default), and the maintenance window which defines when the databases can do these minor version updates
  4. DB Cluster Settings, which allows the database to be deployed in a highly available / clustered mode. We will skip this, and show how to do it after the standalone database has been deployed later in the PoC guide.


Alert Configuration

Nothing to add here as part of the PoC. Click Next.


TLS Configuration / DB Options Configuration

Nothing to add here as part of the PoC. However, this may be of interest to the customer if you want to point out that communication to the databases can be secured. Click Next.


Summary

Review the Summary. If everything is configured correctly, click on the 'Create Database' button located at the bottom of the Summary page.


Monitor Standalone PostgreSQL Database Creation

Immediately after clicking on the Create Database button, you will be placed back in the Databases view. The new database should be visible, and will have a state of Init.


Click on the VM Name to see the database details. Make a note of the Username and Password as these will be used shortly. Note also that there is a DB FQDN of primary.<db-name>.<db fqdn suffix>. This identifies the primary when clustering is configured and a number of replicas have been created. The <db fqdn suffix> was defined when this tenant organization was created. Although this is now the fully qualified domain name of the database, it is not necessary to concern ourselves with it for now. We will revisit it when we configure clustering on the database.
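The naming scheme can be sketched with a small helper. This is illustrative only: dsm_db_fqdns is not a DSM API, and in practice the suffix is whatever was defined for the tenant organization.

```python
def dsm_db_fqdns(db_name: str, dns_suffix: str) -> tuple:
    """Return (db_fqdn, primary_fqdn) following the DSM naming scheme.

    db_name and dns_suffix are illustrative inputs; the suffix was
    defined when the tenant organization was created.
    """
    db_fqdn = f"{db_name}.{dns_suffix}"      # e.g. pg-1.newdevs.internal
    primary_fqdn = f"primary.{db_fqdn}"      # tracks the active primary replica
    return db_fqdn, primary_fqdn

# For the pg-1 database in the newdevs.internal domain used in this guide:
print(dsm_db_fqdns("pg-1", "newdevs.internal"))
# → ('pg-1.newdevs.internal', 'primary.pg-1.newdevs.internal')
```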

Note that there are 2 IP addresses associated with the database. One is plumbed onto the control plane network (DB Mgmt IP Address) whilst the other is plumbed onto the application network (DB IP Address). Both addresses are provided via DHCP, as described in the prerequisites section at the beginning of the PoC guide. Also note that the advanced parameters are displayed, and that TLS is disabled. These match the selections made during the deployment.


Click on the Operations tab to see the progress of the database deployment.


Click on the Create DB entry under Operation Type to get even more detail regarding the deployment progress:


The operation detail shows the sequence of events that take place during the deployment of the database.

Note that the final step is to protect and back up the database. This is because we chose automatic backups during the creation of the database. When the above tasks have completed, there should be a total of 3 operations showing success for the database.


The database is now online.

Verify access to Standalone PostgreSQL Database

If the customer does not have access to tooling to verify access to the database, consider using pgAdmin 4 - https://www.pgadmin.org/. This will enable you to connect to the recently provisioned PostgreSQL database and confirm functionality. All connection information is available on the database details view. To connect to your database in the pgAdmin UI, right-click on 'Servers' and select Register. In the General tab, add a name for the database. In the Connection tab, provide the hostname or IP address of the database, change the username from the default of postgres to dbaas, and add the password. Make sure that the IP address you choose is the one from the application network and not the control plane network, i.e., DB IP Address and not DB Mgmt IP Address.

Save the registration details, and you should be connected to the database.


You have now verified that the standalone PostgreSQL database has deployed successfully.
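As a complement to the pgAdmin check, a quick scripted test can confirm that the database endpoint is listening on the application network. This is a stdlib-only sketch; the IP address and port in the example are illustrative and should be replaced with the DB IP Address from the database details view. It only proves the listener is up; it does not authenticate against PostgreSQL.

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (illustrative DB IP Address on the application network):
# port_reachable("10.27.62.41", 5432)
```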

Some final Standalone PostgreSQL Database Tasks

As part of the PoC, you can now show the following additional information about the Standalone PostgreSQL database.

  1. Monitoring - show the 12 different health checks that we run on the database
  2. Backup - show that a full backup has already been taken, and that the backup policy/schedule is in place to run daily at 01:00am
  3. Maintenance and Updates - show that Auto Minor Version Upgrade is enabled. Also show the maintenance policy window which is configured to run over Saturday night / Sunday morning. If a new minor version of the database becomes available, it will be upgraded on Saturday night / Sunday morning. Current versions are also displayed.
  4. VM Settings - show that they match the VM Plan. Under Actions, the VM Plan for Compute & Memory can be modified. The VMDK (virtual disk) of the VM can also be extended. Note that both of these operations are disruptive and require the VM/database to be rebooted.
  5. Cluster Settings - this is used to set up a highly available, clustered database. This will be revisited shortly.
  6. Logs - show how to generate a log bundle for the database. Note that the log bundle is stored on an S3 bucket, so when you go to retrieve them, you will be redirected to the S3 Object Store where they are hosted.
  7. Clone - clone the database. This is a common workflow for creating test copies.

Deploy Standalone MySQL Database

Let's create a new standalone database, but this time we shall create a MySQL database. Note that there is only a single database template for MySQL in DSM v1.4.

Database Configuration

Return to the Databases view, whilst continuing to be logged in as the Org Admin (marc). Click on the 'Create DB' button. The namespace value is auto-populated as there is only one namespace associated with this organization. This time, select the MySQL Database Engine. The Database Version should be set to 8.0.30 in DSM version 1.4 as this is the only database template we have for MySQL. Note that the VM Name is once again rather complex; you may like to simplify it. The rest of the configuration settings may be left at the default. This time, you can speed up the creation process by clicking on the 'Create DB with Default Configuration' button.


Now everything should behave exactly as before. You can use the Operations tab to once more track the progress of the database deployment, just like we did for the PostgreSQL database deployment.

Verify access to Standalone MySQL Database

A useful tool for accessing MySQL databases is the MySQL Shell - mysqlsh - https://dev.mysql.com/downloads/shell/. There is extensive documentation for the shell, such as https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-commands.html. Here is an example of connecting to a MySQL database that has been provisioned via DSM. Once more, be sure to connect to the database interface on the application network and not the interface on the control plane network.

chogan@choganPJHD2 ~ % mysqlsh --uri mysql://dbaas@<IP Address>:3306
Creating a Classic session to 'dbaas@<IP Address>:3306'
Please provide the password for 'dbaas@<IP Address>:3306': *********
Save password for 'dbaas@<IP Address>:3306'? [Y]es/[N]o/Ne[v]er (default No): Y
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 4758
Server version: 8.0.28-19 Percona Server (GPL), Release 19, Revision 31e88966cd3
No default schema selected; type \use <schema> to set one.
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.

MySQL ssl  JS > \sql
Switching to SQL mode... Commands end with ;

MySQL ssl  SQL > show databases;
+--------------------+
| Database           |
+--------------------+
| dbaas              |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.0508 sec)

MS SQL Server (Early Access)

Deploying an MS SQL Server database is outside the scope of this PoC Guide. However, the instructions to create a template using an MS SQL Server image are available here: https://cormachogan.com/2022/12/15/vmware-data-services-manager-sql-server-database-template/

There is currently no cluster support for MS SQL Server in VMware Data Services Manager.

Clustered Database Deploy

We shall now look at the final part of the PoC Guide: the deployment of a clustered database for high availability. In the case of PostgreSQL, 2 Read Replicas are added to the database to satisfy clustering requirements.

Conditional Forwarder on Windows Server DNS

Additional DNS requirements exist for clustered databases. The various VMs that make up the clustered database need to resolve the FQDNs of the other VMs. There is also a primary.<db-name>.<dns suffix> record that needs to be resolved, as this represents the active primary replica in the database. All of the FQDN entries for the databases are updated automatically in the Provider's /etc/hosts. The Provider also runs a resolver, so one of the easiest ways to integrate DSM is to configure the customer's central DNS to forward FQDN queries for the database domain (in our case newdevs.internal) to the Provider VM. This DNS resolution needs to occur over the application network.
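Conceptually, a conditional forwarder just routes queries by domain suffix. The toy model below illustrates that decision; pick_dns_server is a simplification for illustration, and the server addresses are the lab values used elsewhere in this guide (10.27.51.135 for the Provider VM, 10.27.51.252 for the central DNS).

```python
def pick_dns_server(hostname: str, forwarders: dict, default_server: str) -> str:
    """Mimic a DNS conditional forwarder: queries for a configured domain
    (or any name under it) go to that domain's designated server;
    everything else goes to the default resolver."""
    for domain, server in forwarders.items():
        if hostname == domain or hostname.endswith("." + domain):
            return server
    return default_server

# Queries for the database domain are forwarded to the Provider VM:
forwarders = {"newdevs.internal": "10.27.51.135"}
print(pick_dns_server("primary.pg-1.newdevs.internal", forwarders, "10.27.51.252"))
# → 10.27.51.135
```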

For example, if I take my lab DNS server, which is running on Windows Server, I can simply add a conditional forwarder so that any requests for the newdevs.internal domain are sent on to my Provider VM, which can resolve them.

From the DNS config, select Conditional Forwarder > New Conditional Forwarder:


Add the DNS domain, in this case newdevs.internal. Add the IP address of the Provider VM as the master server for this domain. You may get a warning about SOA (Start of Authority).


Once the conditional forwarder has been added, we can test it to check if it is working.


We can use nslookup to verify (a) that the Provider is resolving the necessary IP addresses of our databases, and (b) that the central DNS is now conditionally forwarding the lookup requests to the Provider. Below is a test on the standalone PostgreSQL database. We can see that both names resolve to the same IP address.

chogan@choganPJHD2 ~ % nslookup pg-1.newdevs.internal 10.27.51.135
Server: 10.27.51.135
Address: 10.27.51.135#53

Name: pg-1.newdevs.internal
Address: 10.27.62.41

chogan@choganPJHD2 ~ % nslookup pg-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: pg-1.newdevs.internal
Address: 10.27.62.41

chogan@choganPJHD2 ~ %

The same is true for the MySQL standalone database. In fact, if we do a lookup against the primary DNS name associated with this database, it resolves to the same IP Address as the database name itself.

chogan@choganPJHD2 ~ % nslookup mysql8-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: mysql8-1.newdevs.internal
Address: 10.27.62.48

chogan@choganPJHD2 ~ % nslookup primary.mysql8-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: primary.mysql8-1.newdevs.internal
Address: 10.27.62.48

Everything appears to be working as expected for DNS resolution. We can now proceed with enabling clustering on the standalone databases.

Warning: It is important to make note of which user is creating the databases, and which Org they are part of. Different Orgs have different DNS suffixes, so it is possible to inadvertently create a database as a user whose Org has a DNS suffix you did not intend. This can impact DNS lookups if conditional forwarders are used to point back to the Provider DNS server based on domain names.

Clustered Database Deploy - PostgreSQL

The steps to build a clustered database are very simple. Navigate to the database that you wish to make highly available, and select Cluster Settings.


Click on the '+Create' button to add a Read Replica. The namespace value is auto-populated as there is only one namespace associated with this organization.


Provide a Name for the Replica VM. Click on Create, and this will begin the creation of two new VMs. One is the PG_Monitor, which hosts a service called pg_auto_failover. pg_auto_failover is an extension and service for PostgreSQL that monitors and manages automated failover of a PostgreSQL cluster, and it requires a dedicated monitor node to run on. In some respects, the PG_Monitor could be considered a sort of witness node: it observes the state of the database nodes, coordinates the cluster, and initiates failovers when appropriate.

Note that the Operations for the deployment of the PG_Monitor should be observed via the PG_Monitor database, but the Operations for the deployment of the Replicas should be monitored from the primary database, in this case pg-1.


Once the PG_Monitor VM comes online, the first Read Replica will start to deploy. A successful deployment of the PG_Monitor also confirms that the DNS forwarder is working correctly for name resolution.


After a successful deployment of the Replica, the Replication Status should become Active within a few minutes. However, note that the HA Status is still Incomplete. This is because another replica is required.


To create another replica, click on +Create once more, and create another Read Replica.


Since the PG_Monitor already exists, this step should be much quicker this time around. Again, to monitor the deployment of the replica, look at the operations on the Primary.


And after a few minutes, the Replication Status of both Replicas should show Active, and the HA Status should become Complete.


The PostgreSQL database is now clustered and is highly available.

Verify access to Clustered PostgreSQL Database

We can do a very simple test against the primary FQDN, and make sure that it does indeed connect to the primary database. To begin, we can use nslookup to report back the IP address of the instance/replica which has the primary role. To start, it reports back the IP address of pg-1.

chogan@choganPJHD2 ~ % nslookup pg-1-replica-2.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: pg-1-replica-2.newdevs.internal
Address: 10.27.62.57

chogan@choganPJHD2 ~ % nslookup pg-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: pg-1.newdevs.internal
Address: 10.27.62.41

chogan@choganPJHD2 ~ % nslookup primary.pg-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: primary.pg-1.newdevs.internal
Address: 10.27.62.41

chogan@choganPJHD2 ~ %

As can be observed above, the primary.pg-1 IP address is the same as the pg-1 IP address. Thus, any connection to the primary FQDN will land on the pg-1 database. Let's now promote one of the read replicas to the primary role and see if the primary FQDN gets associated with it. Promotion of replicas can easily be done from the DSM UI. Navigate to the Database > Cluster Settings. Then, click on the three vertical dots in the Action column of the Read Replica you wish to promote. This will display an option to Promote Replica. Select this option.


The Read Replica status will change to Modifying.


Note that you may observe a critical status against the replica that has been promoted until a full backup has been taken of the replica. The reason for the critical status is due to the database bin log health status.


Once the database replica has been protected and backed up, which has been configured in this PoC to happen automatically, the critical status should clear after ~ 5 minutes.


If we do another nslookup check now, the primary FQDN should now resolve to the IP address of the read replica that has just been promoted to primary.

chogan@choganPJHD2 ~ % nslookup primary.pg-1.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: primary.pg-1.newdevs.internal
Address: 10.27.62.56

chogan@choganPJHD2 ~ % nslookup pg-1-replica.newdevs.internal 10.27.51.252
Server: 10.27.51.252
Address: 10.27.51.252#53

Non-authoritative answer:
Name: pg-1-replica.newdevs.internal
Address: 10.27.62.56

chogan@choganPJHD2 ~ %

Success! This is working as expected as we have configured our central DNS to send requests for the newdevs.internal domain to the Provider VM for resolution. The Provider VM is maintaining this updated list in its /etc/hosts.

Clustered Database Deploy - MySQL

The steps to configure a highly available MySQL database are much the same as for PostgreSQL. However, the implementation is different. Whereas PostgreSQL relies on the PG_Monitor to deal with failovers, MySQL uses a virtual, front-end IP address for the database. This has to be a static IP address, as pointed out in the requirements section of the PoC guide. Thus, clients connect to this virtual IP address, and failovers and promotions can take place on the cluster while clients remain connected to the front-end IP address. The static IP needs to be on the Application Network.

We will use the mysql8-1 standalone database that was created earlier. Select the database and navigate to the Cluster Settings view. From there, just like we did for the PostgreSQL database, click on the '+Create' button under the Replication section. Note the difference - this time you are requested to add a front-end, virtual IP address for the cluster. The namespace value is auto-populated as there is only one namespace associated with this organization.


After providing a name for the replica and a Cluster IP, and ensuring that the IP address is valid for the Application Network, click on the 'Create' button.


This will start the creation of the first read replica. Note that in the case of MySQL, no additional VMs, such as the PG_Monitor, are required.

As before, monitor the Read Replica deployment from the primary database (mysql8-1) Operations view, not from the Replica's Operations view.

Once the clone task completes, within a few minutes, the Replica should show Online and Active, but the HA Status is still Incomplete.


To bring the HA Status to Complete, we have to add another replica. Click on '+Create' once more, and provide the name for the new replica. Note that the Cluster IP is now populated and cannot be changed.


After a few minutes, the new replica is cloned and configured. Its status changes to Online and the replication status becomes Active. At this point, the HA Status shows as Complete.


Everything looks good. The MySQL database is configured as a cluster. Note that there is no reliance on the primary FQDN for this database. Instead clients should use the Cluster IP.

Clients should not connect to the default port of 3306 either. Instead, they should use the following ports with the Cluster IP:

  • Port 6446 - Read-Write Connection
  • Port 6447 - Read-Only Connection

The MySQL Router component of MySQL Server should then route the client connection to the correct VM. However, there is currently a known issue.

Verify access to Clustered MySQL Database - Warning!

It is possible that the Cluster IP address becomes associated with a Read Replica and not the Read-Write primary. This is expected. The guidance is that we should not use the default port of 3306 when connecting to the Cluster IP of a clustered MySQL database as this will connect a client to the local instance of the VM that has the Cluster IP. Thus, a client could end up connecting to a read-only replica.

With a clustered MySQL database, the intention is that a client should specify port 6446 if they want a read-write connection to the database, or specify port 6447 if they want a read-only connection.

The MySQL Router component of MySQL Server should then route the client connection to the correct VM, i.e. the read-write instance.
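To keep the port rules straight, a small helper can build the right connection URI. This is an illustrative sketch, not a DSM API; the dbaas user matches this guide, but the Cluster IP value is an example.

```python
def mysql_cluster_uri(user: str, cluster_ip: str, read_only: bool = False) -> str:
    """Build a mysqlsh-style URI for a DSM-clustered MySQL database.

    Per the guidance above: MySQL Router listens on 6446 (read-write)
    and 6447 (read-only) on the Cluster IP. Port 3306 would connect to
    the local instance of whichever VM currently holds the Cluster IP,
    which may be a read-only replica.
    """
    port = 6447 if read_only else 6446
    return f"mysql://{user}@{cluster_ip}:{port}"

# Illustrative Cluster IP; pass the result to mysqlsh --uri
print(mysql_cluster_uri("dbaas", "10.27.62.100"))
# → mysql://dbaas@10.27.62.100:6446
```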

Caution: There is currently a bug due to a change in the MySQL Router defaults. Connections to port 6446 may result in the following error: 

MySQL Error 2026 (HY000): SSL connection error: error:00000005:lib(0):func(0):DH lib

This issue is not yet fixed in MySQL 8.0.28.1 or in 8.0.30.1.

MySQL and Postgres Cluster in vCenter Server

Let's look at how the databases appear in vSphere. At this point, there are 2 clustered databases deployed, one for PostgreSQL and another for MySQL. They appear in the vCenter inventory as virtual machines.


DRS Anti-affinity Rules

Note that they are placed in a Resource Pool. This is because when we on-boarded the agent, we selected a Resource Pool for this environment. The other interesting thing to note is the integration with VM Affinity/Anti-affinity Rules, which place the different VMs of the database on different ESXi hosts. This means that if there is an ESXi host failure, only one VM is impacted, and the database remains accessible. There are two sets of rules, one for the PostgreSQL database and the other for the MySQL database.


 

About the Author


I hate it when you find a document but there is no reference to the author. So here it is:

Author: This document was initially drafted by Cormac Hogan.

It is now maintained here on core.vmware.com by John Nicholson.
