vSphere DataSets

What are vSphere DataSets?

vSphere DataSets provide an easy method to distribute small, infrequently changing data between the vSphere management layer and a guest operating system running in a virtual machine with VMware Tools installed.

The following diagram illustrates the functions that the vSphere API and VMware Tools interfaces can perform on DataSets. Multiple DataSets can be created for a single virtual machine, and each DataSet can store multiple key-value entries.

DataSets API Relation

DataSet files (.dsd) are stored with the virtual machine files, and a reference to the DataSet file is written into the virtual machine configuration file (.vmx).

For example:

dataSetsMgr.diskStoreFile = "ubuntu-server-01.dsd"

 You can see the DataSet files by browsing the virtual machine home directory.

DataSet file location
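As a rough illustration of the .vmx reference above, the following Python sketch scans configuration text for the dataSetsMgr.diskStoreFile key. The helper name find_dataset_store and the sample displayName line are hypothetical, and the parser only handles simple key = "value" lines; it is not a complete .vmx parser.

```python
import re
from typing import Optional

# Matches simple .vmx lines of the form: key = "value"
_VMX_LINE = re.compile(r'^\s*([^\s=#]+)\s*=\s*"(.*)"\s*$')


def find_dataset_store(vmx_text: str) -> Optional[str]:
    """Return the .dsd file name referenced in the .vmx text, or None."""
    for line in vmx_text.splitlines():
        match = _VMX_LINE.match(line)
        if match and match.group(1) == "dataSetsMgr.diskStoreFile":
            return match.group(2)
    return None


# Sample text; only the second line comes from the example above.
sample = (
    'displayName = "ubuntu-server-01"\n'
    'dataSetsMgr.diskStoreFile = "ubuntu-server-01.dsd"\n'
)
print(find_dataset_store(sample))
```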

 

Requirements

DataSets require vSphere 8.0: both vCenter Server and ESXi must be running version 8.0 or later, and each virtual machine must use virtual hardware version 20 or later.

VMware Tools must be installed in the guest operating system for the guest OS to access DataSets. DataSets can be created for a virtual machine before the guest operating system or VMware Tools is installed. A guest OS cannot create or delete a DataSet; those operations must be performed using the vSphere API. VMware recommends the latest available version of VMware Tools.

The vSphere DataSets API requires vSphere DataSet privileges on the virtual machine objects. In the guest OS, the user requires administrator / root level permissions to access the vSphere DataSets.
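The version requirements above can be expressed as a small pre-flight check. This is only a sketch: the function name is hypothetical, and it assumes version strings in a dotted form such as "8.0.1" and a hardware version passed as a plain integer. How you obtain those values from your environment is up to you.

```python
from typing import Tuple


def _major_minor(version: str) -> Tuple[int, int]:
    """Convert a dotted version such as "8.0.1" to (8, 0)."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def meets_dataset_requirements(vcenter: str, esxi: str, hw_version: int) -> bool:
    """Check vCenter >= 8.0, ESXi >= 8.0 and VM hardware version >= 20."""
    return (
        _major_minor(vcenter) >= (8, 0)
        and _major_minor(esxi) >= (8, 0)
        and hw_version >= 20
    )


print(meets_dataset_requirements("8.0.1", "8.0", 20))
```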

Why DataSets?

Better Security. DataSets use the secure vSphere API authentication and can be configured for guest and host access.

Scalability. Multiple DataSets can be configured per virtual machine. Each VM can support up to 100 MB of data across all of its DataSets.

Data Persistence. The data can be persisted across reboots of a virtual machine. DataSets and DataSet entries are part of a VM object and travel with the VM if it is migrated or cloned, even across vCenter instances. DataSets store DataSet entries as key-value pairs. DataSets can be accessed through the vSphere APIs even when a VM is powered off.

Easy to Use. DataSets are created from the vSphere management layer using the vSphere API. DataSet entries are created or modified using the vSphere API or by using VMware Tools commands within the guest OS.

Use Cases for DataSets

Here are some use cases for DataSets.

Guest deployment status. An administrator wants to perform guest provisioning that includes running deployment scripts. DataSets can present configuration data and deployment scripts to the guest. The guest can then report back its status, including any errors.

Guest agent configuration. A management application needs to configure an in-guest agent. The management side creates and populates the DataSet with configuration data, and the guest agent reads and responds to that data.

Guest inventory management. A list of installed applications can be published as a DataSet and read by the vSphere management layer.

DataSets vSphere API

DataSets and DataSet entries are created using vSphere REST APIs. You can use your preferred API interface. In the following examples we are using the Postman application to make API calls to vSphere.


Note: In the following REST API examples, {{vc}} represents the vCenter Server, {{vm-id}} represents the virtual machine managed object ID (MOID), and {{ds-name}} represents the DataSet name.


 

Create a DataSet

Use the create DataSet API to create a new DataSet on a virtual machine. This POST API takes in a few parameters in the body of the API request.

{
"name": "[string]",
"description": "[string]",
"host": "[READ_WRITE | READ_ONLY | NONE]",
"guest": "[READ_WRITE | READ_ONLY | NONE]"
}

For example, we create a new DataSet called dataset-1, give the ESXi hosts read-write permissions and give the guest OS read-only permissions.

POST https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/
{
    "name": "dataset-1",
    "description": "Sample dataset for demo.",
    "host": "READ_WRITE",
    "guest": "READ_ONLY"
}

Postman Create DataSet


A DataSet created as shown above is copied during a clone operation. To exclude a DataSet from clone operations, set the omit_from_snapshot_and_clone parameter to true.

"omit_from_snapshot_and_clone": true

The resulting clone will not have the DataSet of the source VM. A DataSet created with omit_from_snapshot_and_clone set to true is also not persisted across a revert-to-snapshot operation: all such DataSets are deleted when the VM reverts to a snapshot.
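For readers who script the API rather than use Postman, here is a minimal Python sketch that builds (but does not send) the create request described above. The function name, the vcenter.example.com host and the placeholder session value are hypothetical. The vmware-api-session-id header is an assumption about how your session is authenticated and is not taken from the examples in this article.

```python
import json
import urllib.request

ACCESS_LEVELS = {"READ_WRITE", "READ_ONLY", "NONE"}


def build_create_dataset_request(vc, vm_id, session_id, name, description,
                                 host="READ_WRITE", guest="READ_ONLY",
                                 omit_from_snapshot_and_clone=None):
    """Build (but do not send) the POST request that creates a DataSet."""
    if host not in ACCESS_LEVELS or guest not in ACCESS_LEVELS:
        raise ValueError("host and guest must be READ_WRITE, READ_ONLY or NONE")
    body = {"name": name, "description": description,
            "host": host, "guest": guest}
    # Only include the flag when the caller sets it explicitly.
    if omit_from_snapshot_and_clone is not None:
        body["omit_from_snapshot_and_clone"] = omit_from_snapshot_and_clone
    return urllib.request.Request(
        url=f"https://{vc}/api/vcenter/vm/{vm_id}/data-sets/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: session token header for the vSphere REST API.
            "vmware-api-session-id": session_id,
        },
    )


request = build_create_dataset_request(
    "vcenter.example.com", "vm-1024", "<session-id>",
    "dataset-1", "Sample dataset for demo.",
    omit_from_snapshot_and_clone=True)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it; that step is left out here so the sketch can run without a vCenter Server.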


 

List DataSets

Use the list DataSet API to list all DataSets for a given virtual machine. The name and description of all DataSets associated with the given VM are returned.

GET https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/

Postman List DataSets

To see more information about a specific DataSet, simply append the DataSet name to the end of the same API request. You can see the host and guest permissions on the DataSet and the size of the DataSet in bytes.

GET https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}

Query DataSet

 

Set a DataSet Entry

Use the set DataSet entry API to create a key-value entry for a given DataSet. This PUT API takes in the desired name of the entry in the URI and the desired value in the body of the API request. For example, we set a new DataSet entry called key-1 with a value of "value for key-1".

PUT https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/entries/key-1 

Postman Create Entry
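The same call can be sketched in Python. Note the assumptions: the request body is taken to be the value encoded as a JSON string, the vmware-api-session-id header is assumed for authentication, and the function name and host name are hypothetical. Check the API reference for your vCenter version before relying on the body format.

```python
import json
import urllib.request
from urllib.parse import quote


def build_set_entry_request(vc, vm_id, ds_name, key, value, session_id):
    """Build (but do not send) the PUT request that sets one DataSet entry."""
    # Percent-encode the names so unusual characters cannot break the path.
    url = (f"https://{vc}/api/vcenter/vm/{quote(vm_id, safe='')}"
           f"/data-sets/{quote(ds_name, safe='')}"
           f"/entries/{quote(key, safe='')}")
    return urllib.request.Request(
        url=url,
        # Assumption: the body is the entry value as a JSON string.
        data=json.dumps(value).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "vmware-api-session-id": session_id,  # assumed auth header
        },
    )


request = build_set_entry_request(
    "vcenter.example.com", "vm-1024", "dataset-1",
    "key-1", "value for key-1", "<session-id>")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```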

 

Get Entries in a DataSet

Use the get entries API to list all the entries of a given DataSet. This GET API takes in the DataSet name in the URI. For example, we list the entries in the DataSet and get three entries returned.

GET https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/entries

Postman List Entries

 

Get Values of an Entry

Use the get entry value API to get the value of a given DataSet entry. This GET API takes in the DataSet name and the entry name in the URI.

GET https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/entries/{{entry-name}}

Postman List Values
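The read-only endpoints shown so far differ only in how much of the path is filled in, which the following Python sketch makes explicit. The helper names and the host name are hypothetical, and get_json assumes a vmware-api-session-id header for authentication; it is defined for illustration and is not called here.

```python
import json
import urllib.request
from urllib.parse import quote


def datasets_url(vc, vm_id):
    """URL that lists all DataSets of a VM."""
    return f"https://{vc}/api/vcenter/vm/{quote(vm_id, safe='')}/data-sets/"


def dataset_url(vc, vm_id, ds_name):
    """URL that queries one DataSet (permissions, size)."""
    return datasets_url(vc, vm_id) + quote(ds_name, safe="")


def entries_url(vc, vm_id, ds_name):
    """URL that lists the entries of one DataSet."""
    return dataset_url(vc, vm_id, ds_name) + "/entries"


def entry_url(vc, vm_id, ds_name, entry_name):
    """URL that returns the value of one entry."""
    return entries_url(vc, vm_id, ds_name) + "/" + quote(entry_name, safe="")


def get_json(url, session_id):
    """Send a GET request and decode the JSON response (not called here)."""
    request = urllib.request.Request(
        url, headers={"vmware-api-session-id": session_id})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


print(entry_url("vcenter.example.com", "vm-1024", "dataset-1", "key-1"))
```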

 

Delete a DataSet

Use the delete DataSet API to delete a specified DataSet. This DELETE API takes in the DataSet name in the URI.

DELETE https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/

 


Note: A DataSet cannot be deleted if it contains entries. The DataSet must be empty before it can be deleted. For example, if we try to delete a DataSet that contains entries, the API will return the error RESOURCE_IN_USE. See the section Delete Entries below.

{
    "error_type": "RESOURCE_IN_USE",
    "messages": [
        {
            "args": [],
            "default_message": "Data set 'dataset-1' in Virtual Machine 'vm-1024:bbef5c51-323c-4c28-8c43-0b6fc3a785f5' is not empty.",
            "localized": "Data set 'dataset-1' in Virtual Machine 'vm-1024:bbef5c51-323c-4c28-8c43-0b6fc3a785f5' is not empty.",
            "id": "com.vmware.api.vcenter.vm.data_sets.dataset_not_empty",
            "params": {
                "dataset": {
                    "s": "dataset-1"
                },
                "vm": {
                    "s": "vm-1024:bbef5c51-323c-4c28-8c43-0b6fc3a785f5"
                }
            }
        }
    ]
}

You can use the force flag to forcibly delete a DataSet without first deleting the contained entries.

DELETE https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/?force=true
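A client can tell this specific failure apart from other errors by inspecting the error body. The sketch below checks for the RESOURCE_IN_USE type and the dataset_not_empty message ID from the response shown above, and builds the delete URL with or without the force flag. The function names and host name are hypothetical, and the sample body is a trimmed copy of the response above.

```python
import json

NOT_EMPTY_ID = "com.vmware.api.vcenter.vm.data_sets.dataset_not_empty"


def is_dataset_not_empty(error_body: str) -> bool:
    """Return True when an error body reports a non-empty DataSet."""
    error = json.loads(error_body)
    if error.get("error_type") != "RESOURCE_IN_USE":
        return False
    return any(message.get("id") == NOT_EMPTY_ID
               for message in error.get("messages", []))


def delete_dataset_url(vc, vm_id, ds_name, force=False):
    """Build the DELETE URL, optionally adding the force flag."""
    url = f"https://{vc}/api/vcenter/vm/{vm_id}/data-sets/{ds_name}/"
    return url + "?force=true" if force else url


# Trimmed copy of the error response shown above.
sample_error = json.dumps({
    "error_type": "RESOURCE_IN_USE",
    "messages": [{"args": [], "id": NOT_EMPTY_ID}],
})

if is_dataset_not_empty(sample_error):
    print(delete_dataset_url("vcenter.example.com", "vm-1024",
                             "dataset-1", force=True))
```

Whether to retry automatically with force=true is a policy decision; deleting the entries explicitly first is the more cautious option.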

 

Delete Entries

Use the delete entries API to delete a specified entry in a specified DataSet. This DELETE API takes in the DataSet name and the entry name in the URI.

DELETE https://{{vc}}/api/vcenter/vm/{{vm-id}}/data-sets/{{ds-name}}/entries/{{entry-name}}

 

DataSets vmtoolsd Commands

Once a DataSet is present in the VM, the guest OS can access and modify its DataSet entries using a small set of VMware Tools commands.


Note: In the following vmtoolsd command examples we pipe the command to jq for better readability.


 

List DataSets

Use the command vmtoolsd --cmd datasets-list to list all DataSets for the virtual machine.

vmtoolsd --cmd 'datasets-list'

vmtoolsd list datasets

Use the command vmtoolsd --cmd datasets-query to query the DataSet and view host and guest permissions and see the size of the DataSet in bytes.

vmtoolsd --cmd 'datasets-query {"dataset": "[dataset-name]"}' 

vmtoolsd query dataset

List Entries in a DataSet

Use the command vmtoolsd --cmd datasets-list-keys to list the entries in a given DataSet.

vmtoolsd --cmd 'datasets-list-keys {"dataset": "[dataset-name]"}'

vmtoolsd list entries

 

List Values of Entries

Use the command vmtoolsd --cmd datasets-get-entry to list the values of specified entries in a given DataSet.

vmtoolsd --cmd 'datasets-get-entry {"keys": ["[entry-name]","[entry-name]","[...]"], "dataset": "[dataset-name]"}'

vmtoolsd list values
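When these commands are driven from a script inside the guest, generating the JSON argument with a JSON library avoids hand-quoting mistakes. The Python sketch below builds the argument list for the commands shown above; the helper name vmtoolsd_argv and the dataset and key names are hypothetical, and nothing is executed.

```python
import json
import shlex


def vmtoolsd_argv(command, payload=None):
    """Return the argv list for: vmtoolsd --cmd '<command> <json>'."""
    text = command if payload is None else f"{command} {json.dumps(payload)}"
    return ["vmtoolsd", "--cmd", text]


list_datasets = vmtoolsd_argv("datasets-list")
list_keys = vmtoolsd_argv("datasets-list-keys", {"dataset": "dataset-1"})
get_entry = vmtoolsd_argv(
    "datasets-get-entry", {"keys": ["key-1"], "dataset": "dataset-1"})

# shlex.join shows the equivalent shell command line.
print(shlex.join(list_keys))
```

To run a command, pass the list to subprocess.run(argv, capture_output=True, text=True) from an account with administrator or root permissions and decode the output with json.loads.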

 

Set a DataSet Entry

Use the command vmtoolsd --cmd datasets-set-entry to update an entry with a new value. If the entry doesn't already exist, it will be created. For example, take a current DataSet that contains two entries, each with a specific value.

vmtoolsd modify 1

Using the vmtoolsd --cmd datasets-set-entry command we update the values of the existing entries and also create a third new entry and value pair.

vmtoolsd --cmd 'datasets-set-entry { "dataset" : "[dataset-name]", "entries": [{"key": "[entry-name]", "value": "[value]" } ] }'

vmtoolsd modify entry


Note: As you can see, the command quickly becomes very long. This is a good reason to use the JSON in a File method described in the section below.


Inspecting the same DataSet again we can see that existing entry values have been updated and a new entry has also been created.

vmtoolsd modify entry 3
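Because the entries array grows with every key, it can help to generate the datasets-set-entry text from a plain dictionary. This Python sketch does that for the update described above. The function name is hypothetical, and the dataset, key and value names are illustrative.

```python
import json


def set_entry_command(dataset, values):
    """Build the datasets-set-entry command text from a dict of key -> value."""
    payload = {
        "dataset": dataset,
        "entries": [{"key": key, "value": value}
                    for key, value in values.items()],
    }
    return "datasets-set-entry " + json.dumps(payload)


command_text = set_entry_command("dataset-2", {
    "entry-1": "new value 1",      # existing entry, value is overwritten
    "entry-2": "new value 2",      # existing entry, value is overwritten
    "new-entry-3": "new value 3",  # new entry, created by the command
})
print(command_text)
```

The resulting text is what follows --cmd on the vmtoolsd command line.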

Querying the same DataSet through the vSphere API confirms that the modifications made from the guest OS are visible from the vSphere management side.

postman modify entry 1

Appending Entry Values

When you set a DataSet entry with a certain value, it will overwrite the current value. If you want to append additional information to an existing value you can use the "append": true parameter.

For example, if we look at the above output for new-entry-3, it has a value of "new value 3". If we want to add additional data to the entry we would use a syntax like the following:

vmtoolsd --cmd 'datasets-set-entry { "dataset" : "dataset-2", "entries": [{"key": "new-entry-3", "value": ". We added more data", "append": true } ] }'

Now if we were to list that entry, it would show the appended value.

vmtoolsd --cmd 'datasets-get-entry {"keys": ["entry-1","entry-2","new-entry-3"], "dataset": "dataset-2"}' | jq
{
  "result": true,
  "entries": [
    {
      "entry-1": "new value 1"
    },
    {
      "entry-2": "new value 2"
    },
    {
      "new-entry-3": "new value 3. We added more data"
    }
  ]
}
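The difference between overwriting and appending can be modeled with a few lines of Python. This is a local model of the behavior described above, not the VMware Tools implementation. In particular, creating the entry when append is used on a key that does not exist yet is an assumption of this sketch.

```python
def apply_set_entry(store, entries):
    """Apply datasets-set-entry style entries to a local dict (model only)."""
    for entry in entries:
        key, value = entry["key"], entry["value"]
        if entry.get("append") and key in store:
            store[key] += value   # append to the current value
        else:
            store[key] = value    # overwrite, or create a new entry
    return store


store = {"entry-1": "new value 1", "new-entry-3": "new value 3"}
apply_set_entry(store, [
    {"key": "new-entry-3", "value": ". We added more data", "append": True},
])
print(store["new-entry-3"])
```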

 

Delete Entries

Use the command vmtoolsd --cmd datasets-delete-entry to delete the specified entries in a specified DataSet. You can specify one or more entries in the same deletion command.

vmtoolsd --cmd 'datasets-delete-entry {"keys": ["[entry-name]","[entry-name]","[...]" ], "dataset": "[dataset-name]"}'

delete entry vmtoolsd

 

JSON in a File

VMware Tools 11.2 added the --cmdfile option to the vmtoolsd command. This option allows the directive to be specified in a file rather than on the command line, as in the previous examples. Due to the 8 KB limit for shell commands on Windows, and the vagaries of escaping shell syntax on Linux, the command file option is recommended.

In this example, the mycmd.json file contains the datasets-get-entry command and we pass the file to the vmtoolsd process.

JSON in a file
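A script can generate the command file as well. The Python sketch below writes a datasets-get-entry directive to a file and builds the matching vmtoolsd argument list. It assumes the file holds the same text that would otherwise follow --cmd; the function name and the dataset and key names are hypothetical, and nothing is executed.

```python
import json
import tempfile
from pathlib import Path


def write_command_file(path, command, payload):
    """Write '<command> <json>' to a file and return the vmtoolsd argv."""
    # Assumption: the file contains the same text that would follow --cmd.
    text = f"{command} {json.dumps(payload)}"
    Path(path).write_text(text, encoding="utf-8")
    return ["vmtoolsd", "--cmdfile", str(path)]


with tempfile.TemporaryDirectory() as workdir:
    cmd_path = Path(workdir) / "mycmd.json"
    argv = write_command_file(
        cmd_path, "datasets-get-entry",
        {"keys": ["entry-1", "entry-2"], "dataset": "dataset-2"})
    file_text = cmd_path.read_text(encoding="utf-8")
    print(file_text)
```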

 

Frequently Asked Questions

Are vSphere DataSets encrypted by VM Encryption?

vSphere DataSet files are encrypted by VM Encryption. However, it is not advised to store sensitive data within vSphere DataSets.

What permissions are needed by Guest OS users to access vSphere DataSets? 

Guest OS users require administrator or root level permissions to read or write to vSphere DataSets.

Are vSphere DataSets included in VM backup and restore tasks?

VMware is working with backup solution vendors to support backup and restore of vSphere DataSets.

Are vSphere DataSets stored in the vCenter database?

No, vSphere DataSets are stored as .dsd files with the virtual machine files.

 

Known Issues

vSphere DataSets currently have known incompatibilities with some vSphere features and other VMware solutions.

OVF/OVA: vSphere DataSets are not exported/imported when a VM is exported/imported as an OVF/OVA template. 

Content Library: VM templates stored in a Content Library in the OVA/OVF format will not retain any DataSets. Templates stored in the VMTX format will retain their DataSets, and VMs deployed from VMTX format templates will contain any DataSets stored with the template.

Instant Clone: vSphere DataSets are not cloned when cloning a virtual machine using instant clone. 

SRM / HCX / VCDR: vSphere DataSets are not copied when virtual machines are migrated or copied using Site Recovery Manager (SRM), VMware HCX or VMware Cloud Disaster Recovery (VCDR).

References

For more on DataSets, see the DataSets for Guest Management section of the VMware Guest SDK Programming Guide.
