vsphere_compute_cluster
A note on the naming of this resource: VMware refers to clusters of hosts in the UI and documentation as clusters, HA clusters, or DRS clusters. All of these refer to the same kind of resource (with the latter two referring to specific features of clustering). In Terraform, we use vsphere_compute_cluster to differentiate host clusters from datastore clusters, which are clusters of datastores that can be used to distribute load and ensure fault tolerance via distribution of virtual machines. Datastore clusters can also be managed through Terraform, via the vsphere_datastore_cluster resource.
The vsphere_compute_cluster resource can be used to create and manage clusters of hosts allowing for resource control of compute resources, load balancing through DRS, and high availability through vSphere HA.
For more information on vSphere clusters, DRS, and vSphere HA, see the VMware vSphere documentation.
NOTE: This resource requires vCenter and is not available on direct ESXi connections.
NOTE: vSphere DRS requires a vSphere Enterprise Plus license.
Example Usage
The following example sets up a cluster and enables DRS and vSphere HA with the default settings. The hosts have to exist already in vSphere and should not already be members of clusters - it's best to add these as standalone hosts before adding them to a cluster.
Note that the following example assumes each host has been configured correctly according to the requirements of vSphere HA. For more information, see the vSphere HA Checklist.
variable "datacenter" {
  default = "dc1"
}

variable "hosts" {
  default = [
    "esxi1",
    "esxi2",
    "esxi3",
  ]
}

data "vsphere_datacenter" "dc" {
  name = "${var.datacenter}"
}

data "vsphere_host" "hosts" {
  count         = "${length(var.hosts)}"
  name          = "${var.hosts[count.index]}"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_compute_cluster" "compute_cluster" {
  name            = "terraform-compute-cluster-test"
  datacenter_id   = "${data.vsphere_datacenter.dc.id}"
  host_system_ids = ["${data.vsphere_host.hosts.*.id}"]

  drs_enabled          = true
  drs_automation_level = "fullyAutomated"

  ha_enabled = true
}
Argument Reference
The following arguments are supported:
- name - (Required) The name of the cluster.
- datacenter_id - (Required) The managed object ID of the datacenter to create the cluster in. Forces a new resource if changed.
- folder - (Optional) The relative path to a folder to put this cluster in. This is a path relative to the datacenter you are deploying the cluster to. Example: for the dc1 datacenter, and a provided folder of foo/bar, Terraform will place a cluster named terraform-compute-cluster-test in a host folder located at /dc1/host/foo/bar, with the final inventory path being /dc1/host/foo/bar/terraform-compute-cluster-test.
- tags - (Optional) The IDs of any tags to attach to this resource. See the vsphere_tag resource documentation for a reference on how to apply tags.
NOTE: Tagging support requires vCenter 6.0 or higher.
- custom_attributes - (Optional) A map of custom attribute IDs to attribute value strings to set for the cluster. See the vsphere_custom_attribute resource documentation for a reference on how to set values for custom attributes.
NOTE: Custom attributes are unsupported on direct ESXi connections and require vCenter.
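As a minimal sketch of the folder argument, the following assumes that a host folder named foo/bar already exists in the dc1 datacenter. Both names come from the example in the folder description and are otherwise illustrative.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

# Assumption: the host folder /dc1/host/foo/bar already exists in inventory.
resource "vsphere_compute_cluster" "compute_cluster" {
  name          = "terraform-compute-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  # Relative to the datacenter's host folder, not an absolute inventory path.
  folder = "foo/bar"
}
```

With this configuration, the cluster's inventory path would be /dc1/host/foo/bar/terraform-compute-cluster-test.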
Host management options
The following settings control cluster membership or tune how hosts are managed within the cluster itself by Terraform.
- host_system_ids - (Optional) The managed object IDs of the hosts to put in the cluster.
- host_cluster_exit_timeout - (Optional) The timeout for each host maintenance mode operation when removing hosts from a cluster. The value is specified in seconds. Default: 3600 (1 hour).
- force_evacuate_on_destroy - (Optional) When destroying the resource, setting this to true will auto-remove any hosts that are currently a member of the cluster, as if they were removed by taking their entry out of host_system_ids (see below). This is an advanced option and should only be used for testing. Default: false.
NOTE: Do not set force_evacuate_on_destroy in production operation as there are many pitfalls to its use when working with complex cluster configurations. Depending on the virtual machines currently on the cluster, and your DRS and HA settings, the full host evacuation may fail. Instead, incrementally remove hosts from your configuration by adjusting the contents of the host_system_ids attribute.
How Terraform removes hosts from clusters
Hosts can be removed from a cluster by deleting their entries from the host_system_ids configuration setting. Hosts are removed sequentially, by placing them in maintenance mode, moving them to the root host folder in vSphere inventory, and then taking the host out of maintenance mode. This process, if successful, preserves the host in vSphere inventory as a standalone host.
Note that whether or not this operation succeeds as intended depends on your DRS and high availability settings. To give this operation the best chance of succeeding, ensure that no HA configuration depends on the host before applying the host removal operation, as host membership operations are processed before configuration is applied. If there are VMs on the host, set your drs_automation_level to fullyAutomated to ensure that DRS can correctly evacuate the host before removal.
Note that all virtual machines are migrated as part of the maintenance mode operation, including ones that are powered off or suspended. Ensure there is enough capacity on your remaining hosts to accommodate the extra load.
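The removal workflow described above can be sketched as follows, assuming the three-host cluster from Example Usage. Here esxi3 has been dropped from the hosts variable, so the next apply would take it out of the cluster. The 1800-second timeout is an illustrative value, not a recommendation.

```hcl
# esxi3 was previously in this list. Removing it causes Terraform to place
# the host in maintenance mode, move it to the root host folder, and then
# take it out of maintenance mode.
variable "hosts" {
  default = [
    "esxi1",
    "esxi2",
  ]
}

data "vsphere_datacenter" "dc" {
  name = "dc1"
}

data "vsphere_host" "hosts" {
  count         = "${length(var.hosts)}"
  name          = "${var.hosts[count.index]}"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_compute_cluster" "compute_cluster" {
  name            = "terraform-compute-cluster-test"
  datacenter_id   = "${data.vsphere_datacenter.dc.id}"
  host_system_ids = ["${data.vsphere_host.hosts.*.id}"]

  # Per-host maintenance mode timeout, in seconds (illustrative value).
  host_cluster_exit_timeout = 1800

  # Lets DRS evacuate running VMs from the host being removed.
  drs_enabled          = true
  drs_automation_level = "fullyAutomated"
}
```

Removing one host per apply keeps each evacuation small and makes it easier to confirm that the remaining hosts have enough capacity.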
DRS automation options
The following options control the settings for DRS on the cluster.
- drs_enabled - (Optional) Enable DRS for this cluster. Default: false.
- drs_automation_level - (Optional) The default automation level for all virtual machines in this cluster. Can be one of manual, partiallyAutomated, or fullyAutomated. Default: manual.
- drs_migration_threshold - (Optional) A value between 1 and 5 indicating the threshold of imbalance tolerated between hosts. A lower setting will tolerate more imbalance while a higher setting will tolerate less. Default: 3.
- drs_enable_vm_overrides - (Optional) Allow individual DRS overrides to be set for virtual machines in the cluster. Default: true.
- drs_enable_predictive_drs - (Optional) When true, enables DRS to use data from vRealize Operations Manager to make proactive DRS recommendations. *
- drs_advanced_options - (Optional) A key/value map that specifies advanced options for DRS and DPM.
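As a hedged illustration of how the DRS options combine, the sketch below enables DRS with recommendations applied only at initial placement and a slightly stricter migration threshold. The cluster name and the chosen values are assumptions for the example only.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "drs_cluster" {
  name          = "terraform-drs-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  drs_enabled          = true
  drs_automation_level = "partiallyAutomated"

  # 1-5. A higher value tolerates less imbalance between hosts.
  drs_migration_threshold = 4

  # Keep per-VM DRS overrides available (this is also the default).
  drs_enable_vm_overrides = true
}
```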
DPM options
The following settings control the Distributed Power Management (DPM) settings for the cluster. DPM allows the cluster to manage host capacity on-demand depending on the needs of the cluster, powering on hosts when capacity is needed, and placing hosts in standby when there is excess capacity in the cluster.
- dpm_enabled - (Optional) Enable DPM support for DRS in this cluster. Requires drs_enabled to be true in order to be effective. Default: false.
- dpm_automation_level - (Optional) The automation level for host power operations in this cluster. Can be one of manual or automated. Default: manual.
- dpm_threshold - (Optional) A value between 1 and 5 indicating the threshold of load within the cluster that influences host power operations. This affects both power on and power off operations - a lower setting will tolerate more of a surplus/deficit than a higher setting. Default: 3.
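The dependency between DPM and DRS can be shown with a short sketch. The values below are illustrative assumptions; the key point is that dpm_enabled only takes effect when drs_enabled is also true.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "dpm_cluster" {
  name          = "terraform-dpm-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  # DPM is only effective with DRS enabled.
  drs_enabled          = true
  drs_automation_level = "fullyAutomated"

  dpm_enabled          = true
  dpm_automation_level = "automated"

  # 1-5. A lower value tolerates more surplus or deficit before
  # hosts are powered on or placed in standby.
  dpm_threshold = 2
}
```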
vSphere HA Options
The following settings control the vSphere HA settings for the cluster.
NOTE: vSphere HA has a number of requirements that should be met to ensure that any configured settings work correctly. For a full list, see the vSphere HA Checklist.
- ha_enabled - (Optional) Enable vSphere HA for this cluster. Default: false.
- ha_host_monitoring - (Optional) Global setting that controls whether vSphere HA remediates virtual machines on host failure. Can be one of enabled or disabled. Default: enabled.
- ha_vm_restart_priority - (Optional) The default restart priority for affected virtual machines when vSphere detects a host failure. Can be one of lowest, low, medium, high, or highest. Default: medium.
- ha_vm_dependency_restart_condition - (Optional) The condition used to determine whether or not virtual machines in a certain restart priority class are online, allowing HA to move on to restarting virtual machines on the next priority. Can be one of none, poweredOn, guestHbStatusGreen, or appHbStatusGreen. The default is none, which means that a virtual machine is considered ready immediately after a host is found to start it on. *
- ha_vm_restart_additional_delay - (Optional) Additional delay in seconds after ready condition is met. A VM is considered ready at this point. Default: 0 (no delay). *
- ha_vm_restart_timeout - (Optional) The maximum time, in seconds, that vSphere HA will wait for virtual machines in one priority to be ready before proceeding with the next priority. Default: 600 (10 minutes). *
- ha_host_isolation_response - (Optional) The action to take on virtual machines when a host has detected that it has been isolated from the rest of the cluster. Can be one of none, powerOff, or shutdown. Default: none.
- ha_advanced_options - (Optional) A key/value map that specifies advanced options for vSphere HA.
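The following is a minimal sketch of the basic HA options, using only settings that carry no vSphere 6.5 footnote. The cluster name and the chosen priority and isolation response are assumptions for illustration.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "ha_cluster" {
  name          = "terraform-ha-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled         = true
  ha_host_monitoring = "enabled"

  # Restart affected VMs ahead of the default "medium" priority.
  ha_vm_restart_priority = "high"

  # Attempt a guest shutdown of VMs on a host that detects isolation.
  ha_host_isolation_response = "shutdown"
}
```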
HA Virtual Machine Component Protection settings
The following settings control Virtual Machine Component Protection (VMCP) in vSphere HA. VMCP gives vSphere HA the ability to monitor a host for datastore accessibility failures, and automate recovery for affected virtual machines.
Note on terminology: In VMCP, Permanent Device Loss (PDL), or a failure where access to a specific disk device is not recoverable, is differentiated from an All Paths Down (APD) failure, which is used to denote a transient failure where disk device access may eventually return. Take note of this when tuning these options.
- ha_vm_component_protection - (Optional) Controls vSphere VM component protection for virtual machines in this cluster. Can be one of enabled or disabled. Default: enabled. *
- ha_datastore_pdl_response - (Optional) Controls the action to take on virtual machines when the cluster has detected a permanent device loss to a relevant datastore. Can be one of disabled, warning, or restartAggressive. Default: disabled. *
- ha_datastore_apd_response - (Optional) Controls the action to take on virtual machines when the cluster has detected loss to all paths to a relevant datastore. Can be one of disabled, warning, restartConservative, or restartAggressive. Default: disabled. *
- ha_datastore_apd_recovery_action - (Optional) Controls the action to take on virtual machines if an APD status on an affected datastore clears in the middle of an APD event. Can be one of none or reset. Default: none. *
- ha_datastore_apd_response_delay - (Optional) Controls the delay in minutes to wait after an APD timeout event to execute the response action defined in ha_datastore_apd_response. Default: 3 minutes. *
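To show how the PDL and APD responses can be tuned separately, here is a hedged sketch that reacts aggressively to permanent device loss and more conservatively to the transient all-paths-down case. The specific choices are assumptions, and all of these settings require vSphere 6.0 or higher.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "vmcp_cluster" {
  name          = "terraform-vmcp-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled                 = true
  ha_vm_component_protection = "enabled"

  # PDL is not recoverable, so restart affected VMs right away.
  ha_datastore_pdl_response = "restartAggressive"

  # APD may clear on its own, so use the conservative restart mode.
  ha_datastore_apd_response = "restartConservative"

  # Reset VMs if the APD condition clears in the middle of an APD event.
  ha_datastore_apd_recovery_action = "reset"
}
```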
HA virtual machine and application monitoring settings
The following settings control virtual machine and application monitoring in vSphere HA.
- ha_vm_monitoring - (Optional) The type of virtual machine monitoring to use when HA is enabled in the cluster. Can be one of vmMonitoringDisabled, vmMonitoringOnly, or vmAndAppMonitoring. Default: vmMonitoringDisabled.
- ha_vm_failure_interval - (Optional) If a heartbeat from a virtual machine is not received within this configured interval, the virtual machine is marked as failed. The value is in seconds. Default: 30.
- ha_vm_minimum_uptime - (Optional) The time, in seconds, that HA waits after powering on a virtual machine before monitoring for heartbeats. Default: 120 (2 minutes).
- ha_vm_maximum_resets - (Optional) The maximum number of resets that HA will perform to a virtual machine when responding to a failure event. Default: 3.
- ha_vm_maximum_failure_window - (Optional) The length of the reset window in which ha_vm_maximum_resets can operate. When this window expires, no more resets are attempted regardless of the setting configured in ha_vm_maximum_resets. -1 means no window, meaning an unlimited reset time is allotted. The value is specified in seconds. Default: -1 (no window).
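As an illustrative sketch of the monitoring options, the configuration below enables VM heartbeat monitoring with a longer failure interval and a one-hour reset window. All numeric values are assumptions chosen to show the units, not tuned recommendations.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "monitored_cluster" {
  name          = "terraform-monitoring-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled       = true
  ha_vm_monitoring = "vmMonitoringOnly"

  # Mark a VM as failed after 60 seconds without a heartbeat.
  ha_vm_failure_interval = 60

  # Wait 240 seconds after power-on before monitoring heartbeats.
  ha_vm_minimum_uptime = 240

  # Allow at most 3 resets within a 3600-second (1 hour) window.
  ha_vm_maximum_resets         = 3
  ha_vm_maximum_failure_window = 3600
}
```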
vSphere HA Admission Control settings
The following settings control vSphere HA Admission Control, which controls whether or not specific VM operations are permitted in the cluster in order to protect the reliability of the cluster. Based on the constraints defined in these settings, operations such as power on or migration operations may be blocked to ensure that enough capacity remains to react to host failures.
Admission control modes
The ha_admission_control_policy parameter controls the specific mode that Admission Control uses. What settings are available depends on the admission control mode:
- Cluster resource percentage: This is the default admission control mode, and allows you to specify a percentage of the cluster's CPU and memory resources to reserve as spare capacity, or have these settings automatically determined by failure tolerance levels. To use, set ha_admission_control_policy to resourcePercentage.
- Slot Policy (powered-on VMs): This allows the definition of a virtual machine "slot", which is a set amount of CPU and memory resources that should represent the size of an average virtual machine in the cluster. To use, set ha_admission_control_policy to slotPolicy.
- Dedicated failover hosts: This allows the reservation of dedicated failover hosts. Admission Control will block access to these hosts for normal operation to ensure that they are available for failover events. In the event that a dedicated host does not have enough capacity, hosts that are not part of the dedicated pool will still be used for overflow if possible. To use, set ha_admission_control_policy to failoverHosts.
It is also possible to disable Admission Control by setting ha_admission_control_policy to disabled. However, this is not recommended, as it can lead to issues with cluster capacity and instability with vSphere HA.
- ha_admission_control_policy - (Optional) The type of admission control policy to use with vSphere HA. Can be one of resourcePercentage, slotPolicy, failoverHosts, or disabled. Default: resourcePercentage.
Common Admission Control settings
The following settings are available for all Admission Control modes, but take on different meanings in each mode.
- ha_admission_control_host_failure_tolerance - (Optional) The maximum number of failed hosts that admission control tolerates when making decisions on whether to permit virtual machine operations. The maximum is one less than the number of hosts in the cluster. Default: 1. *
- ha_admission_control_performance_tolerance - (Optional) The percentage of resource reduction that a cluster of virtual machines can tolerate in case of a failover. A value of 0 produces warnings only, whereas a value of 100 disables the setting. Default: 100 (disabled).
Admission Control settings for resource percentage mode
The following settings control specific settings for Admission Control when resourcePercentage is selected in ha_admission_control_policy.
- ha_admission_control_resource_percentage_auto_compute - (Optional) Automatically determine available resource percentages by subtracting the average number of host resources represented by the ha_admission_control_host_failure_tolerance setting from the total amount of resources in the cluster. Disable to supply user-defined values. Default: true. *
- ha_admission_control_resource_percentage_cpu - (Optional) Controls the user-defined percentage of CPU resources in the cluster to reserve for failover. Default: 100.
- ha_admission_control_resource_percentage_memory - (Optional) Controls the user-defined percentage of memory resources in the cluster to reserve for failover. Default: 100.
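A minimal sketch of resource percentage mode with user-defined values follows. It assumes a vSphere 6.5 or higher environment, since ha_admission_control_resource_percentage_auto_compute is footnoted as requiring that version, and the 25 percent reservations are illustrative only.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "percentage_cluster" {
  name          = "terraform-percentage-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled                  = true
  ha_admission_control_policy = "resourcePercentage"

  # Turn off automatic calculation so the explicit values below apply.
  ha_admission_control_resource_percentage_auto_compute = false

  # Reserve 25% of cluster CPU and memory for failover (illustrative).
  ha_admission_control_resource_percentage_cpu    = 25
  ha_admission_control_resource_percentage_memory = 25
}
```

Leaving auto_compute at its default of true would instead derive the percentages from ha_admission_control_host_failure_tolerance.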
Admission Control settings for slot policy mode
The following settings control specific settings for Admission Control when slotPolicy is selected in ha_admission_control_policy.
- ha_admission_control_slot_policy_use_explicit_size - (Optional) Controls whether or not you wish to supply explicit values to CPU and memory slot sizes. The default is false, which tells vSphere to gather an automatic average based on all powered-on virtual machines currently in the cluster.
- ha_admission_control_slot_policy_explicit_cpu - (Optional) Controls the user-defined CPU slot size, in MHz. Default: 32.
- ha_admission_control_slot_policy_explicit_memory - (Optional) Controls the user-defined memory slot size, in MB. Default: 100.
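Slot policy mode with an explicit slot size might look like the sketch below. The slot dimensions are assumptions standing in for the size of an average virtual machine in a hypothetical cluster.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

resource "vsphere_compute_cluster" "slot_cluster" {
  name          = "terraform-slot-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled                  = true
  ha_admission_control_policy = "slotPolicy"

  # Supply the slot size explicitly instead of letting vSphere
  # average it from the powered-on VMs.
  ha_admission_control_slot_policy_use_explicit_size = true
  ha_admission_control_slot_policy_explicit_cpu      = 500  # MHz
  ha_admission_control_slot_policy_explicit_memory   = 1024 # MB
}
```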
Admission Control settings for dedicated failover host mode
The following settings control specific settings for Admission Control when failoverHosts is selected in ha_admission_control_policy.
- ha_admission_control_failover_host_system_ids - (Optional) Defines the managed object IDs of hosts to use as dedicated failover hosts. These hosts are kept as available as possible - admission control will block access to the host, and DRS will ignore the host when making recommendations.
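Dedicated failover host mode can be sketched as follows, reusing the host lookup pattern from Example Usage. Picking the third host (index 2) as the failover host is an arbitrary choice for illustration.

```hcl
variable "hosts" {
  default = [
    "esxi1",
    "esxi2",
    "esxi3",
  ]
}

data "vsphere_datacenter" "dc" {
  name = "dc1"
}

data "vsphere_host" "hosts" {
  count         = "${length(var.hosts)}"
  name          = "${var.hosts[count.index]}"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_compute_cluster" "failover_cluster" {
  name            = "terraform-failover-cluster-test"
  datacenter_id   = "${data.vsphere_datacenter.dc.id}"
  host_system_ids = ["${data.vsphere_host.hosts.*.id}"]

  ha_enabled                  = true
  ha_admission_control_policy = "failoverHosts"

  # Reserve esxi3 (index 2 in var.hosts) as the dedicated failover host.
  ha_admission_control_failover_host_system_ids = [
    "${element(data.vsphere_host.hosts.*.id, 2)}",
  ]
}
```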
vSphere HA datastore settings
vSphere HA uses datastore heartbeating to determine the health of a particular host. Depending on how your datastores are configured, the settings below may need to be altered to ensure that specific datastores are used over others.
If you require a user-defined list of datastores, ensure you select either userSelectedDs (for user selected only) or allFeasibleDsWithUserPreference (for automatic selection with preferred overrides) for the ha_heartbeat_datastore_policy setting.
- ha_heartbeat_datastore_policy - (Optional) The selection policy for HA heartbeat datastores. Can be one of allFeasibleDs, userSelectedDs, or allFeasibleDsWithUserPreference. Default: allFeasibleDsWithUserPreference.
- ha_heartbeat_datastore_ids - (Optional) The list of managed object IDs for preferred datastores to use for HA heartbeating. This setting is only useful when ha_heartbeat_datastore_policy is set to either userSelectedDs or allFeasibleDsWithUserPreference.
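To illustrate a user-defined heartbeat datastore list, the sketch below looks up two datastores with the vsphere_datastore data source and restricts heartbeating to them. The datastore names are hypothetical placeholders.

```hcl
data "vsphere_datacenter" "dc" {
  name = "dc1"
}

# Hypothetical datastore names; substitute datastores shared by the hosts.
data "vsphere_datastore" "heartbeat1" {
  name          = "datastore1"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_datastore" "heartbeat2" {
  name          = "datastore2"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_compute_cluster" "heartbeat_cluster" {
  name          = "terraform-heartbeat-cluster-test"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"

  ha_enabled = true

  # Use only the datastores listed below for heartbeating.
  ha_heartbeat_datastore_policy = "userSelectedDs"

  ha_heartbeat_datastore_ids = [
    "${data.vsphere_datastore.heartbeat1.id}",
    "${data.vsphere_datastore.heartbeat2.id}",
  ]
}
```

Switching the policy to allFeasibleDsWithUserPreference would treat the same list as a preference rather than a strict selection.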
Proactive HA settings
The following settings pertain to Proactive HA, an advanced feature of vSphere HA that allows the cluster to get data from external providers and make decisions based on the data reported.
Working with Proactive HA is outside the scope of this document. For more details, see the referenced link in the above paragraph.
- proactive_ha_enabled - (Optional) Enables Proactive HA. Default: false. *
- proactive_ha_automation_level - (Optional) Determines how the host quarantine, maintenance mode, or virtual machine migration recommendations made by proactive HA are to be handled. Can be one of Automated or Manual. Default: Manual. *
- proactive_ha_moderate_remediation - (Optional) The configured remediation for moderately degraded hosts. Can be one of MaintenanceMode or QuarantineMode. Note that this cannot be set to MaintenanceMode when proactive_ha_severe_remediation is set to QuarantineMode. Default: QuarantineMode. *
- proactive_ha_severe_remediation - (Optional) The configured remediation for severely degraded hosts. Can be one of MaintenanceMode or QuarantineMode. Note that this cannot be set to QuarantineMode when proactive_ha_moderate_remediation is set to MaintenanceMode. Default: QuarantineMode. *
- proactive_ha_provider_ids - (Optional) The list of IDs for health update providers configured for this cluster. *
Attribute Reference
The following attributes are exported:
- id: The managed object ID of the cluster.
- resource_pool_id: The managed object ID of the primary resource pool for this cluster. This can be passed directly to the resource_pool_id attribute of the vsphere_virtual_machine resource.
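As a sketch of how resource_pool_id might be consumed, the following places a virtual machine in the cluster's primary resource pool. It assumes the vsphere_compute_cluster.compute_cluster resource and data.vsphere_datacenter.dc data source from Example Usage are defined in the same configuration. The datastore name, network name, and VM sizing are hypothetical.

```hcl
# Hypothetical datastore and network names.
data "vsphere_datastore" "datastore" {
  name          = "datastore1"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

data "vsphere_network" "network" {
  name          = "VM Network"
  datacenter_id = "${data.vsphere_datacenter.dc.id}"
}

resource "vsphere_virtual_machine" "vm" {
  name = "terraform-test"

  # The cluster's exported attribute is passed straight through.
  resource_pool_id = "${vsphere_compute_cluster.compute_cluster.resource_pool_id}"
  datastore_id     = "${data.vsphere_datastore.datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "other3xLinux64Guest"

  network_interface {
    network_id = "${data.vsphere_network.network.id}"
  }

  disk {
    label = "disk0"
    size  = 20
  }
}
```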
Importing
An existing cluster can be imported into this resource using the inventory path to the cluster, with the following command:
terraform import vsphere_compute_cluster.compute_cluster /dc1/host/compute-cluster
The above would import the cluster named compute-cluster that is located in the dc1 datacenter.
vSphere Version Requirements
A large number of settings in the vsphere_compute_cluster resource require a specific version of vSphere to function. Rather than include warnings at every setting or section, these settings are documented below. Note that this list is for cluster-specific attributes only, and does not include the tags parameter, which requires vSphere 6.0 or higher across all resources that can be tagged.
All settings are footnoted by an asterisk (*) in their specific section in the documentation, which takes you here.
Settings that require vSphere version 6.0 or higher
These settings require vSphere 6.0 or higher:
- ha_datastore_apd_recovery_action
- ha_datastore_apd_response
- ha_datastore_apd_response_delay
- ha_datastore_pdl_response
- ha_vm_component_protection
Settings that require vSphere version 6.5 or higher
These settings require vSphere 6.5 or higher:
- drs_enable_predictive_drs
- ha_admission_control_host_failure_tolerance (When ha_admission_control_policy is set to resourcePercentage or slotPolicy. Permitted in all versions under failoverHosts)
- ha_admission_control_resource_percentage_auto_compute
- ha_vm_restart_timeout
- ha_vm_dependency_restart_condition
- ha_vm_restart_additional_delay
- proactive_ha_automation_level
- proactive_ha_enabled
- proactive_ha_moderate_remediation
- proactive_ha_provider_ids
- proactive_ha_severe_remediation
© 2018 HashiCorp. Licensed under the MPL 2.0 License.
https://www.terraform.io/docs/providers/vsphere/r/compute_cluster.html