Nutanix VM High Availability (HA)
Virtual Machine HA (VM HA) implements high availability at the hypervisor level by replicating and restarting full virtual machines.
If a host fails, the VMs that were running on it are restarted on other nodes in the cluster.
High Availability Types in Nutanix AHV
- 1. Best Effort: No node or memory is reserved in the cluster. If a host fails, its virtual machines are restarted on other nodes according to the resources/memory available on those nodes. This is not the preferred mode because, if the remaining nodes do not have enough free resources, some of the virtual machines may not power on.
To change the HA configuration of a cluster from the command line, use acli.
nutanix@NTNX--A--CVM:~$ acli ha.update num_host_failures_to_tolerate=0
To check the HA state, use the command below.
nutanix@NTNX--A--CVM:~$ acli ha.get
config {
  failover_enabled: True
  ha_state: "kAcropolisHABestEffort"
  logical_timestamp: 4
  num_host_failures_to_tolerate: 0
  num_remaining_host_failures_to_tolerate: 0
  reservation_type: "kAcropolisHANoReserveReservations"
}
- 2. Reserved Host: A full node is reserved for VM HA in case any node in the cluster fails. During regular operation of the cluster, virtual machines cannot be run, powered on, or migrated to the reserved node. This mode only works if all the nodes in the cluster have the same amount of memory.
nutanix@NTNX--A--CVM:~$ acli ha.update reservation_type=kAcropolisHAReserveHosts num_host_failures_to_tolerate=1
To check the HA state, use the command below.
nutanix@NTNX--A--CVM:~$ acli ha.get
config {
  failover_enabled: True
  ha_state: "kAcropolisHAHighlyAvailable"
  logical_timestamp: 30
  num_host_failures_to_tolerate: 1
  num_remaining_host_failures_to_tolerate: 1
  reservation_type: "kAcropolisHAReserveHosts"
  reserved_host_uuids: "210032-e453-456f-8953f-344566895"
}
- 3. Reserved Segments: Some memory is reserved on each node in the cluster for failover of virtual machines from a failed node. The acropolis service calculates how much memory to reserve based on the memory configuration of the virtual machines. All nodes remain schedulable, and their resources stay available for running VMs.
nutanix@NTNX--A--CVM:~$ acli ha.update reservation_type=kAcropolisHAReserveSegments num_host_failures_to_tolerate=1
To check the HA state, use the command below.
nutanix@NTNX--A--CVM:~$ acli ha.get
config {
  failover_enabled: True
  ha_state: "kAcropolisHAHighlyAvailable"
  logical_timestamp: 29
  num_host_failures_to_tolerate: 1
  num_remaining_host_failures_to_tolerate: 1
  reservation_type: "kAcropolisHAReserveSegments"
}
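The ha.get output shown for each mode is plain text, so it can be checked from a script when several clusters have to be audited. The following is a minimal Python sketch, not a Nutanix tool: the function name parse_ha_config is hypothetical, and it assumes the output keeps the flat "key: value" layout printed in this article. A field that appears more than once keeps only its last value.

```python
import re


def parse_ha_config(text):
    """Parse text printed by `acli ha.get` into a dict (hypothetical helper).

    Assumes the flat `key: value` layout shown in this article.
    """
    # Skip anything before the config block, such as a shell prompt line.
    start = text.find("config {")
    if start != -1:
        text = text[start:]

    config = {}
    # A value is either a double-quoted string or a bare token (True, 4, ...).
    for key, quoted, bare in re.findall(r'(\w+):\s*(?:"([^"]*)"|(\S+))', text):
        if quoted or not bare:
            value = quoted
        elif bare in ("True", "False"):
            value = bare == "True"
        elif bare.isdigit():
            value = int(bare)
        else:
            value = bare
        config[key] = value
    return config


sample = """config {
  failover_enabled: True
  ha_state: "kAcropolisHAHighlyAvailable"
  logical_timestamp: 29
  num_host_failures_to_tolerate: 1
  num_remaining_host_failures_to_tolerate: 1
  reservation_type: "kAcropolisHAReserveSegments"
}"""

cfg = parse_ha_config(sample)
print(cfg["ha_state"], cfg["num_remaining_host_failures_to_tolerate"])
```

The text passed in would typically be captured by running the command over SSH on a CVM; how it is collected is left out here.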
If a node fails, the number of remaining host failures that can be tolerated drops to 0:
config {
  failover_enabled: True
  ha_state: "kAcropolisHABestEffort"    <--- state changed from kAcropolisHAHighlyAvailable to kAcropolisHABestEffort
  logical_timestamp: 480
  num_host_failures_to_tolerate: 1
  num_remaining_host_failures_to_tolerate: 0    <--- no more host failures can be tolerated
  reservation_type: "kAcropolisHAReserveSegments"
}
Once the node is back online, the state changes to kAcropolisHAHealing:
config {
  failover_enabled: True
  ha_state: "kAcropolisHAHealing"    <--- state changed to kAcropolisHAHealing
  logical_timestamp: 480
  num_host_failures_to_tolerate: 1
  num_remaining_host_failures_to_tolerate: 0    <--- no more host failures can be tolerated
  reservation_type: "kAcropolisHAReserveSegments"
}
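The state transitions above can be condensed into a small status check. This Python sketch is illustrative only: describe_ha is a hypothetical helper, it expects a dict keyed by the field names that ha.get prints, and it covers only the three ha_state values that appear in this article.

```python
def describe_ha(config):
    """Summarise an HA config dict keyed by the field names `acli ha.get` prints."""
    state = config.get("ha_state", "")
    wanted = config.get("num_host_failures_to_tolerate", 0)
    remaining = config.get("num_remaining_host_failures_to_tolerate", 0)

    if state == "kAcropolisHAHighlyAvailable":
        return f"protected: {remaining} of {wanted} host failure(s) can still be tolerated"
    if state == "kAcropolisHAHealing":
        return f"healing: {remaining} of {wanted} host failure(s) can currently be tolerated"
    if state == "kAcropolisHABestEffort":
        if wanted > 0:
            # A reservation is configured, but the cluster has fallen back to best effort.
            return f"degraded: configured for {wanted} host failure(s), {remaining} remaining"
        return "best effort: no resources are reserved"
    return f"unknown state: {state!r}"


after_failure = {
    "ha_state": "kAcropolisHABestEffort",
    "num_host_failures_to_tolerate": 1,
    "num_remaining_host_failures_to_tolerate": 0,
}
print(describe_ha(after_failure))
```

Comparing num_host_failures_to_tolerate with num_remaining_host_failures_to_tolerate is what separates a cluster deliberately configured for best effort from one that has temporarily lost its protection.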
If you see one of the following errors when powering on a VM:
"Operation failed: kNoHostResources: No host has enough available memory."
"Could not reserve enough space to protect this VM against node failure."
power on the VM with migrations enabled:
nutanix@NTNX--A--CVM:~$ acli vm.on <vm_name> enable_migrations=true