Cluster Connectivity Status Nutanix Critical Alert

Alert Title

  • Source :- EntityCluster
  • Severity :- Critical
  • Created Time :- 07/01/20, 02:54:26 PM
  • Last Occurred :- 07/01/20, 03:04:26 PM
  • Impact Type :- System Indicator
  • Policy :- Cluster Connectivity Status

Short description :-

IDF data from cluster {name of the cluster} is not up-to-date.

Reason :-

Cluster network connectivity or CVM services such as insights server, insights uploader, insights receiver, Aplos, or Prism gateway could be down.

Solution :-

NCC check :-

Run the NCC checks on Prism Central:

nutanix@NTNX-192.168.4.X-A-PCVM:$ ncc health_checks run_all

When this NCC check runs from the PC (Prism Central) cluster, the PC cluster is the source and the PE (Prism Element) cluster is the destination.

pc_pe_time_drift_check

This check returns a FAIL status when the heartbeat sync time between the database on the source cluster and the database on the replica cluster crosses a predefined threshold value (600 seconds by default).

In this case, the check reported a time drift of 610 seconds between this Prism Central and the Prism Element cluster, which is above the 600-second threshold.

The impact is that the data replicated from the source to the replica cluster is not up to date. This can happen when some services are not working as expected.
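The threshold comparison described above can be illustrated with a short Python sketch. It is not the NCC implementation, whose internals are not shown here. The function names and both timestamps are hypothetical, chosen only so that the gap works out to the 610 seconds mentioned above. The 600-second default comes from the check description.

```python
from datetime import datetime

# Default threshold quoted in the check description (600 seconds).
DRIFT_THRESHOLD_SECONDS = 600


def sync_drift_seconds(source_sync: datetime, replica_sync: datetime) -> float:
    """Return the absolute gap, in seconds, between two heartbeat sync times."""
    return abs((source_sync - replica_sync).total_seconds())


def drift_check_status(source_sync: datetime,
                       replica_sync: datetime,
                       threshold: float = DRIFT_THRESHOLD_SECONDS) -> str:
    """Report FAIL when the gap crosses the threshold, otherwise PASS."""
    drift = sync_drift_seconds(source_sync, replica_sync)
    return "FAIL" if drift > threshold else "PASS"


if __name__ == "__main__":
    # Hypothetical sync times that are 610 seconds apart.
    replica = datetime(2020, 7, 1, 14, 54, 16)
    source = datetime(2020, 7, 1, 15, 4, 26)
    print(sync_drift_seconds(source, replica), drift_check_status(source, replica))
```

With a gap of 610 seconds the sketch reports FAIL, and any gap of 600 seconds or less reports PASS under the assumption that "crosses" means strictly greater than the threshold.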

To resolve this issue:

Run the following command to check connectivity between Prism Central and Prism Element:

nutanix@NTNX-192.168.4.X-A-PCVM:$ ncli multicluster get-cluster-state

The output shows Remote Connection Exists : true, which means the connection between Prism Central and Prism Element is fine.
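If this check needs to be scripted, the relevant field can be pulled out of the saved command output. The Python sketch below assumes the output is made of "Key : Value" lines, as suggested by the "Remote Connection Exists : true" line quoted above. The full output format is not shown in this article, so the sample text and the function names are hypothetical.

```python
def parse_cluster_state(text: str) -> dict:
    """Split "Key : Value" lines into a dictionary, ignoring other lines."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def remote_connection_exists(text: str) -> bool:
    """True only when the field is present and set to 'true'."""
    value = parse_cluster_state(text).get("Remote Connection Exists", "")
    return value.lower() == "true"


if __name__ == "__main__":
    # Hypothetical sample containing only the line quoted in this article.
    sample = "    Remote Connection Exists : true\n"
    print(remote_connection_exists(sample))
```

A missing field is treated as "no connection", which is the conservative choice for an alerting script.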

Next, compare the time on the CVM cluster with the time on Prism Central.

The following command checks network and port connectivity from every CVM to Prism Central on port 9440:

nutanix@NTNX-192.168.4.X-A-CVM:$ allssh 'echo \$ | nc -tv <PC_IP> 9440'
================== CVMA =================
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to <PC_IP>:9440.
Ncat: 2 bytes sent, 0 bytes received in 0.04 seconds.
================== CVMB =================
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to <PC_IP>:9440.
Ncat: 2 bytes sent, 0 bytes received in 0.04 seconds.
================== CVMC =================
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to <PC_IP>:9440.
Ncat: 2 bytes sent, 0 bytes received in 0.04 seconds.

The output shows that every CVM can connect to Prism Central on port 9440, so the network and port connectivity are fine.

Restore the connection by running the following command from Prism Element:
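Where nc is not available, the same TCP reachability test can be approximated with Python's standard socket module. This is a minimal sketch and not a Nutanix tool: the function name is hypothetical, and the host to pass in is whatever stands in for <PC_IP> in your environment. Port 9440 is the port used in the command above.

```python
import socket


def port_is_open(host: str, port: int = 9440, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call (replace the placeholder with the Prism Central address):
# port_is_open("<PC_IP>")
```

Like the nc test, this only confirms that the port accepts TCP connections. It does not verify that the service behind the port is healthy.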

nutanix@NTNX-192.168.4.X-A-CVM:$ nuclei remote_connection.reset_pe_pc_remoteconnection

The IDF DB resyncs automatically once the connectivity issues between the PE cluster and the PC cluster are resolved.
