1 DIMM RAS event found for P1-DIMMD1 on host 10.X.X.X in last 24 hours. Threshold : 1. Installed BIOS version is PU42.300{Error}


Issue :-

A DIMM is failing. The alert is shown in Prism Element.

Solution :-

If the host is already running the latest BMC and BIOS (PU42.300), the steps to resolve this issue are as follows.

  1. Put the affected host into maintenance mode.

  2. Shut down the CVM and reboot the host so that PPR (Post Package Repair) is performed automatically. This can be done without downtime on the cluster.

  3. Confirm in the IPMI SEL log that PPR was successful. Download the log and share it. If PPR was not successful, we will replace the DIMM indicated in the alert. If a RAS event is raised within a 3-month timeframe on the same DIMM that has already had 1 PPR cycle performed against it, we will replace the DIMM.
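The replacement rule in step 3 can be sketched as a small decision function. This is only an illustration of the rule as stated above, not a Nutanix tool: the function name, its parameters, and the choice to treat "3 months" as 90 days are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the "3-month timeframe" is approximated as 90 days.
PPR_WINDOW = timedelta(days=90)


def should_replace_dimm(
    ppr_successful: bool,
    previous_ppr_date: Optional[date],
    event_date: date,
) -> bool:
    """Hypothetical helper: return True when the DIMM should be replaced.

    ppr_successful    -- result of the PPR run for the current event (from the SEL log)
    previous_ppr_date -- date of an earlier PPR cycle on the same DIMM, or None
    event_date        -- date of the current RAS event
    """
    # Rule 2: the DIMM already had one PPR cycle and raised a new
    # RAS event inside the window, so it is replaced.
    if previous_ppr_date is not None:
        elapsed = event_date - previous_ppr_date
        if timedelta(0) <= elapsed <= PPR_WINDOW:
            return True
    # Rule 1: PPR for the current event did not succeed.
    return not ppr_successful
```

For example, a DIMM whose first PPR succeeded and that has no earlier PPR on record is kept, while the same DIMM raising a new event a month after a PPR cycle is replaced.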

  1. If the BIOS version is P[X]42.300 or newer, reboot the node (host) to automatically trigger DIMM repair.

  2. If the BIOS version is P[X]42.300 or newer and CECC errors are detected afterwards, contact Nutanix Support. If BIOS version is P[X]41.002 & P[X]42.002.
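As a rough sketch, the "P[X]42.300 or newer" check can be expressed as a comparison of the numeric part of the BIOS string. The assumptions here are that the version always has the form P, one letter, then two dot-separated numbers (as in PU42.300), and that "newer" means a numerically higher pair; the function names are made up for the example.

```python
import re
from typing import Tuple

# Minimum BIOS level named above for automatic DIMM repair on reboot.
MIN_PPR_BIOS = (42, 300)


def parse_bios_version(version: str) -> Tuple[int, int]:
    """Split a string such as 'PU42.300' into (42, 300).

    Assumption: format is 'P' + one platform letter + digits '.' digits.
    """
    match = re.fullmatch(r"P[A-Z](\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised BIOS version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_auto_ppr(version: str) -> bool:
    """True when the BIOS is at or above the P[X]42.300 level."""
    return parse_bios_version(version) >= MIN_PPR_BIOS


if __name__ == "__main__":
    for bios in ("PU42.300", "PU41.002", "PU42.002"):
        print(bios, supports_auto_ppr(bios))
```

The tuple comparison checks the first number, then the second, so 42.002 is treated as older than 42.300.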

Nutanix IPMI login

Once PPR (Post Package Repair) has completed, the result is visible in the IPMI log.

Memory Error repair

You can access

To find the IPMI address from the CLI or in Prism Element:

  • SSH (for example with PuTTY) to one of the CVMs.
  • Run ipmiips from the CVM console to get a list of the IPMI addresses.
  • Enter one of the IPs returned by the command in a browser to open the IPMI page for that host.
  • The default login is username ADMIN and password ADMIN.
  • Using the Prism Element console:
  • Open the Prism Element console in a browser using the virtual IP or any CVM IP address on port 9440.
  • Go to the Hardware tab.
  • Select the node whose IPMI IP you want to see.
  • Click the IP; it opens in a new tab.
  • Access the IPMI console.
