MCE memory errors on ovirt-srv07

Description

ovirt-srv07 reports faulty memory. We need to contact the vendor to replace the faulty DIMM, as the server is under warranty.

[13637.683809] mce: [Hardware Error]: Machine check events logged
[13637.683840] EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
[13637.683845] EDAC sbridge MC1: CPU 1: Machine Check Event: 0 Bank 9: 8c00004a000800c0
[13637.683848] EDAC sbridge MC1: TSC 0
[13637.683850] EDAC sbridge MC1: ADDR 13e37f9000
[13637.683856] EDAC sbridge MC1: MISC 90004000400148c
[13637.683859] EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1488373079 SOCKET 1 APIC 20
[13637.800540] EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x13e37f9 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
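
For anyone triaging similar logs, here is a minimal Python sketch that pulls the DIMM location out of the EDAC line and decodes a few bits of the machine check status word. The helper names (parse_ce_line, decode_mci_status) are hypothetical and not part of any existing tool. The bit positions are assumed from the architectural IA32_MCi_STATUS layout in the Intel SDM, and the regular expression is only known to match the line format shown in this ticket.

```python
import re

# Sample lines copied from the kernel log in this ticket.
MCE_LINE = ("[13637.683845] EDAC sbridge MC1: CPU 1: Machine Check Event: "
            "0 Bank 9: 8c00004a000800c0")
CE_LINE = ("[13637.800540] EDAC MC1: 1 CE memory scrubbing error on "
           "CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x13e37f9 "
           "offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 "
           "socket:1 ha:0 channel_mask:1 rank:0)")


def parse_ce_line(line):
    """Extract the DIMM label and key:value fields from an EDAC CE/UE line.

    Hypothetical helper; returns None if the line does not match.
    """
    m = re.search(r"EDAC (MC\d+): (\d+) (CE|UE) .* on (\S+) \((.*)\)", line)
    if m is None:
        return None
    # The parenthesised tail is a space-separated list of key:value pairs.
    fields = dict(re.findall(r"(\w+):(\S+)", m.group(5)))
    return {
        "controller": m.group(1),
        "count": int(m.group(2)),
        "severity": m.group(3),  # CE = corrected, UE = uncorrected
        "label": m.group(4),
        **fields,
    }


def decode_mci_status(status):
    """Decode selected IA32_MCi_STATUS bits (layout assumed from the Intel SDM)."""
    return {
        "valid": bool(status >> 63 & 1),
        "overflow": bool(status >> 62 & 1),
        "uncorrected": bool(status >> 61 & 1),
        "misc_valid": bool(status >> 59 & 1),
        "addr_valid": bool(status >> 58 & 1),
        "corrected_count": status >> 38 & 0x7FFF,
        "model_specific_code": status >> 16 & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    info = parse_ce_line(CE_LINE)
    print(info["label"], "socket", info["socket"],
          "channel", info["channel"], "slot", info["slot"])

    # The status word is the last whitespace-separated token on the MCE line.
    status = int(MCE_LINE.rsplit(" ", 1)[1], 16)
    print(decode_mci_status(status))
```

Under these assumptions, the status word decodes as a valid, corrected (not uncorrected) error with a corrected count of 1, and its low 32 bits match the err_code:0008:00c0 field in the EDAC line. The affected module is the one labelled CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (socket 1, channel 0, slot 0).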

Activity

Former user March 16, 2017 at 4:10 PM

DIMM replaced successfully

Former user March 15, 2017 at 9:25 AM

Replacement DIMM is ready, server shut down for maintenance.

Former user March 7, 2017 at 6:20 PM

Replacement DIMM ordered, waiting for it to arrive for replacement. Related tickets logged with server operations.

Former user March 7, 2017 at 3:39 PM

Server is out of warranty along with a bunch of others. Opened https://ovirt-jira.atlassian.net/browse/OVIRT-1233#icft=OVIRT-1233 to discuss a solution.

Fixed

Details

Created March 3, 2017 at 10:33 AM
Updated April 2, 2017 at 12:51 PM
Resolved March 16, 2017 at 4:10 PM
