
kernel: EDAC MC0: UE page 0x0, offset 0x0, grain 0, row 7, labels ":": i3200 UE

All of a sudden today, our CentOS release 6.4 (Final) system started throwing EDAC errors. I rebooted, and the errors stopped.

I have been searching for answers, but they fall into two camps: memory or chipset. I would like some advice on where to look next to narrow this down to one or the other.

Gilles 'SO- stop being evil'
octopusgrabbus

1 Answer


What you're experiencing is an EDAC (Error Detection and Correction) event. The MC0 in the message means the error was reported by memory controller 0, so you're looking at a memory error, and UE marks it as an uncorrectable error. The rest of the message tells you where the error occurred: the page, the offset, and the row (the chip-select row, which maps to a particular DIMM or DIMM rank).
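If you want to pull those fields out of the log programmatically, here is a minimal sketch. It assumes the exact line format quoted in the question; other kernel versions and EDAC drivers word their messages differently, so treat the regular expression and the `parse_edac_line` helper as illustrative rather than general.

```python
import re

# Matches only the message layout quoted in the question (an assumption;
# other kernels/drivers add or reorder fields).
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<kind>UE|CE) "
    r"page (?P<page>0x[0-9a-fA-F]+), offset (?P<offset>0x[0-9a-fA-F]+), "
    r"grain (?P<grain>\d+), row (?P<row>\d+)"
)


def parse_edac_line(line):
    """Return the EDAC fields as a dict, or None if the line does not match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    return {
        "mc": int(m.group("mc")),            # memory controller index
        "kind": m.group("kind"),             # UE = uncorrectable, CE = correctable
        "page": int(m.group("page"), 16),
        "offset": int(m.group("offset"), 16),
        "grain": int(m.group("grain")),
        "row": int(m.group("row")),          # chip-select row (csrow)
    }


line = 'kernel: EDAC MC0: UE page 0x0, offset 0x0, grain 0, row 7, labels ":": i3200 UE'
print(parse_edac_line(line))
```

For the line in the question this reports memory controller 0, an uncorrectable error, and row 7, with page, offset and grain all zero.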

Given you're getting just one, I would continue to monitor it but do nothing for the time being. If it continues then you most likely are experiencing a failing memory module.
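One way to keep an eye on it is to read the per-controller error counters that the EDAC subsystem exposes in sysfs. The sketch below assumes the EDAC driver is loaded and that the counters live under /sys/devices/system/edac/mc; the `read_edac_counts` helper name is my own, and the function simply returns an empty result if that directory is absent.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}."""
    counts = {}
    base_path = Path(base)
    if not base_path.is_dir():
        # EDAC driver not loaded, or a different sysfs layout.
        return counts
    for mc_dir in sorted(base_path.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc_dir / name).read_text().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable; record that rather than guess.
                entry[name] = None
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

Running this from cron and comparing the numbers over time shows whether the error was a one-off or the counts keep climbing. Note that the counters reset on reboot.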

You could also try to test it more thoroughly using memtest86+.

This previous question, titled "How to blacklist a correct bad RAM sector according to MemTest86+ error indication?", shows how to blacklist the memory if you're interested in that as well.

slm
  • For completeness, note that there are interactions between BIOS bugs and the kernel in this area which may lead to spurious results on i32xx chipsets: https://bugzilla.redhat.com/show_bug.cgi?id=564274 – Adrian Cox Jul 24 '14 at 09:32