Questions tagged [mcelog]

11 questions
7
votes
4 answers

Random restarts caused by a machine check exception

My laptop restarts randomly about twice a day. It shows the following error log before the restart. Unfortunately, I have no idea how to decode the Machine Check Exception (MCE). mcelog --ascii outputs nothing. Is there a chance that this is…
fhucho
  • 415
  • 2
  • 5
  • 10
5
votes
1 answer

Running `mcelog` on an AMD processor

When I run mcelog (version 154), I get the following output. mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead. CPU is unsupported This to me feels like a category error,…
Matthew Piziak
  • 362
  • 1
  • 3
  • 17
5
votes
2 answers

Understanding Machine Check Exceptions (MCE)

While trying to debug frequent freezes of my new laptop (KabyLake architecture) running Ubuntu 16.04 I've stumbled upon these entries in kern.log: kernel: [ 0.041634] mce: [Hardware Error]: Machine check events logged Since then I have installed…
justfortherec
  • 153
  • 1
  • 8
3
votes
1 answer

Writing triggers for mcelog

Just starting to look into mcelog for the first time (I've enabled it and seen syslog output before, but this is the first time I'm trying to do something non-default). I'm looking for information on how to write triggers for it. Specifically, I'm…
Bratchley
  • 16,684
  • 13
  • 64
  • 103
2
votes
1 answer

Emergency mode on Fedora 21 - mcelog

My Fedora 21 system boots directly into emergency mode with the error message: unable to init device /dev/mcelog (rc -5) It says to try systemctl default to boot into default mode; this works. Then the machine reboots nicely. With the next…
Joel
  • 141
  • 1
  • 6
1
vote
0 answers

What is a bus and interconnect error in MCA?

For Intel machine-check errors, the MCA error code field [15:0] of the IA32_MCi_STATUS register encodes different error types, e.g.: 000F 0000 0000 11LL – Generic cache hierarchy errors. 000F 0000 0001 TTLL – TLB errors. 000F 0000 1MMM CCCC – Memory…
Mark K
  • 779
  • 2
  • 13
  • 33
1
vote
0 answers

Need to call a script whenever an EDAC error is raised by the kernel/system

I need to call a script whenever the system generates EDAC errors. I created the following udev rule for this purpose. If the ce_count changes, I would like to execute /var/tmp/test.sh. Then I ran udevadm control --reload-rules && udevadm trigger…
1
vote
0 answers

How do machine-check exceptions (MCE) work?

I want to understand how the CPU performs these machine checks and how it detects whether a device has failed. Does the device send a machine-check exception (MCE) event that the MCE daemon catches and reports? How does this work internally?
1
vote
0 answers

Mcelog Doesn't Work - openSUSE

I am trying to use mcelog in openSUSE. The problem is that, although it is installed, the command sudo mcelog --client does nothing, there is no /var/log/mcelog, and from systemctl status mcelog I get: ● mcelog.service - Machine Check Exception Logging Daemon …
Adam
  • 269
  • 1
  • 5
  • 16
0
votes
0 answers

CPU#X stuck for 22s!

I am getting various hardware errors on a computer (see photo below; unfortunately, in my experience they do not get written to disk). They occur most often when transferring data over the LAN; I was unable to generate them with…
user454911
0
votes
2 answers

Correct way to use mcelog on the latest versions of Debian

I found some documentation, but it looks a bit outdated: http://www.cyberciti.biz/tips/linux-server-predicting-hardware-failure.html I see that I can specify a device as input, but the documentation does not mention anything specific about that. How…
Reyx_0
  • 911
  • 3
  • 12
  • 23