2

I am running Debian 6.0 Squeeze. Every half hour or so, a dialog pops up at the side of my screen with the message:

Kernel failure. A report has been sent to the developers to help them fix the error. (Not the exact text but that's what it says)

Occasionally, say once a day, it freezes and then reboots completely, causing me to lose all my work.

What is happening?

Note: I used a USB stick and a live USB maker to install the system. Could that be the reason for the problem?

Rui F Ribeiro
Kshitiz Sharma
  • Installing from USB probably isn't the problem. In your case I would try reinstalling with a different distro, and if the problem persists I would check the hardware. Have you checked the logs? – blogger Aug 23 '12 at 10:33
  • @blogger No, I haven't. Where are the logs? And what should I be looking for? – Kshitiz Sharma Aug 23 '12 at 11:04
  • The most important log to look at for this kind of problem is /var/log/kern.log. Unfortunately, if the system freezes, the triggering event may not get logged, or the log entries may not be written to disk. Remote syslogging to another machine on the network, a serial terminal, a line printer, or even a camera can be useful for capturing the log information. As for what to look for, start with any error messages and any mentions of drivers or hardware. – cas Aug 23 '12 at 11:58
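As a rough illustration of the "what to look for" advice, here is a minimal Python sketch that filters a kernel log for lines that often accompany kernel trouble. The keyword list and the function names `suspicious_lines` and `scan_log` are assumptions made for this example, not an official or exhaustive set; adjust them to whatever actually appears in your log.

```python
import re

# Assumed keywords, chosen for illustration only; extend or trim as needed.
SUSPICIOUS = re.compile(
    r"oops|bug:|panic|machine check|i/o error|call trace|segfault",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the log lines that match any of the assumed keywords."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]


def scan_log(path="/var/log/kern.log"):
    """Scan one log file; return an empty list if it cannot be read."""
    try:
        with open(path, errors="replace") as log:
            return suspicious_lines(log)
    except OSError as exc:
        print(f"could not read {path}: {exc}")
        return []


if __name__ == "__main__":
    for line in scan_log():
        print(line)
```

Reading /var/log/kern.log may require root or membership in a log-reading group, depending on how the system is set up. As noted above, a hard freeze may leave nothing in the file at all, so an empty result does not rule out a kernel problem.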

4 Answers

8

I have noticed (though I'm not sure whether it holds in general) that Linux seems to be more sensitive to failing hardware. I have seen this on my home office computer a couple of times. Your best bet is to start running hardware diagnostics.

For that I would recommend Ultimate Boot CD. In your case, I would start by running Memtest (for at least an hour), followed by a hard drive test (which test to use will depend on the brand of your hard drive). I would bet a lot of money that one of those two will turn up something defective, and my money would be on the memory.

  • @AlanPhilips I have my system in dual boot with Windows 7 and it works just fine. So I doubt hardware would be the culprit. But I'll try that anyway. – Kshitiz Sharma Aug 23 '12 at 11:31
  • I definitely would. Mine was dual booting XP and Squeeze, and turned out to have a bad stick of memory. Usually in a memtest, bad memory shows up after a few seconds, but this one took 35 minutes to show up. That's why I suggested running it for at least an hour. Good luck! – Alan Phillips Aug 23 '12 at 11:38
  • Even an hour may not be enough. I'd suggest leaving memtest running at least overnight. – cas Aug 23 '12 at 11:51
  • @CraigSanders I would agree. – Alan Phillips Aug 23 '12 at 11:52
3

Some possibilities:

  • As Alan suggested, bad memory is a common cause of problems.
  • Bad power supplies can also cause random freezes and crashes.
  • A low-quality motherboard, either due to shoddy manufacturing or due to bad/dodgy parts (e.g. a sub-standard or cheap version of a NIC that claims to be a particular brand/model but isn't: the manufacturer's Windows driver may compensate for its inadequacies, but the Linux driver believes it is an XYZ device because that's what it claims to be).
  • Ditto for expansion cards.

Are there any common patterns to the crashes? For example:

  • does it happen more often when you do certain things or run particular programs (if so, what are they?)
  • or after you've visited certain websites (e.g. badly written javascript code can leak memory like a sieve)
  • or at certain times of day (when?)
  • or when other equipment is being operated nearby (e.g. a fridge motor turning on - a good UPS can protect against transient voltage fluctuations).
cas
0

If that dialog pops up and the system is still responding, you can run dmesg from a terminal to see the kernel's messages, which will show the error.
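If you would rather capture the most recent messages from a script, for instance to save them right after the dialog appears, the following Python sketch calls `dmesg` and keeps the last few lines. The function names `read_dmesg` and `last_messages` and the default of 30 lines are arbitrary choices for this example. It assumes `dmesg` is on the PATH and that your user is allowed to read the kernel ring buffer.

```python
import subprocess


def last_messages(dmesg_output, count=30):
    """Return the final `count` non-empty lines of dmesg output."""
    lines = [line for line in dmesg_output.splitlines() if line.strip()]
    return lines[-count:]


def read_dmesg():
    """Run dmesg and return its output, or an empty string if that fails."""
    try:
        result = subprocess.run(
            ["dmesg"],
            capture_output=True,
            text=True,
            errors="replace",
            check=False,
        )
    except OSError:
        # dmesg is missing or could not be started
        return ""
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    for line in last_messages(read_dmesg()):
        print(line)
```

The kernel error that triggered the dialog is usually near the end of the output, which is why the sketch keeps only the tail.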

Jim Paris
0

I've noticed when running Linux from a USB flash drive that on some PCs (mostly older Dells, it seems), after a while, the system decides that the drive is disconnected even though it is still physically connected.

I have an old Inspiron laptop, for example, where, after about a day or so running from a USB stick, this occurs and everything crashes because suddenly Linux can't find its root volume.

I didn't do enough troubleshooting to determine whether it was the flash drive itself (a 4 GB Kingston), the fact that it is a flash drive and not a USB-enclosed hard disk, or something else, but I have seen this on other Dell PCs. I don't know if it's a subtle problem in the chipset that the Linux drivers don't account for, some possible interaction with ACPI, or what.

Several years ago, when I was using an old HP Pavilion as a server, I had issues where USB-attached drives would stop being recognized as connected. Only physically disconnecting and reconnecting them would get Linux to recognize them again. I was using a Belkin USB 2.0 PCI card at the time. I've since placed the same card in a Dell PowerEdge 2500 and have run drives off of it for months with no issues.

You may try partitioning your hard drive, or installing a second hard drive in your system, and running Linux from that.

LawrenceC