4

This isn't the first time I've had this kind of error. I was using my computer (Ubuntu 3.2) when my files became read-only, so I restarted the computer. It didn't finish booting, so I forced it off (hard power off). When I turned it on again, I saw this:

/dev/sda2 contains a file system with errors, check forced.
Inodes that were part of a corrupted orphan linked list found.

/dev/sda2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
          (i.e., without -a or -p options)
fsck exited with status code 4
The root filesystem on /dev/sda2 requires a manual fsck


BusyBox v1.27.2 (Ubuntu 1:1.27.2-2ubuntu3.2) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) 

I ran fsck /dev/sda2 (with -a, because I don't know how to run it without that). I am worried because this is not the first time; maybe the hard disk is failing, I don't know. Maybe this is not the best place for my problem. Can you tell me what to do next? I want to learn, so your suggestions would be very helpful.

P.S.: Sorry for my English

I ran fsck /dev/sda2, and the output was:

fsck from util-linux 2.31.1
e2fsck 1.44.1 (24-Mar-2018)
/dev/sda2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found. Fix<y>?

I answered y and the screen printed:

Inode 18874440 was part of the orphaned inode list. FIXED
Inode 18874445 was part of the orphaned inode list. FIXED
Inode 18874466 was part of the orphaned inode list. FIXED

After about 20 lines like that, the following appeared:

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(2494976--2495023) - (so many numbers)....
Fix<y>? yes
Free blocks count wrong for group #76 (19099, counted=19147).
Fix<y>? yes
Free blocks count wrong for group #81 (30339, counted=30365)
Fix<y>? yes
Free blocks count wrong for group #1577 (28430, counted=28437)
Fix<y>? yes
...

Finally

/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 541701/61022208 files (0.4% non-contiguous),  18674207/244059136 blocks

After this, I ran reboot and the computer worked again. But this will happen again, and I need to stop it.
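
As a reading aid for the boot message "fsck exited with status code 4": the exit status of fsck is a sum of bit flags described in the fsck(8) manual page. The sketch below decodes such a status. The function and table names are made up for illustration, and the wording of each meaning is paraphrased from the manual page rather than copied from any output in this question.

```python
# Exit-status bits as described in the fsck(8) manual page (wording paraphrased).
FSCK_EXIT_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the meaning of every bit that is set in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    # 4 is the status reported in the boot message quoted in the question.
    print(decode_fsck_status(4))
```

Read this way, status 4 only says that the automatic check found errors it would not fix without confirmation, which is consistent with the later manual run asking `Fix<y>?` for each problem.
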

I need a book to investigate this; I want to learn why this happens.


Update: It happened again. While I was using Mozilla Firefox, the files became read-only. Then I restarted the PC, and it didn't finish booting. I will have to restart it and use fsck again.


Output of sudo smartctl -a /dev/sda :

smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.3.0-42-generic] (local build) 
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org 

=== START OF INFORMATION SECTION === 
Device Model:     TOSHIBA MQ04ABF100 
Serial Number:    X8JBP3K2T 
LU WWN Device Id: 5 000039 8d268714d 
Firmware Version: JU001J 
User Capacity:    1.000.204.886.016 bytes [1,00 TB] 
Sector Sizes:     512 bytes logical, 4096 bytes physical 
Rotation Rate:    5400 rpm 
Form Factor:      2.5 inches 
Device is:        Not in smartctl database [for details use: -P showall] 
ATA Version is:   ACS-3 T13/2161-D revision 5 
SATA Version is:  SATA >3.2 (0x1ff), 6.0 Gb/s (current: 6.0 Gb/s) 
Local Time is:    Sun Mar 29 03:48:34 2020 -03 
SMART support is: Available - device has SMART capability. 
SMART support is: Enabled 

=== START OF READ SMART DATA SECTION === 
SMART overall-health self-assessment test result: PASSED 

General SMART Values: 
Offline data collection status:  (0x00) Offline data collection activity 
                    was never started. 
                    Auto Offline Data Collection: Disabled. 
Self-test execution status:      (   0) The previous self-test routine completed 
                    without error or no self-test has ever  
                    been run. 
Total time to complete Offline  
data collection:        (  120) seconds. 
Offline data collection 
capabilities:            (0x5b) SMART execute Offline immediate. 
                    Auto Offline data collection on/off support. 
                    Suspend Offline collection upon new 
                    command. 
                    Offline surface scan supported. 
                    Self-test supported. 
                    No Conveyance Self-test supported. 
                    Selective Self-test supported. 
SMART capabilities:            (0x0003) Saves SMART data before entering 
                    power-saving mode. 
                    Supports SMART auto save timer. 
Error logging capability:        (0x01) Error logging supported. 
                    General Purpose Logging supported. 
Short self-test routine  
recommended polling time:    (   2) minutes. 
Extended self-test routine 
recommended polling time:    ( 172) minutes. 
SCT capabilities:          (0x003d) SCT Status supported. 
                    SCT Error Recovery Control supported. 
                    SCT Feature Control supported. 
                    SCT Data Table supported. 

SMART Attributes Data Structure revision number: 16 
Vendor Specific SMART Attributes with Thresholds: 
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE 
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0 
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0 
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       1348 
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       596 
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0 
  7 Seek_Error_Rate         0x000b   100   095   050    Pre-fail  Always       -       0 
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0 
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1544 
 10 Spin_Retry_Count        0x0033   111   100   030    Pre-fail  Always       -       0 
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       421 
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       1265 
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       20 
193 Load_Cycle_Count        0x0032   096   096   000    Old_age   Always       -       48581 
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       35 (Min/Max 13/45) 
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0 
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0 
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0 
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0 
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0 
222 Loaded_Hours            0x0032   097   097   000    Old_age   Always       -       1346 
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0 
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0 
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       260 
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0 

SMART Error Log Version: 1 
ATA Error Count: 1 
    CR = Command Register [HEX] 
    FR = Features Register [HEX] 
    SC = Sector Count Register [HEX] 
    SN = Sector Number Register [HEX] 
    CL = Cylinder Low Register [HEX] 
    CH = Cylinder High Register [HEX] 
    DH = Device/Head Register [HEX] 
    DC = Device Command Register [HEX] 
    ER = Error register [HEX] 
    ST = Status register [HEX] 
Powered_Up_Time is measured from power on, and printed as 
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, 
SS=sec, and sss=millisec. It "wraps" after 49.710 days. 

Error 1 occurred at disk power-on lifetime: 1096 hours (45 days + 16 hours) 
  When the command that caused the error occurred, the device was active or idle. 

  After command completion occurred, registers were: 
  ER ST SC SN CL CH DH 
  -- -- -- -- -- -- -- 
  04 31 00 02 59 d7 a9 

  Commands leading to the command that caused the error were: 
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name 
  -- -- -- -- -- -- -- --  ----------------  -------------------- 
  ea 00 00 00 00 00 a0 00      05:01:42.830  FLUSH CACHE EXT 
  61 58 30 38 f9 20 40 00      05:01:42.830  WRITE FPDMA QUEUED 
  61 08 80 08 f9 1b 40 00      05:01:42.829  WRITE FPDMA QUEUED 
  61 08 78 a0 b8 52 40 00      05:01:42.788  WRITE FPDMA QUEUED 
  61 30 70 58 0c 50 40 00      05:01:42.788  WRITE FPDMA QUEUED 

SMART Self-test log structure revision number 1 
No self-tests have been logged.  [To run self-tests, use: smartctl -t] 

SMART Selective self-test log data structure revision number 1 
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS 
    1        0        0  Not_testing 
    2        0        0  Not_testing 
    3        0        0  Not_testing 
    4        0        0  Not_testing 
    5        0        0  Not_testing 
Selective self-test flags (0x0): 
  After scanning selected spans, do NOT read-scan remainder of disk. 
If Selective self-test is pending on power-up, resume after 0 minute delay. 
Boris Valderrama
  • 1
    Add the output of `sudo smartctl -a /dev/sda` to your question. You need to install the `smartmontools` package if the command is missing. – Freddy Mar 29 '20 at 04:38
  • @Freddy It is done – Boris Valderrama Mar 29 '20 at 06:55
  • Does this answer your question? [What should I do to force the root filesystem check (and optionally a fix) at boot?](https://unix.stackexchange.com/questions/400851/what-should-i-do-to-force-the-root-filesystem-check-and-optionally-a-fix-at-bo) – Vlastimil Burián Mar 29 '20 at 07:06
  • Apart from the single logged error your disk looks healthy. The problem appears to be somewhere else. – Freddy Mar 29 '20 at 08:06
  • @Freddy I'm not sure `smartmontools` can be 100% relied upon. I had a similar situation in the past where it said everything was fine, but in reality the disk was toast. – Time4Tea Mar 29 '20 at 08:54
  • 3
    El Borito how do you switch off your system when you've finished with it? – roaima Mar 29 '20 at 13:39
  • @roaima You remind me that I turned off the PC again (forced shutdown) because it would not finish turning on. I forgot that detail. But this is not the first time it has happened, and I normally turn off my computer the right way. – Boris Valderrama Mar 30 '20 at 07:56
  • @LinuxSecurityFreak I don't understand how to apply it in this specific case – Boris Valderrama Mar 30 '20 at 07:57
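
To make the "disk looks healthy" reading of the SMART table in the question easier to repeat, here is a minimal sketch that pulls the raw values out of `smartctl -a` text and lists any non-zero counters from a short watch list. The watch list is an assumption (a common rule of thumb for sector trouble, not something stated in this thread), the function names are hypothetical, and the sample rows are copied from the question. As the comments above note, an empty result does not prove the disk is fine.

```python
# A few attribute rows copied from the smartctl output in the question.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1544
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       35 (Min/Max 13/45)
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
"""

# Assumed watch list: counters often checked first when sectors are suspected.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_attributes(text):
    """Map attribute name to its integer RAW_VALUE (first token of that column)."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have the flag in column 3.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            if fields[9].isdigit():
                attrs[fields[1]] = int(fields[9])
    return attrs


def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is greater than zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


if __name__ == "__main__":
    print(nonzero_watched(parse_attributes(SAMPLE)))
```
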

3 Answers

5

My advice: back up your data now, then replace the drive asap. I had a similar issue a couple of years ago with an old Macbook I was running Linux on. The root filesystem would suddenly go read-only and on reboot I had to run fsck every time to fix it. smartmontools indicated that everything was fine. I changed the hard disk and the problem went away immediately.

I'm not saying your problem is necessarily the same, but it sounds very similar to what I experienced. Definitely at least back up your data, as a precaution.

Time4Tea
  • 1
    Not convinced about this at all. Look at the Power On hours - 1544 is only two months and even at 8 hours/day it's only six months. – roaima Mar 29 '20 at 13:42
  • 1
    @roaima I'm not saying this is definitely the problem, but I'm relating an experience I had that seemed very similar. In my experience, `smartmontools` is not 100% reliable and can give false negatives. Besides, manufacturing defects or transportation damage can occur even if the hours of use seem low. – Time4Tea Mar 29 '20 at 14:23
  • @Time4Tea Thanks for your response. At the moment I cannot change my hard disk, and I want to learn about these things because this is not my first PC with problems (the difference is that this PC is "new"). I shall try to back up my data. Thanks :) – Boris Valderrama Mar 30 '20 at 08:07
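
The power-on estimate in the comments above can be checked with a couple of lines of arithmetic. This is only a sketch: 1544 is the raw Power_On_Hours value from the question, and the helper name is hypothetical.

```python
POWER_ON_HOURS = 1544  # raw value of attribute 9 in the question's SMART output


def days_of_use(hours, hours_per_day):
    """Convert a power-on hour count into days at a given daily usage."""
    return hours / hours_per_day


continuous = days_of_use(POWER_ON_HOURS, 24)  # drive powered around the clock
workday = days_of_use(POWER_ON_HOURS, 8)      # drive powered 8 hours per day

if __name__ == "__main__":
    print(f"{continuous:.0f} days running around the clock")
    print(f"{workday:.0f} days at 8 hours per day")
```

That is roughly two months of continuous use, or a little over six months at 8 hours per day, which matches the figures given in the comment.
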
4

The same problem happened to me: on my Ubuntu 19.10 machine, I had to run this command when restarting:

fsck -fy /dev/sda2

I started analyzing the logs in /var/log/syslog and /var/log/kern.log and noticed that, every time just before the disk was corrupted, battery-related messages were logged by the laptop-mode-tools software that I had installed the day before.

So I decided to uninstall that software and, in my case, the disk corruption stopped.
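
The kind of log review described in this answer can be partly automated. The sketch below prints the few lines that precede each filesystem error message in a log file, so that a recurring culprit stands out. The trigger patterns are an assumption (messages the ext4 driver commonly writes when it hits an error or remounts read-only); they are not quoted from this answer's logs, and the function name is hypothetical.

```python
import re
import sys
from collections import deque

# Assumed trigger patterns; adjust them to whatever your own logs show.
TRIGGER = re.compile(r"EXT4-fs error|Remounting filesystem read-only")


def lines_before_trigger(lines, context=5):
    """Yield (trigger_line, preceding_lines) for every matching log line."""
    recent = deque(maxlen=context)  # rolling window of the last few lines
    for line in lines:
        line = line.rstrip("\n")
        if TRIGGER.search(line):
            yield line, list(recent)
        recent.append(line)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation: python3 scan_log.py /var/log/kern.log
    with open(sys.argv[1], errors="replace") as log:
        for trigger, before in lines_before_trigger(log):
            print("\n".join(before))
            print(">>> " + trigger)
            print("-" * 40)
```

Reading the log may require root privileges, and the script name `scan_log.py` is only a placeholder.
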

3

I had the same issue. I had dual-booted Ubuntu 20.04 with Windows 10. On every start of Ubuntu I faced the same issue (fsck on the partition with errors), and after some time I couldn't even log into Ubuntu anymore.

But I could log into Windows, and by analyzing the disk there (using the Hard Disk Sentinel tool) I found that there were a lot of bad sectors on my hard disk (a bad sector means physical damage to the hard disk and can't be recovered, as far as I know).

I replaced my 1 TB HDD with a 240 GB SSD, installed only Ubuntu 21.04 on the new SSD, and the problem is gone. The HDD in my system was 5+ years old, which is about the lifespan of most HDDs (I used it extensively as well, pretty much every day).

Takeaway: First make sure the HDD/SSD is in good health.

  • 1
    Hi, I put my HDD into another (older) computer and the problem seems to be gone. It's pretty weird. One day I'm going to take a look with the Hard Disk Sentinel tool. Thank you – Boris Valderrama Oct 11 '21 at 16:33