I have a computer with two hard drives in it. One carries the OS and a whole bunch of other stuff; the other was mounted under /media as a dump space providing an additional terabyte of storage. Recently, I upgraded the system from Ubuntu Maverick to Debian Jessie, which involved removing a bunch of incompatible packages and installing a bunch more, and could have broken stuff; it's also possible that the hard drive was already dying and decided to give up when I rebooted.
There's nothing utterly crucial on this drive, but I would like to retrieve the data rather than rebuild it; I also prefer to know what went wrong rather than just mask the problem and move on. So what I'm asking for is recommendations on how to debug a strange failure in which a hard drive's file system is no longer recognized. If the question isn't appropriate here, I apologize, and please redirect me to a better place!
Prior to the upgrade, the primary drive was (I believe) /dev/sda, and the secondary was /dev/sdb. Now, they show up as:
$ sudo parted -l
Model: ATA WDC WD1002FAEX-0 (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type      File system     Flags
 1      32.3kB  1000GB  1000GB  primary
Model: ATA ST31000333AS (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type      File system     Flags
 1      1049kB  997GB   997GB   primary   ext4            boot
 2      997GB   1000GB  3143MB  extended
 5      997GB   1000GB  3143MB  logical   linux-swap(v1)
Note that the file systems on /dev/sdb show up correctly (most of the terabyte as a bootable ext4 partition, plus a 3GB swap partition). The single partition on /dev/sda is most likely correct too (though I don't have pre-upgrade partition table dumps to compare against), but no file system is listed for it.
Attempting to fsck /dev/sda1 produces this error:
$ sudo fsck /dev/sda1
fsck from util-linux 2.25.2
e2fsck 1.42.12 (29-Aug-2014)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem. If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
or
e2fsck -b 32768 <device>
Trying a list of likely backup superblock locations produced no helpful results, but I'm wondering whether the partition's start offset in the partition table is wrong. Is there a way to search for a valid superblock?
Also, I'm not 100% certain that this was an ext3/ext4 file system. It's possible that I used a different file system, but I don't know which. Is there any way to explore the partition and figure out what it is using, so that I can install an additional file system driver if needed?
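To make the "wrong offset" idea concrete, here is a minimal sketch of the kind of check I have in mind. It relies on the ext2/ext3/ext4 layout, where the primary superblock sits 1024 bytes into the partition and its two-byte magic number 0xEF53 is stored little-endian at byte 56 of the superblock, i.e. 1080 bytes from the partition start. The function name has_ext_magic and the 512-byte sector size are assumptions of this sketch (the sector size matches what parted reports above), and a hit only shows that the two magic bytes are present, not that the file system is intact.

```shell
#!/bin/sh
# has_ext_magic DEVICE_OR_IMAGE START_SECTOR
# Hypothetical helper: succeeds if the ext2/3/4 magic bytes (53 ef on disk)
# are found where the primary superblock would be, assuming a partition
# that starts at START_SECTOR and 512-byte sectors.
has_ext_magic() {
    dev=$1
    start=$2
    # 1024 bytes to the superblock + 56 bytes to the s_magic field = 1080
    offset=$(( start * 512 + 1080 ))
    magic=$(dd if="$dev" bs=1 skip="$offset" count=2 2>/dev/null \
            | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "53ef" ]
}

# Example (reading a raw disk needs root); 63 is the start sector implied
# by parted's 32.3kB start for /dev/sda1:
#   has_ext_magic /dev/sda 63 && echo "ext magic found at sector 63"
```

The function only reads from the device, so it should be safe to try against several candidate start sectors. For a broader guess at the file system type, file -s and blkid can also be pointed at the partition device.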
Any pointers would be a help. Thanks!
EDIT: doktor5000 suggested I grab testdisk and see what it says.
Disk /dev/sda - 1000 GB / 931 GiB - CHS 121601 255 63
Current partition structure:
Partition Start End Size in sectors
No ext2, JFS, Reiser, cramfs or XFS marker
1 P Linux 0 1 1 121600 254 63 1953520002
1 P Linux 0 1 1 121600 254 63 1953520002
No partition is bootable
Selecting 'Quick Search' produces this:
Disk /dev/sda - 1000 GB / 931 GiB - CHS 121601 255 63
Partition Start End Size in sectors
>* Linux 0 32 33 118619 237 18 1905627136
P Linux Swap 118620 14 51 121601 57 56 47892480
I previously forgot to quote what fdisk said, so here's that:
$ sudo fdisk /dev/sda -l
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000a56a5
Device     Boot Start        End    Sectors   Size Id Type
/dev/sda1          63 1953520064 1953520002 931.5G 83 Linux
So, the discrepancies I'm seeing are: fdisk reckons the partition starts at sector 63, but testdisk says 0 32 33 (which I think is a cylinder/head/sector position rather than a sector number); and testdisk says that there's a swap partition there, which doesn't make sense to me (it's a secondary drive, and I don't know why I'd have allocated any swap space). But, coolness of coolnesses, I can dig into the file system and copy files off! This is awesome compared to what I had to work with the last time I tried disk recovery - but then, that was back in the 1990s using OS/2, so no surprises there :)
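To compare the two tools in the same units, here is a rough sketch that converts testdisk's Start column to a sector number. It assumes the three numbers are cylinder, head and sector, and that the geometry is the 255 heads and 63 sectors per track from the "CHS 121601 255 63" header line; the function name chs_to_lba is made up for this example.

```shell
#!/bin/sh
# chs_to_lba CYLINDER HEAD SECTOR
# Standard CHS-to-LBA conversion; the geometry is an assumption taken from
# the testdisk header "CHS 121601 255 63" (255 heads, 63 sectors per track).
chs_to_lba() {
    c=$1
    h=$2
    s=$3
    heads=255
    spt=63
    # CHS sector numbers start at 1, LBA sector numbers at 0
    echo $(( (c * heads + h) * spt + (s - 1) ))
}

chs_to_lba 0 1 1     # start in the current partition table  -> 63
chs_to_lba 0 32 33   # start found by Quick Search           -> 2048
```

Under those assumptions, the existing table entry matches fdisk's start of 63, while the partition that Quick Search found starts at sector 2048, so the two differ by 1985 sectors.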
I'm confident enough to let testdisk write out a new partition table. And yep! All the data's there and readable, and the drive appears to be working just fine. Many thanks, doktor5000! So, follow-up question... any idea how the partition table could have come to be corrupted?