Executive summary:
- Every tool I've tried confirms lots of inodes in use on this ext4 partition
- Every tool I've tried shows me that there are no files on the partition
- It's not files held open and it's not an overlay mount
Long story:
I have an SSD with a single ext4 partition. The drive was used to continuously store video from cameras in short clips, and a cron job periodically deleted the oldest clips (it ran a C application that deleted them by calling remove()). After a while someone noticed that although there should have been about 5 days' worth of video backed up, there was hardly any, yet the drive was almost full.
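To make the pruning step concrete, here is a minimal sketch of that kind of job. It is written in Python rather than the C of the real application, whose code is not shown here, so the function name prune_oldest and the keep parameter are hypothetical. The one deliberate choice is that failed removals are collected and returned rather than ignored, so a pruning job that stops working is easier to notice.

```python
import os

def prune_oldest(directory, keep):
    """Delete all but the `keep` newest regular files in `directory`.

    Hypothetical helper, not the original C application.
    Returns (deleted, failed): the paths removed, and (path, error)
    pairs for removals that raised OSError.
    """
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                mtime = entry.stat(follow_symlinks=False).st_mtime
                entries.append((mtime, entry.path))

    entries.sort()  # oldest first
    to_delete = entries[:max(0, len(entries) - keep)]

    deleted, failed = [], []
    for _, path in to_delete:
        try:
            os.remove(path)  # counterpart of C's remove() for files
            deleted.append(path)
        except OSError as exc:
            failed.append((path, exc))
    return deleted, failed
```

A cron wrapper could log the failed list and exit non-zero whenever it is not empty.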
I took a look and naively tried just removing lost+found, but the drive was still full. So, I deleted everything (rm -rf *), but df -i tells me that 91230 inodes are in use, even though ls and du show nothing at all.
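The mismatch between df -i and ls/du can be reproduced with a short script. This is a minimal Python sketch, assuming the partition is mounted at a path you pass in; the function names are made up for illustration. It compares the used-inode count the filesystem reports through statvfs (the figure df -i is based on) with the number of distinct inodes that can be reached by walking the directory tree.

```python
import os

def inodes_in_use(mount_point):
    """Used inodes as reported by the filesystem (total minus free)."""
    st = os.statvfs(mount_point)
    return st.f_files - st.f_ffree

def inodes_reachable(mount_point):
    """Count distinct inodes reachable by walking from mount_point."""
    seen = set()

    def note(path):
        try:
            st = os.lstat(path)
        except OSError:
            return  # vanished or unreadable; skip it in this sketch
        seen.add((st.st_dev, st.st_ino))  # hard links count once

    note(mount_point)
    for dirpath, dirnames, filenames in os.walk(mount_point):
        # Do not descend into other filesystems mounted below this one.
        dirnames[:] = [d for d in dirnames
                       if not os.path.ismount(os.path.join(dirpath, d))]
        for name in dirnames + filenames:
            note(os.path.join(dirpath, name))
    return len(seen)

if __name__ == "__main__":
    import sys
    mp = sys.argv[1]
    print("reported in use:", inodes_in_use(mp))
    print("reachable by walking:", inodes_reachable(mp))
```

A small gap between the two numbers is expected on ext4, because the lowest inode numbers are reserved for internal use. A gap of tens of thousands, as here, means inodes are marked as allocated without being linked from any directory that a walk can see.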
e2fsck -fv found no errors to fix (aside from creating lost+found again), and dumpe2fs and tune2fs -l both agree with df -i on the number of used inodes. I've tried e2fsck -b with a couple of the backup superblocks and it didn't seem to make any difference.
baobab shows the same used space as df in the summary view, but when I click on the partition to see where the space is used, it only shows the 4.1kB used by the empty lost+found directory.
The problem is not deleted files whose handles are still open: nothing is open. I've mounted and unmounted this partition multiple times, and even taken the drive out and put it in a completely different machine.
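For completeness, one way to rule out deleted-but-still-open files is to look through /proc directly. The following is a minimal, Linux-only Python sketch; the function name is hypothetical. It relies on the kernel appending " (deleted)" to the target of a /proc/<pid>/fd symlink once the file has been unlinked. It only sees processes the current user is allowed to inspect, so it would normally be run as root.

```python
import os

def deleted_open_files(mount_point, proc="/proc"):
    """Return (pid, fd, target) for open descriptors under mount_point
    whose file has been unlinked but is still held open."""
    # Trailing separator so /mnt/data does not also match /mnt/data2.
    prefix = os.path.join(os.path.realpath(mount_point), "")
    hits = []
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if target.startswith(prefix) and target.endswith(" (deleted)"):
                hits.append((int(pid), int(fd), target))
    return hits
```

An empty result for the mount point, together with the fact that the usage survives unmounting and moving the drive to another machine, is consistent with the space not being held by open handles.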
I know I could just re-format the partition and start fresh, but I would really like to understand what's going on here and whether there's any "proper" way to fix this - I don't care whether it brings the files for those inodes back or it makes them properly deleted so they don't use up all the space.
Edit:
Running dump creates a backup file roughly equal in size to the used space reported by df et al. Then running restore to a different drive created a chain of directories that's clearly wrong (/media/usb0/20150426/10/1_20150426_100125.264/20150426/10/1_20150426_100125.264/ and it continues many levels deep, the same structure repeating), before printing a bunch of lines like:
expected next file 7823361, got 7610674
expected next file 7823361, got 7610675
(second number incrementing - it goes back well beyond my terminal's buffer) before finally:
cannot find directory inode 11
abort? [yn]
Choosing n results in more "cannot find directory inode x" messages, so I aborted.
Giving up and writing this off as a freak file-system corruption which hopefully won't happen again.