I recently had a couple of accidents with disks formatted as ext4. To be honest, I believe the failures were my fault: one was caused by incorrectly [manually] unmounting a flash card, and the other by the electricity being switched off. The net effect is that I physically lost a 128 GB flash card [a loss of money] and the information on a 2 TB HDD [a loss of time]. My main concern is that such damage has NEVER happened to my disks formatted as NTFS.

My questions are:

  1. Is ext4 safe in general? I mean, is it just me, or have other people lost disks or information on disks formatted as ext4?
  2. In the Linux world, what would be a safer alternative to ext4 that can survive an unexpected power cut or unexpected unmounting?
  • Possible duplicate of [Do journaling filesystems guarantee against corruption after a power failure?](https://unix.stackexchange.com/questions/12699/do-journaling-filesystems-guarantee-against-corruption-after-a-power-failure) – rudib Oct 10 '18 at 11:32
  • But, be aware that you **might** be able to recover lost data. – rudib Oct 10 '18 at 11:33
  • @rudib I lost the flash card and the info on the HDD – Sergey Bushmanov Oct 10 '18 at 11:35
  • Concerning `NTFS`: [it might be easier to recover](https://superuser.com/questions/194412/is-ntfs-fail-safe-in-case-of-a-power-outage?rq=1). – rudib Oct 10 '18 at 11:36
  • There are tools that may restore a corrupted `ext4` (and others) partition. For example, `fsck` could help. – rudib Oct 10 '18 at 11:41
  • @rudib I tried all of them. I lost the flash card and the info on the HDD, which is why I'm asking the question – Sergey Bushmanov Oct 10 '18 at 11:44
  • You can brick a flash memory device by removing power while it's writing. This is filesystem independent. Do not yank a pen drive out of the USB port until the OS has been told to eject it. – Mark Plotnick Oct 10 '18 at 11:49
  • I see, that's bad. Have a look at the links I previously commented. A few parts of your question have already been asked, and there are extensive answers. However, I can't really tell you which filesystem is the most resilient to power loss, but maybe someone knows. – rudib Oct 10 '18 at 11:52
  • I've been in the `NTFS` world for more than 20 years, and no accident like this has ever happened. – Sergey Bushmanov Oct 10 '18 at 11:54
  • `NTFS` is different, it just journals metadata, not the data itself (which can also be very bad, depending on your situation). Have a look [at the link I posted before](https://superuser.com/questions/194412/is-ntfs-fail-safe-in-case-of-a-power-outage?rq=1), it should explain that extensively. – rudib Oct 10 '18 at 11:59
  • [In short](https://superuser.com/a/194484/479137) – rudib Oct 10 '18 at 11:59
  • @rudib The question is about the best filesystem in the Linux world – Sergey Bushmanov Oct 10 '18 at 12:02
  • There probably is no definitive answer to that, but [`ZFS`](https://unix.stackexchange.com/a/12720/112611) may be an option for you, as it handles power failures differently. – rudib Oct 10 '18 at 12:06
  • However, the data being written during the power failure is still lost, of course. – rudib Oct 10 '18 at 12:07

1 Answer

> My main concern is that such damage NEVER happened to disks partitioned to NTFS, whatsoever.

It may have never happened to you, but it has happened. The only filesystems that can claim things like this never happening are those that have never been exposed to such conditions. Even BTRFS and ZFS, which are both designed to be resilient against stuff like this, can have such issues.

To your actual questions though:

> Is ext4 safe in general? I mean is it me or are there other people who experienced loss of disks/information on disks formatted to ext4?

It depends on what you mean by 'safe'. I've personally lost data on disks formatted with ext4, but every time it's happened to me it's been due to bad hardware, and, more importantly, it would have happened eventually with pretty much any other filesystem. Despite this, I do still use it for numerous things on a regular basis because, barring user error or hardware issues (which includes unexpected power loss), it just works. So, I consider it 'safe' by most people's definitions, but you may or may not.

> In Linux world, what could be a safer alternative to ext4 that can outlive unexpected electricity switch off or unexpected unmounting?

There isn't one, unless you're willing to deal with other limitations or issues. In particular:

  • XFS is a bit more resilient against unexpected power loss and doesn't need long checks on reboot like ext4 does, but has a number of practical limitations that make it questionable for small-scale use (can't shrink filesystems, performance isn't quite as good as ext4 on a new volume, can't do data journaling).
  • NILFS2 is almost impossible to kill with a power failure, but you might lose 30 or so seconds of changes, it requires a userspace component when mounting, and it is missing a handful of features that are generally considered standard by most Linux filesystems.
  • BTRFS will protect you from failing hardware reasonably reliably, and it provides nice support for online replacement of failing disks, but again you may lose some of the most recent changes on an unexpected power loss, and you need to do a lot more to keep the volume healthy than for most other filesystems.
  • ZFS has all the benefits that BTRFS does with none of its issues (except the management ones), but it requires you to build a third-party kernel module, and you won't get any upstream support for any issues you have if you're not running on enterprise-grade hardware.

You can, however, do a number of things to make ext4 safer:

  • Change the behavior when errors are encountered. By default, if an error is encountered in filesystem metadata, ext4 will just mark the volume as needing to be checked, and then act like nothing happened. It's the only filesystem on Linux that does this; everything else will remount the volume read-only, thus preventing any writes to the filesystem from making things worse. You can get this behavior on ext4 by adding `errors=remount-ro` to the mount options, or by running `tune2fs -e remount-ro` on the block device containing the filesystem.
  • Make sure you're not using writeback mode for the journal. You can ensure this by double checking the mount options for the volume and making sure that `data=writeback` is not in the list. Journal writeback mode can significantly improve the performance of certain workloads on ext4 filesystems, but it makes it much more likely that you lose data if you unexpectedly lose power.
  • If you want to be really paranoid about data safety, you can enable journaled data mode. Normally, the journal on an ext4 filesystem only tracks changes to metadata (renames, file deletion or creation, timestamp updates, etc). In journaled data mode, all changes go through the journal. This slows things down significantly, but provides what is functionally a 100% guarantee that the file system will remain internally consistent. You can enable this by passing `data=journal` in the mount options.
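  • Make sure the `auto_da_alloc` behavior is in effect. Essentially, this detects applications that replace files without calling fsync() when they should, and handles things properly. It is enabled by default on current kernels, so the thing to check is that `noauto_da_alloc` is not in the mount options. It slows things down a bit, which is the usual reason for turning it off.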
  • On newer kernels, you can enable journal checksumming. This won't actually 'save' your data, but it will help ensure that you're not getting bogus data back when there was an error. This can be enabled by adding `journal_checksum` to the mount options.
  • If you've got a new enough kernel and version of e2fsprogs, you can enable metadata checksumming. Similar to the journal checksumming, this won't save your data, but it will help prevent you from seeing bogus data if there's an error. This is most easily enabled at filesystem creation time, by passing `-O metadata_csum,metadata_csum_seed` to mkfs.ext4. If you do this, you (probably) don't need to also enable journal checksumming, as the journal is part of what gets covered by the metadata checksumming.
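
As a rough way to check a running system against the mount-option advice above, here is a minimal Python sketch that scans text in the format of `/proc/mounts` and flags ext4 mounts that look risky. It relies on ext4 spelling the journaling mode as the `data=` mount option (`data=writeback`, `data=journal`) and on `noauto_da_alloc` being the switch that turns off the replace-without-fsync handling. The function names, the sample device names, and the warning texts are made up for illustration. Treat a warning as a hint only: defaults stored in the superblock (for example via `tune2fs -e remount-ro`) may not show up in the listed options.

```python
"""Audit ext4 mount options against the hardening advice above (a sketch)."""


def audit_ext4(options):
    """Return warnings for one ext4 mount, given its set of mount options."""
    warnings = []
    if "errors=remount-ro" not in options:
        warnings.append("errors=remount-ro is not listed")
    if "data=writeback" in options:
        warnings.append("data=writeback: journal is in writeback mode")
    if "noauto_da_alloc" in options:
        warnings.append("noauto_da_alloc: replace-without-fsync detection is off")
    return warnings


def audit_all(mounts_text):
    """Map each ext4 mount point in /proc/mounts-style text to its warnings."""
    report = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        report[fields[1]] = audit_ext4(set(fields[3].split(",")))
    return report


# Hypothetical sample; the device names and mount points are made up.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
/dev/sdb1 /data ext4 rw,relatime,data=writeback 0 0
tmpfs /run tmpfs rw,nosuid 0 0
"""

for mountpoint, warnings in audit_all(SAMPLE).items():
    status = "; ".join(warnings) if warnings else "ok"
    print(f"{mountpoint}: {status}")
```

To audit a real machine, pass the contents of `/proc/mounts` to `audit_all` instead of `SAMPLE`.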
Austin Hemmelgarn
  • I was going to ask a question about installing some Linux Mint workstations that can't have a UPS, and my primary concern is data corruption that causes issues with the OS, and less with what may be otherwise lost. Your answer here seems like it is what I am looking for, or would you recommend something more or different? I guess I could still post my question? – Paul Jan 16 '22 at 22:03
  • I probably should have mentioned that the locations experience only a typical brief power loss, maybe twice per year. – Paul Jan 16 '22 at 22:10
  • @Paul It’s up to you if you want to post a new question for that. The answer here is largely still accurate (ext4 changes pretty slowly), but with a more detailed description of what exactly you want it would probably be possible to provide an answer that is better for your particular use case. – Austin Hemmelgarn Jan 17 '22 at 02:05