
On a running Linux box, cached file system content should eventually be written to disk.

  • "sync": the sync command, and the related system call, forces cached data to be written out to the storage device.
  • "fsfreeze": the fsfreeze command, on the other hand, is described as follows:

    fsfreeze halts new access to the filesystem and creates a stable image on disk.

So it seems that with either command a "stable on-disk" representation of the data at the time of issuing the command is achieved.

Clearly, fsfreeze is described as additionally suspending further "new access".

This question asks whether, for the purpose of obtaining a coherent disk image (i.e. for backing up by copying the on-disk file system data), there is any difference between using sync and fsfreeze.

I assume that answering this question might require considering the file system used, because different file systems have different ways of ensuring (or not) that intermediate states are atomically committed to disk.

Personal testing has shown that fsfreeze on a btrfs file system always ended up with an unresponsive console, requiring a hard reset. sync, on the other hand, did not (no irony intended) freeze the system.

humanityANDpeace

1 Answer


So it seems that with either command a "stable on-disk" representation of the data at the time of issuing the command is achieved.

Yes, but in sync’s case, that “stable on-disk” state is potentially very short-lived — any change made after the sync is issued can make the file system inconsistent again.

fsfreeze uses the FIFREEZE ioctl with no timeout. As you’ve discovered, this can result in a frozen system, since no writes can proceed on the affected file systems, and writes are issued in a huge variety of circumstances (e.g. writing your shell’s history). There’s an emergency thaw key combination, Alt+SysRq+j, which you can use (unless it’s been disabled).

The point of FIFREEZE is that, while it’s in effect, you can read the storage underlying the frozen file system and build a consistent image of the storage — i.e. one in which the data and metadata are fully in sync.
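The freeze, copy, thaw sequence can be sketched in Python as a context manager. This is only an illustration under stated assumptions: the request numbers are the values that `FIFREEZE` and `FITHAW` expand to on x86 and ARM (other architectures may encode them differently), the `frozen` helper and its `ioctl` parameter are names made up for this sketch, and a real freeze needs root.

```python
import contextlib
import fcntl
import os

# From <linux/fs.h>: _IOWR('X', 119, int) and _IOWR('X', 120, int).
# Assumed encodings for x86/ARM; check your platform's headers.
FIFREEZE = 0xC0045877
FITHAW = 0xC0045878


@contextlib.contextmanager
def frozen(mountpoint, ioctl=fcntl.ioctl):
    """Keep the file system at `mountpoint` frozen while the block runs.

    `ioctl` is a hypothetical injection point so the sketch can be
    exercised without root; by default the real fcntl.ioctl is used.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        ioctl(fd, FIFREEZE, 0)      # flush dirty data, then block new writes
        try:
            yield                   # copy the underlying storage here
        finally:
            ioctl(fd, FITHAW, 0)    # always thaw, even if the copy failed
    finally:
        os.close(fd)
```

The copy would go inside something like `with frozen("/mnt/data"):` (a hypothetical mount point), and it has to write its output to a different file system. Freezing a file system that the copying process, or your shell, needs to write to is likely to reproduce the hang described in the question.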

With sync only, any changes in flight at the time of the sync will be on disk when it completes, but copying the underlying storage then won’t necessarily result in a consistent image, since subsequent writes may have started hitting the storage. Copying such an image should allow you to retrieve all the data you care about, and in many cases any partially-written changes won’t prevent file system recovery; but you can’t guarantee that the image will be usable as-is, without repair, even on logged, journaling or copy-on-write file systems.
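For comparison, here is a minimal Python sketch of what the sync side amounts to. The helper names `write_durably` and `flush_everything` are invented for this example; `os.fsync` and `os.sync` are the standard library wrappers around the corresponding system calls.

```python
import os


def write_durably(path, data):
    """Write `data` to `path` and flush that one file to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # Python buffer -> kernel page cache
        os.fsync(f.fileno())    # page cache -> storage device, this file only


def flush_everything():
    """Roughly what the `sync` command does: flush all dirty data."""
    os.sync()
```

Both calls are one-off flushes. Neither stops other processes from dirtying the file system again straight afterwards, which is why a copy of the underlying storage started after a sync can still capture a write half-way through.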

Stephen Kitt
  • Firstly, many thanks for your answer. Secondly, if I may inquire further: is my understanding very incorrect that modern file systems (btrfs, xfs, ext4) employ journaling and COW explicitly to prevent an inconsistent (or worse, "unrepairable") on-disk representation? I thought that in btrfs, for instance, any change to an item of the file system (data/metadata) would at some point be captured in a new, checksummed root state added via COW, to which the file system switches atomically. So do changes in btrfs take it from one consistent state to another, or not happen at all? – humanityANDpeace Feb 12 '21 at 16:41
  • It depends on what you mean exactly by consistent. In my mind, consistent storage is one where all the data that metadata expects is there, and nothing more. In COW file systems (or rather, ROW, but that’s being pedantic at this point), the future state is stored, and once ready, switched to; atomicity ensures that readers only ever see one state, never a transitory state, even across power failures, but that doesn’t mean that the storage itself is fully consistent (by my definition): with a power failure, part of the future state exists on disk, even if it will never be used. – Stephen Kitt Feb 12 '21 at 16:44
  • `FIFREEZE` ensures a fully consistent state, where all the metadata agrees with the data on disk, and there’s no extra data corresponding to a future state which was never reached. That is more than the guarantees offered by even COW file systems. – Stephen Kitt Feb 12 '21 at 16:45
  • (Note that none of this deals with *applications’* view of data, so you can have a fully consistent storage image holding an inconsistent database, for example. Application checkpointing is a whole other kettle of fish; see VSS on Windows.) – Stephen Kitt Feb 12 '21 at 16:48
  • Once again, thank you; this helped me understand it better. I see now that there are different levels of what counts as "consistent". My requirement of "being consistent on disk" would forgive any extra data that has not yet been committed atomically, as long as the on-disk data yields the previous data and metadata, without the next state. Honestly, I did not even think of the extra data as something that made the disk image "inconsistent". – humanityANDpeace Feb 12 '21 at 17:07