What is the difference between `fallocate --dig-holes` and `fallocate --punch-hole` in Linux?

Question

I understand that --dig-holes makes a file sparse in place. That is, if the file contains runs of zeros, the --dig-holes option replaces them with holes:

To take a very simplified example, say we have a huge file named non-sparse:

non-sparse:

aaaaaaaaaaaaaaaaaaaaaaaaaaaa
\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00
bbbbbbbbbbbbbbbbbbbbbbbbbbbb
\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00
cccccccccccccccccccccccccccc

non-sparse has many zeros in it; assume that the interleaved runs of zeros are gigabytes long. fallocate --dig-holes deallocates the space occupied by the zeros (turning them into holes), while the apparent file size stays the same.

Now, there's --punch-hole. What does it really do? I read the man page and still don't understand:

-p, --punch-hole
              Deallocates space (i.e., creates a hole) in the byte  range
              starting at offset and continuing for length bytes.  Within
              the specified range, partial filesystem blocks are  zeroed,
              and  whole  filesystem  blocks  are  removed from the file.
              After a successful call, subsequent reads from  this  range
              will  return  zeroes.

Creating a hole seems like the opposite of what --dig-holes does, so how come digging a hole isn't the same as creating a hole?! Help! We need a logician :).

The names of the two options are linguistically synonymous, which is perhaps the source of the confusion.

What's the difference between --dig-holes and --punch-hole operationally (not logically or linguistically please!)?

Stephen Kitt · Accepted Answer · 2017-09-04T17:48:51.103

--dig-holes doesn’t change the file’s contents, as determined when the file is read: it just identifies runs of zeroes which can be replaced with holes.

--punch-hole uses the --offset and --length arguments to punch a hole in a file, regardless of what the file contains at that offset: it works even if the file contains non-zeroes there, but the file’s contents change as a result. Considering your example file, running fallocate --punch-hole --offset 2 --length 10 non-sparse would replace ten a characters with zeroes, starting after the second one.
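A minimal sketch of that behaviour on a small scratch file, assuming the fallocate tool from util-linux and a filesystem that supports hole punching (ext4, XFS, Btrfs and tmpfs should). The temporary file and the variable names are illustrative only and not part of the question's setup.

```shell
#!/bin/sh
set -e

# Scratch file holding 28 'a' characters (hypothetical stand-in for the
# first line of the example file).
f=$(mktemp)
head -c 28 /dev/zero | tr '\000' 'a' > "$f"
size_before=$(wc -c < "$f")

# Punch a 10-byte hole starting at byte offset 2. The range is smaller
# than a filesystem block, so the bytes are zeroed rather than deallocated.
fallocate --punch-hole --offset 2 --length 10 "$f"

size_after=$(wc -c < "$f")
# Count the bytes that remain once the NUL bytes are stripped out.
remaining=$(tr -d '\000' < "$f" | wc -c)

od -c "$f"    # dump the bytes so the run of \0 is visible
echo "size: $size_before -> $size_after, non-zero bytes: $remaining"
rm -f "$f"
```

The file length stays at 28 bytes, but only 18 of them are still a characters: the data in the punched range is gone, even though it was not zero beforehand.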

Kusalananda · Answer 2 · 2017-09-28T23:18:11.840

In short:

--dig-holes makes a file sparse without modifying its contents (as seen by a program reading it).
--punch-hole creates a hole in a file, possibly modifying existing data.

The difference is that --dig-holes analyzes the file for areas that can be made sparse (using --offset and --length, if supplied, to indicate the range in the file to analyze), whereas --punch-hole uses --offset and --length to actually zero out a part of a file to create a hole.

Note also the plural "dig holes" vs. the singular "punch hole".

From the manual, regarding --dig-holes:

You can think of this option as doing a cp --sparse and then renaming the destination file to the original, without the need for extra disk space.
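The content-preserving nature of --dig-holes can be checked with a checksum taken before and after. This is only a sketch under assumptions: it needs fallocate from util-linux and a filesystem that supports holes, the scratch file layout (4 KiB of a, 4 MiB of zeros, 4 KiB of b) is invented for illustration, and the du figures it prints will vary by filesystem.

```shell
#!/bin/sh
set -e

f=$(mktemp)
# 4 KiB of 'a', 4 MiB of explicitly written zero bytes, 4 KiB of 'b'.
{
    head -c 4096 /dev/zero | tr '\000' 'a'
    head -c 4194304 /dev/zero
    head -c 4096 /dev/zero | tr '\000' 'b'
} > "$f"

sum_before=$(cksum < "$f")
size_before=$(wc -c < "$f")
echo "allocated before: $(du -k "$f" | cut -f1) KiB"

# Scan the file for runs of zeros and deallocate them in place.
fallocate --dig-holes "$f"

sum_after=$(cksum < "$f")
size_after=$(wc -c < "$f")
echo "allocated after:  $(du -k "$f" | cut -f1) KiB"

# Same checksum and same apparent size: readers see identical content.
[ "$sum_before" = "$sum_after" ] && echo "content unchanged"
rm -f "$f"
```

On a filesystem with hole support, the allocated size reported by du should drop while the checksum and the apparent size stay the same, which is the effect the manual compares to cp --sparse.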
