
I have come across the following strange behavior of GNU tar when using the --transform option to transform path elements: when I tar an entire sub-directory and want to transform the path to this directory, the transformation is applied only to the directory's content, not to the directory itself, whenever the transformation pattern explicitly contains the trailing /.

To reproduce:

  • create a directory test-dir with dummy content:
    $ mkdir test-dir
    $ touch test-dir/test{1..50}.txt
    
  • tar this directory with renaming of test-dir/ to transformed-dir/, and instruct tar to print the transformed names for checking:
    $ tar --transform="s,^test-dir/,transformed-dir/," --show-transformed-names -cvf test.tar test-dir
    test-dir/
    transformed-dir/test25.txt
    transformed-dir/test29.txt
    transformed-dir/test47.txt
    ...
    

As you can see, the directory itself is not renamed correctly, although the renaming works for all files within the directory.

  • For comparison, use the same transformation but without the trailing /:
    $ tar --transform="s,^test-dir,transformed-dir," --show-transformed-names -cvf test2.tar test-dir
    transformed-dir/
    transformed-dir/test25.txt
    transformed-dir/test29.txt
    transformed-dir/test47.txt
    ...
    

Now, the directory itself is correctly renamed.

The behavior doesn't change when the ^ anchor is omitted, and it is independent of whether the directory to be tarred is specified with or without a trailing / on the command line.

  • I wondered whether the problem was that, with the / in the pattern, the directory's entire name is subject to replacement. However, a transformation that replaces an entire file name works correctly:
    $ tar --transform="s,^test-dir/test29.txt,transformed-dir/file.txt," --show-transformed-names -cvf test3.tar test-dir
    test-dir/
    test-dir/test25.txt
    transformed-dir/file.txt
    test-dir/test47.txt
    ...
    

So it really seems that the trailing / is the problem. Is this a feature, a bug, or did I somehow misunderstand the scope/syntax of the option? The tar version is GNU tar 1.28.

Faheem Mitha
AdminBee

1 Answer

As indicated in comments by @muru and @UncleBilly, the problem here was likely a mixture of a misconception and unfortunate output by GNU tar.

  • When run in "verbose" mode (-v), (GNU) tar will append a trailing / to all entries that refer to directories rather than files.
  • However, the / is of course not part of the actual directory name as stored in the filesystem data.
  • The transformation seems to apply to the "actual" name of the directory entry, which does not match the pattern some_name/ as the / is not part of the name.

So the seemingly inconsistent behavior of the --transform option results from the difference in the printed directory name vs. the actual (and internally used) directory name.
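The mismatch can be illustrated without tar at all. The following is only a sketch: it uses sed as a stand-in for the replace expression (tar performs the matching itself, so this is an analogy, not tar's implementation) and feeds it the two names as tar appears to see them, i.e. the directory without a trailing /.

```shell
# Names as tar appears to see them before transformation: the directory
# entry has no trailing slash, the file entry contains one.
names='test-dir
test-dir/test29.txt'

# Pattern with the trailing slash: only the file path matches.
with_slash=$(printf '%s\n' "$names" | sed 's,^test-dir/,transformed-dir/,')

# Pattern without the trailing slash: both entries match.
without_slash=$(printf '%s\n' "$names" | sed 's,^test-dir,transformed-dir,')

printf 'with slash:\n%s\n\nwithout slash:\n%s\n' "$with_slash" "$without_slash"
```

With the slash in the pattern, the bare name test-dir passes through unchanged, which mirrors the test-dir/ line in the verbose tar output from the question.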

As noted by @UncleBilly, if you want to rename only the directory whose name fully matches the specified one (and not other entries that merely share its prefix), a transform statement like

--transform='s,^path/to/dir\($\|/\),newname\1,'

will ensure that only the directory path/to/dir (where the end-of-string anchor $ applies) and its contents path/to/dir/fileXXX.yyy (where dir is immediately followed by a /) are renamed. The back-reference \1 in the replacement text will ensure that the / is not omitted when transforming the path names of files inside the directory.
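As a quick check of this expression, here is a minimal sketch assuming GNU tar and a throwaway scratch directory. The names test-dir-other, a.txt and b.txt are hypothetical and exist only to show that an entry sharing the test-dir prefix is left alone.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# test-dir should be renamed; test-dir-other shares the prefix but should not.
mkdir test-dir test-dir-other
touch test-dir/a.txt test-dir-other/b.txt

# \($\|/\) matches either the end of the name or a following slash;
# \1 puts back whatever was matched (nothing, or the slash).
tar --transform='s,^test-dir\($\|/\),transformed-dir\1,' \
    -cf test.tar test-dir test-dir-other

# List the member names as stored in the archive.
listing=$(tar -tf test.tar)
printf '%s\n' "$listing"
```

By contrast, the simpler pattern s,^test-dir,transformed-dir, from the question would also match the prefix of test-dir-other and turn it into transformed-dir-other, which is what the end-of-string-or-slash group is meant to prevent.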

AdminBee