I have a .tar file that takes up about 70% of my disk space, and I need to unpack it to the same disk. Does the tar command have an option to remove files from the archive as soon as they have been unpacked, so that they no longer take up space? The --delete option seems to be able to remove specific files from the .tar archive, but is there a way to tell tar to always remove the file that has just been unpacked? That way, as the total size of the unpacked files grows, the size of the .tar would shrink, all the way down to zero at the end.

Moreover, if the process were interrupted, it could be resumed from where it left off, since only the files that have not yet been unpacked would still be present in the .tar.

Kusalananda
Botond

1 Answer

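#!/bin/bash
# Unpack a tar archive in place, deleting each file from the archive once it is extracted.
if [ ! -e "$1" ]; then
    echo "Run with a tar archive" >&2
    exit 1
fi
# List the members in reverse so that each --delete only has to truncate the archive.
tar --list -f "$1" | tac | while IFS= read -r fname; do
    test "${fname: -1}" = '/' && continue # skip directories
    tar --extract -f "$1" "$fname" || exit 1 # let's stop in case we can't extract a file
    tar --delete  -f "$1" "$fname" || exit 2 # just in case
done || exit # the loop runs in a subshell, so pass its failure status on to the script
tar xf "$1" # restore directories timestamps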

tac is used to reverse the order of files, so that tar doesn't need to rewrite the entire archive after deleting a file - tar only needs to truncate the archive. I've tested the script on a couple of files - it works fine, though it could be extremely slow for archives with a lot of small files, since tar is run twice for every file.

"${fname: -1}" extracts the last character of a filename - if it's a slash, it's a directory, so we skip it. This syntax is specific to bash; check the comments for a portable version.

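For reference, the portable check suggested in the comments could look roughly like this. It is only a sketch and not part of the script I tested; the function name is_dir_entry and the sample names are made up for the illustration.

```bash
# is_dir_entry is a hypothetical helper used only for this illustration.
# It succeeds when the name ends in a slash, which is how directories
# appear in the output of tar --list.
is_dir_entry() {
    case $1 in
        */) return 0 ;;
        *)  return 1 ;;
    esac
}

# Sample member names (invented), as they might appear in a listing.
for name in 'dir/' 'dir/file.txt' 'a/b/'; do
    if is_dir_entry "$name"; then
        printf 'skip     %s\n' "$name"
    else
        printf 'extract  %s\n' "$name"
    fi
done
```

The case pattern works in any POSIX sh, whereas "${fname: -1}" needs bash or a similar shell.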
Artem S. Tashkinov
  • More obvious syntax to check whether a string ends in `/` in standard `sh` would be `case $string in (*/)...`. `echo -n` is not POSIX – Stéphane Chazelas May 27 '21 at 06:01
  • It's better practice to write error messages on stderr and exit with a failure status upon failure. Also note that the `|| exit` in `bash` will only exit the subshell that runs the loop, not the script. – Stéphane Chazelas May 27 '21 at 06:06
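
To illustrate the subshell point from the last comment: by default bash runs each part of a pipeline in its own subshell, so an `exit` inside a piped `while` loop ends only the loop. A minimal sketch, assuming default bash settings (no `lastpipe`); the function name `run_loop` and the sample input are invented for the demonstration.

```bash
# run_loop is a hypothetical helper that mimics the piped loop in the answer.
run_loop() {
    printf '%s\n' one two three | while IFS= read -r item; do
        [ "$item" = two ] && exit 3   # leaves the loop's subshell only
        echo "processed $item"
    done
    # Execution continues here; $? holds the exit status of the loop.
    echo "loop status: $?"
}

run_loop
```

This prints `processed one` followed by `loop status: 3`, so the code after the loop still runs. A script that should stop at that point has to check the status after `done`, for example with `done || exit`.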