I have a bunch of tarball backups which I just restored onto my new Windows 8.1 + Cygwin system using GNU tar:

zsh$ for file in **/*.tgz; do tar xvzf $file; done

To my surprise, a lot of the extracted files were corrupt. I replaced GNU tar with BSD tar and repeated the process, but the same files were still corrupt.

Then I tried extracting them with WinRAR, and they turned up just fine. Does anybody know what's going on?

Mark Boulder
  • Is there any pattern to the corruption (line endings, files too small, etc.)? Do you know what program created the archives (and on what platform)? – Mikel May 27 '14 at 05:33
  • Can you do a `diff -u <(od -An -vtx1 < f1) <(od -An -vtx1 < f2)` where f1 and f2 are the same file but extracted with tar and winrar? – Stéphane Chazelas May 27 '14 at 06:28
  • @StephaneChazelas actually the only corrupt files I can find are `.otf` and `.mp3` so I'm not sure what good a diff would do. What I said to @Mikel earlier about the text files was a false alarm. – Mark Boulder May 27 '14 at 06:54
  • The tar format has gone through a few iterations and supports vendor specific tags/extensions. With what commands were these `.tgz` created in the first place? – Anthon May 27 '14 at 07:05
  • Just `tar czf $file.tgz $folder` – Mark Boulder May 27 '14 at 07:10
  • That's a diff on the output of `od` to see what bytes differ, but I forgot the `-w1` option to add to `od`. – Stéphane Chazelas May 27 '14 at 07:32
  • @StephaneChazelas Hi! Unfortunately that command does not produce any output: `diff -u <(od -An -vtx1 < garamond_premier_pro.otf) <(od -An -vtx1 < garamond_premier_pro_corrupt.otf)` – Mark Boulder May 27 '14 at 14:44
  • That would seem to indicate the files are identical. Does `cmp -l file1 file2` give anything? Or possibly cygwin reads the files in some way that eol characters are converted on the fly so it can't detect the difference. – Stéphane Chazelas May 27 '14 at 14:53
  • `cmp` returns nothing as well. In Windows the first file opens fine, the second one returns: `The requested file is not a valid font file.` – Mark Boulder May 27 '14 at 14:56
  • Is there a CYGWIN env var? – Stéphane Chazelas May 27 '14 at 14:58
  • There's a bunch of them, why? – Mark Boulder May 27 '14 at 14:59
  • @StephaneChazelas do you know a way I can diff whole folders (i.e. `extracted_tar` and `extracted_winrar`) with all their content? – Mark Boulder May 28 '14 at 02:08
  • @MarkBoulder: It's not `diff`, but the first option in [this answer](http://unix.stackexchange.com/a/35834/138) should be helpful. How exactly are you determining that these extractions are "corrupt"? – Warren Young Jul 22 '14 at 14:16
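
For the whole-folder comparison asked about in the comments, here is a minimal sketch in POSIX shell. It is not necessarily the approach in the linked answer; the function name `compare_trees` is hypothetical, and it assumes both extractions sit side by side as two directory trees. It walks every regular file under the first tree and runs `cmp -s` against the same relative path in the second, so it only reports content differences, not differences in permissions or other metadata.

```shell
#!/bin/sh
# compare_trees is a hypothetical helper, not part of any tool mentioned above.
# It lists files that are missing from, or differ byte-for-byte in, the second tree.
compare_trees() {
    dir_a=$1
    dir_b=$2
    # Enumerate regular files relative to the first tree.
    (cd "$dir_a" && find . -type f) | while IFS= read -r rel; do
        if [ ! -f "$dir_b/$rel" ]; then
            printf 'missing in %s: %s\n' "$dir_b" "$rel"
        elif ! cmp -s "$dir_a/$rel" "$dir_b/$rel"; then
            # cmp -s is silent and only sets the exit status.
            printf 'differs: %s\n' "$rel"
        fi
    done
}
```

With the directory names from the comment above, it would be called as `compare_trees extracted_tar extracted_winrar`. Files that exist only in the second tree are not reported, so running it a second time with the arguments swapped covers both directions. Where GNU diffutils is installed, `diff -rq extracted_tar extracted_winrar` should give a similar report in one command.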

0 Answers