6

All the target files have been deleted. Of course, when I try to run any deletes again, the files aren't there to delete. Sorry to take your time.


I'm using bash on cygwin.

I have the output of fdupes in a file. I'm grepping that output to exclude a directory I want to keep intact, and I want to delete the rest of the files listed.

I have some entries with spaces:

./NewVolume/keep/2009/conference/conference/conference 004.jpg

Which trips up xargs:

$ cat real-dupes.txt |xargs rm {}
...
rm: cannot remove ‘2009/conference/conference/conference’: No such file or directory

When I try the -0 switch, it looks like all the lines get joined into a single argument:

$ cat real-dupes.txt |xargs -0 rm
xargs: argument line too long

Other questions have answers where the asker is advised to use find to feed the arguments into xargs. That's not helpful in my scenario, because I don't believe I can easily use find to identify the duplicates I want to get rid of. Also, the fdupes job ran for more than 12 hours, so I really want to use this data set.

As far as I know, fdupes cannot exclude a directory from its automated delete, so I can't use it out of the box, either.

Rui F Ribeiro
user394
  • If it's *just* spaces you should be able to handle it with: `sed 's/ /\\ /' real-dupes.txt | xargs rm {}`. This won't handle the many other special characters that can be present in filenames, though; for some examples: `*$"'` See [this question](http://unix.stackexchange.com/q/131766/135943) for more details. – Wildcard Dec 29 '15 at 23:04
  • @Wildcard I got the same errors with your sed solution: `rm: cannot remove ‘2009/photos’: No such file or directory` -- there may be other characters that are problematic, but the majority of the errors thrown are from this apparent space. Pretty sure I used the normal space bar to name these files, although some originated on NTFS filesystems. – user394 Dec 29 '15 at 23:07
  • I mis-diagnosed the problem I was experiencing and all the answers are superfluous. – user394 Dec 29 '15 at 23:13
  • So, what, the `sed` solution deleted the files that remained and had spaces in them? – Wildcard Dec 29 '15 at 23:20
  • That could be, but I tried a number of things with the same error set (Can't delete files), so I'm not sure which one was the first to get it right. – user394 Dec 29 '15 at 23:43

2 Answers

5

xargs -0 is not working because it expects NUL-terminated items. It finds no NUL character in your input, so it reads the whole file as one single argument, which is too long.

Just convert every newline (I assume there is one filename per line) to \0, like this:

cat real-dupes.txt | tr '\n' '\0' | xargs -0 rm
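If you want to try this safely first, here is a minimal sketch in a throwaway directory. The directory layout and file names are made up for illustration, and it assumes an xargs that supports the non-standard -0 option (GNU and BSD versions do).

```shell
# Hypothetical sandbox: the temp directory and file names are invented for this demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p "dupes/2009/conference"
touch "dupes/2009/conference/conference 004.jpg" "plain.jpg"

# One file name per line, as in the filtered fdupes output.
printf '%s\n' "./dupes/2009/conference/conference 004.jpg" "./plain.jpg" > real-dupes.txt

# Turn each newline into a NUL so that xargs -0 sees one item per line.
# The "--" tells rm that everything after it is a file name, not an option.
tr '\n' '\0' < real-dupes.txt | xargs -0 rm --

# Count the .jpg files that survived.
remaining=$(find . -name '*.jpg' | wc -l | tr -d ' ')
echo "jpg files left: $remaining"
```

On the real list, swapping `rm --` for `ls -ld --` first should show exactly which files would be removed.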
Kira
4

By default xargs splits at whitespace. You may be able to use the non-standard -0 option to split at character \000, but your input has to be prepared to match the expectation. (find ... -print0 is one of the ways of doing this - assuming your version of find has the -print0 option.)
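A quick way to see the difference, sketched with a made-up file name and assuming your xargs has the -0 extension: count how many arguments xargs hands to the command in each mode.

```shell
# One invented file name containing a space.
name='conference 004.jpg'

# Default mode: the space splits the name into two arguments.
default_count=$(printf '%s\n' "$name" | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-terminated input with -0 (non-standard): the name stays one argument.
nul_count=$(printf '%s\0' "$name" | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "default: $default_count argument(s), with -0: $nul_count argument(s)"
```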

Provided that none of your files contains a newline in its name (i.e. you have one file per line), you can use xargs like this:

xargs -I{} rm {} <real-dupes.txt

The man page for xargs has this to say about the -I flag:

-I replace-str: Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
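Note the word "unquoted" in that excerpt: even with -I, xargs still treats quotes and backslashes in the input specially, so a name containing an apostrophe would still be a problem. The sketch below, run in a throwaway directory with invented file names, shows -I handling a space and a plain shell loop as one possible fallback for names with quotes.

```shell
# Hypothetical sandbox: temp directory and file names are invented for this demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "conference 004.jpg" "it's a dupe.jpg"

# With -I{}, each input line becomes one argument, so the space is preserved.
printf '%s\n' "./conference 004.jpg" | xargs -I{} rm -- {}

# Fallback without xargs: read each line verbatim and quote it for rm.
# IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
printf '%s\n' "./it's a dupe.jpg" > quoted.txt
while IFS= read -r file; do
    rm -- "$file"
done < quoted.txt

left=$(find . -name '*.jpg' | wc -l | tr -d ' ')
echo "jpg files left: $left"
```

The loop starts one rm per file, which is slower on a long list, but it only relies on standard shell features.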

roaima
  • The `-0` option to `xargs` is a [non-standard extension](http://pubs.opengroup.org/onlinepubs/009604599/utilities/xargs.html), as is the `-print0` extension to [`find`](http://pubs.opengroup.org/onlinepubs/009695399/utilities/find.html). – Andrew Henle Dec 29 '15 at 23:25
  • @AndrewHenle the OP had first suggested `-0` so it seemed reasonable to reference it in my answer. – roaima Dec 29 '15 at 23:59
  • It is perfectly reasonable, and quite appropriate, but I think for completeness it's also important to note non-standard extensions so someone who runs across your answer in the future doesn't wonder what's wrong with their version of `xargs`. – Andrew Henle Dec 30 '15 at 00:52