Disclaimer: I am a novice to Unix/Linux, but I am looking forward to learning! I have searched this Stack Exchange site and read the man page for find, but I can't seem to figure this out.

I want to use the find ... -exec {} + command to recursively find all files with a particular file extension and run a command on the list of files. There are approximately 100k files that I need to convert. The command that I am running accepts a filename (or a list of filenames, e.g. f1 f2 f3) as a parameter, but I also need to specify additional parameters to run the command.

What I tried so far:

This works:

find . -iname "*.extension" -exec <command> {} <additional parameters> \;

This doesn't seem to work:

find . -iname "*.extension" -exec <command> {} <additional parameters> +

I get the error message, find: missing argument to '-exec'. I am guessing that I cannot specify additional parameters after the {}?

Some notes:

The command in question takes the filename as the first parameter, and then I need to designate some additional parameters, such as the output directory -o <outputDir> and the variables to extract from the files -v <var1,var2,...>.

I am running this on the terminal in Ubuntu 12.04, if that makes any difference.

Anthon
ialm
  • What command is `<command>`? You might want to fix its odd syntax first, as it breaks the POSIX Utility Syntax Guidelines. (All options should precede operands on the command line.) – jlliagre Jun 11 '13 at 23:31
  • @jlliagre `<command>` is to be replaced by the actual command I'm using, such as `ls` or `rm`. In my case, it is a tool that converts from one file format to another, and it does not actually have `<` or `>` in the call. – ialm Jun 12 '13 at 16:13

4 Answers

find . -iname "*.extension" -exec sh -c '
  exec <command> "$@" <additional parameters>' sh {} +

See How does this find command using "find ... -exec sh -c '...' sh {} +" work? for details.
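
As a rough illustration of the pattern above, here is a minimal sketch that can be run in a scratch directory. The assumptions are all mine: `printf` stands in for `<command>`, `-o out` stands in for `<additional parameters>`, and the file names are made up for the demo. The `sh` placed before `{}` becomes `$0` of the inline script, so `"$@"` expands to the file names only, and the extra parameters can be appended after it.

```shell
# Scratch directory with two sample files (hypothetical names, one with a space).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.extension" "$demo_dir/b c.extension"

# printf is a stand-in for <command>; "-o out" is a stand-in for the
# additional parameters. Each argument the command receives is printed
# on its own line inside brackets.
result=$(find "$demo_dir" -iname '*.extension' -exec sh -c '
  exec printf "[%s]\n" "$@" -o out' sh {} +)

printf '%s\n' "$result"
rm -rf "$demo_dir"
```

The file names come first (in whatever order find visits them), each as a single argument even when it contains a space, and the trailing parameters come last.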

Stéphane Chazelas
  • Thanks, this worked for me! May I ask for an explanation of what is happening here? – ialm Jun 11 '13 at 20:45
  • This worked for a small subset of files that I was testing with, but now that I am trying this on the set of 100000 files, I get "set: Too many arguments." errors. I read that using `{} +` was faster than `{} \;`, but I guess I can't use it! Thanks for your answer, though! – ialm Jun 11 '13 at 22:05
  • @ialm, that's a limitation of `csh` (its builtins have a limit (1000 on the one found on Ubuntu) on the number of arguments), the shell that you must be using in a script called by `<command>`. You could at least use `tcsh` (which should be backward compatible with `csh`), but best is to avoid `csh` altogether for scripting. – Stéphane Chazelas Jun 12 '13 at 10:38
  • There is no evidence csh is involved. – jlliagre Jun 13 '13 at 08:13
  • @jilliagre, yes there's _"set: Too many arguments."_ which is a `csh` message and no Bourne-like shell `set` builtin would have this kind of limitation. `tcsh` could output it as well, but with numbers of arguments you're unlikely to reach. You can reproduce it with `csh -c 'set a=($argv)' {1..998}` – Stéphane Chazelas Jun 13 '13 at 10:18
  • @StephaneChazelas - I asked this Q asking for someone to explain the above code: http://unix.stackexchange.com/questions/93324/how-does-this-code-work – slm Oct 02 '13 at 18:26
  • FWIW, I don't think the extra `exec` in the shell command is needed. – Joshua Skrzypek Jul 12 '22 at 17:42
  • @JoshuaSkrzypek, in some `sh` implementations, that saves a process. Some other `sh` implementations do the `exec` implicitly as an optimisation. – Stéphane Chazelas Jul 12 '22 at 17:55

With the + it's going to list multiple filenames separated by spaces in place of {} (and it will be a long list, since you have 100000 files) rather than just a single filename. That being the case, the {} is required to come at the end of the command.

See the find(1) man page under -exec command {} +.
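
A small sketch to check the placement rule on your own system. It assumes `echo` as a stand-in command and a made-up file name, and it only records whether find accepts each form; the exact error text differs between find implementations.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/a.extension"

# Extra words between {} and + : the + is not treated as the terminator,
# so find complains about a missing argument to -exec.
if find "$demo_dir" -iname '*.extension' -exec echo {} extra + >/dev/null 2>&1; then
  after_status=accepted
else
  after_status=rejected
fi

# Extra words before {} : {} is immediately followed by +, which is fine.
if find "$demo_dir" -iname '*.extension' -exec echo extra {} + >/dev/null 2>&1; then
  before_status=accepted
else
  before_status=rejected
fi

echo "parameters after {}: $after_status; parameters before {}: $before_status"
rm -rf "$demo_dir"
```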

bahamat
  • That is not a valid argument. There is no technical reason which forbids trailing arguments. The command line calculation would be nearly the same. The only reason is that this is a stupid limitation of both `find` and `xargs`. – Hauke Laging Jun 11 '13 at 20:00
  • @HaukeLaging: Take it up with the authors. I am simply stating what is. – bahamat Jun 11 '13 at 20:03
  • If there is not a language problem (and I misunderstand you) then you are not stating what is. This is: "Because of a design decision `{}` must come at the end." You say (in my understanding): "Because `{}` expands to many files `{}` must come at the end." And that is simply not true. – Hauke Laging Jun 11 '13 at 20:26
  • There are no _spaces_ coming into the picture there. `{}` is replaced with a list of arguments passed to the command, that's all. Spaces in a shell command line are used to separate arguments to commands, but here, `find` doesn't start any shell. – Stéphane Chazelas Jun 11 '13 at 20:36
  • My man page on `-exec command {} +` states "Only one instance of `{}` is allowed within the command." but says nothing about where `{}` should be placed. But when testing the command, it does require that I place `{}` at the end. Weird. – Lii Apr 12 '14 at 18:33

Assuming all directories and files have regular names, i.e. not containing spaces, newlines or similar, this should work even with a huge number of files:

find . -iname "*.extension" -exec sh -c '
command="<command>"
additionalParameters="<additional parameters>"
h=$(($#/2))
cmd="$command "
for i in $(seq 1 $h);do
        cmd="$cmd $(eval echo \$$i) "
done
cmd="$cmd $additionalParameters"
$cmd
shift $h
$command "$@" $additionalParameters' sh {} +

Rationale:

When terminated with +, find builds a command line as large as possible. Two limits are involved: the maximum number of arguments allowed (should be 128k on GNU/Linux) and the maximum size of the argument list (should be 2 MB on GNU/Linux). The issue is that the command called requires extra arguments (the additional parameters). Adding them overflows the limit, leading to the "Too many arguments" error. The script I suggest splits the parameter list built by find into two parts and runs two commands per block instead of one, so adding the extra arguments does not trigger the issue.
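
To see these limits and the batching on a given machine, something like the following sketch may help. It assumes `getconf ARG_MAX` reports the argument-size limit in bytes (the value varies between systems) and uses three made-up file names; the inline `sh -c` script prints how many file names each invocation received.

```shell
# Size limit for the argument list plus environment, in bytes.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

# Count how many file names find hands to each invocation of the command.
demo_dir=$(mktemp -d)
touch "$demo_dir/f1" "$demo_dir/f2" "$demo_dir/f3"
batches=$(find "$demo_dir" -type f -exec sh -c 'echo "$#"' sh {} +)
echo "arguments per invocation: $batches"
rm -rf "$demo_dir"
```

With only three files everything fits into one invocation, so a single count is printed. On a tree with 100k files, several lines would be expected, one per batch.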

jlliagre
  • Thank you for the answer! I opted to be patient and use the slower `\;` option, and the job should be done in a couple of days. If I ever need to run the job again, I will try this! – ialm Jun 12 '13 at 16:16
  • Answer updated to explain why I guess it failed with Stephane's script. – jlliagre Jun 13 '13 at 08:12

You can use this script:

#! /bin/bash

cmd=echo

test $# -gt 2 || exit 2
num_trailing_args="$1"
[[ $num_trailing_args =~ ^(0|[1-9][0-9]*)$ ]] ||
  { echo "Illegal first argument ('${num_trailing_args}'); aborting"; exit 2; }
test $# -lt $((num_trailing_args+2)) &&
  { echo "Too few arguments; aborting"; exit 2; }
shift
trailing_args=()
for((i=0;i<num_trailing_args;i++)); do
        trailing_args[i]="$1"
        shift
done

"$cmd" "$@" "${trailing_args[@]}"

and then use

find ... -exec args_change_script.sh 3 t1 t2 t3 {} +

The name of the command should not be longer than the name of the script (just to be sure).
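
For comparison, the same reordering idea can be sketched in plain POSIX sh without arrays. This is only an illustration, not a replacement for the script above: `rotate_args` is a hypothetical helper name, and it prints the reordered arguments instead of running a command.

```shell
# rotate_args N arg1 ... argN file1 file2 ...
# Prints one argument per line in the order: file1 file2 ... arg1 ... argN.
rotate_args() {
  n=$1
  shift
  while [ "$n" -gt 0 ]; do
    first=$1
    shift
    set -- "$@" "$first"   # append the former first argument to the end
    n=$((n - 1))
  done
  printf '%s\n' "$@"
}

# Same calling convention as the find example: count, trailing args, then files.
rotated=$(rotate_args 3 t1 t2 t3 "file one" file2)
printf '%s\n' "$rotated"
```

Replacing the final `printf` with the real command would give a wrapper that can be called the same way from `find ... -exec`.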

Hauke Laging