
I am using the following command to count the files whose names contain sv or end in .json in a given directory on a remote server:

nbs_files=`ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f -name '*sv*' -o -name '*.json' -exec basename {} \; | wc -l"` 

This command returns only the number of .json files, even though files with sv in their names exist in ${path}.

When I remove the -o -name '*.json' part, the command works as expected and returns the number of files with sv in their names.

Does anyone know how I can modify the command so that it counts both the files with sv in their names and the files with the .json extension?

AdminBee
rainman

1 Answer


The problem is the precedence of the AND and OR operators in the find expression. Specifically, the implicit AND between adjacent tests has higher precedence than the OR (-o) between the two name tests. So the test expression gets parsed as:

    -maxdepth 1 -mindepth 1 -type f -name '*sv*'
OR
    -name '*.json' -exec basename {} \;

...and since the -name '*.json' is the only one that's part of the same branch as -exec, the -exec only runs for json files.

The solution is to override the normal precedence with explicit parentheses around the -name tests:

nbs_files=$(ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -exec basename {} \; | wc -l")
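The difference in parsing can be checked locally, without ssh. The following is a minimal sketch, assuming a throwaway directory created with mktemp and made-up file names (neither comes from the question), and a find that supports -maxdepth and -mindepth (GNU and BSD find do):

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)
touch "$demo_dir/a_sv_1.txt" "$demo_dir/b_sv_2.txt" "$demo_dir/c.json" "$demo_dir/other.txt"

# Without parentheses: -exec is attached only to the -name '*.json' branch,
# so files matching '*sv*' satisfy the expression but print nothing.
without=$(find "$demo_dir" -maxdepth 1 -mindepth 1 -type f -name '*sv*' -o -name '*.json' -exec basename {} \; | wc -l)

# With parentheses: -exec runs for files matching either name test.
with=$(find "$demo_dir" -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -exec basename {} \; | wc -l)

echo "without parentheses: $without"
echo "with parentheses: $with"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

With these four files, the first count should be 1 (only c.json reaches -exec) and the second should be 3.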

BTW, I also took the liberty of replacing the backticks with $( ) -- they're the more modern option, are easier to read, and don't have the same weird escaping anomalies that backticks have. See this question and BashFAQ #82.
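As a small illustration of the escaping difference, here is a sketch using made-up strings and an arbitrary example path; nothing in it comes from the original command. Inside backticks, a backslash followed by another backslash, a $, or a backtick is processed before the inner command runs, even within single quotes, whereas $( ) passes the text through unchanged:

```shell
# Backticks: the shell collapses \\ to \ before running the inner command.
old=`printf '%s\n' 'a\\b'`

# $( ): the inner command receives the text exactly as written.
new=$(printf '%s\n' 'a\\b')

printf 'backticks:    %s\n' "$old"
printf 'dollar-paren: %s\n' "$new"

# Nesting: inner backticks must be escaped, while $( ) nests as-is.
nested_old=`basename \`dirname /usr/local/bin\``
nested_new=$(basename "$(dirname /usr/local/bin)")
printf 'nested: %s %s\n' "$nested_old" "$nested_new"
```

The two printf lines should show different strings for the same source text, which is the kind of surprise that $( ) avoids.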

Gordon Davisson
  • Of course one has to wonder what directory names the OP had to warrant running basename before counting the results. If there are no embedded newlines then removing the `-exec basename {} \;` will give the same results but without the cost of running a process per matching file. If the OP is using GNU find (likely, given the `linux` tag), then `nbs_files=$(ssh -q -i ${sshkey} ${user}@${server} "find ${path}/ -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -printf '%f\n' | wc -l")` will avoid the process per matching file even with embedded newlines in the directory names. – icarus Jul 05 '20 at 07:02
  • @icarus, though filenames with newlines will affect the count. You don't really need to print the actual filenames if all you're going to do is pipe to `wc`. Just something like `find ... -printf '\n' | wc -l` would do – ilkkachu Jul 05 '20 at 15:14
  • @ilkkachu The point is that to reproduce the behavior you do need to print the filenames, you can't just print a newline for each file. Note I am not saying that the behavior is the desired one, I strongly suspect it is not, but your solution answers a different problem. – icarus Jul 05 '20 at 18:33
  • @icarus, well, they did say "retrieve the number of files", so I thought they wouldn't want names with newlines to count as two. Or was there something else I missed? – ilkkachu Jul 05 '20 at 18:37
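The two counting behaviours discussed in these comments can be compared side by side. This is a rough sketch, assuming GNU find (for -printf) and a hypothetical scratch directory holding one ordinary file and one file whose name contains a newline:

```shell
# Hypothetical scratch directory; requires GNU find for -printf.
demo_dir=$(mktemp -d)
nl='
'
touch "$demo_dir/plain.json" "$demo_dir/sv${nl}split"

# Printing each name: the embedded newline adds an extra line to the count.
by_name=$(find "$demo_dir" -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -printf '%f\n' | wc -l)

# Printing a bare newline per file: exactly one line per matching file.
by_file=$(find "$demo_dir" -maxdepth 1 -mindepth 1 -type f '(' -name '*sv*' -o -name '*.json' ')' -printf '\n' | wc -l)

echo "lines when printing names: $by_name"
echo "lines when printing one newline per file: $by_file"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

With two matching files, the first count should be 3 and the second 2, so which form is appropriate depends on whether the original line-counting behaviour or the true file count is wanted.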