
GNU find has a -print0 option to terminate filenames with null characters. However, this option is not available in POSIX find.

In the GNU man page for find, under the -print flag, it says:

If you are piping the output of find into another program and there is the faintest possibility that the files which you are searching for might contain a newline, then you should seriously consider using the -print0 option instead of -print.

This suggests to me that -print0 was introduced by GNU to specifically handle file paths with newline characters.

What alternative is available in POSIX for GNU's -print0 option, using either just POSIX find or piping to a second POSIX command?

Shane Bishop
  • 1
    Generating that output POSIXly is not a problem. It's rather making use of it that would be a problem, given that those NULs make the output non-text, so it can't be processed by text utilities POSIXly. – Stéphane Chazelas Feb 05 '21 at 20:49
  • 2
    If you need to find a POSIX alternative to `-print0`, then I assume you will also need to find POSIX alternatives to handling that output? Why not just use `-exec` to process the pathnames directly? – Kusalananda Feb 05 '21 at 20:54
  • 1
    Does this answer your question? [How do I use find when the filename contains spaces?](https://unix.stackexchange.com/questions/81349/how-do-i-use-find-when-the-filename-contains-spaces) – Thomas Dickey Feb 05 '21 at 21:10
  • @ThomasDickey Perhaps that does answer my question. I was mostly looking to see if POSIX offered any way to do the same thing `-print0` does, but if `-print0` was designed specifically for the purpose of piping the output to `xargs -0` (which is also non-POSIX), then I guess there's no reason to try to find an alternative to `-print0` in POSIX. – Shane Bishop Feb 05 '21 at 22:40
  • 1
    To add to my last comment, from reading further, it seems like GNU might have introduced `-print0` to handle newline characters in paths (see my quote in my question). This (to me at least) makes it seem less likely that my question is a duplicate of [How do I use find when the filename contains spaces?](https://unix.stackexchange.com/questions/81349/how-do-i-use-find-when-the-filename-contains-spaces). Even if the answer to that question answers my question, the two questions IMO are different. – Shane Bishop Feb 06 '21 at 15:55
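As a minimal sketch of the `-exec` suggestion in the comments above: each pathname reaches the inner shell as a separate argument, so no delimiter is needed at all and newlines in names are harmless. The function name `count_regular_files` and the `[ -f ... ]` check are only illustrative assumptions, standing in for whatever processing is actually wanted.

```shell
# Hypothetical helper: visit every regular file under "$1" without ever
# serializing the pathnames into a delimited stream.
count_regular_files() {
    find "$1" -type f -exec sh -c '
        # "for path do" iterates over "$@", one exact pathname at a time.
        for path do
            if [ -f "$path" ]; then
                echo ok
            fi
        done' sh {} + | wc -l | tr -d ' '
}
```

Replacing the `echo ok` line with the real per-file command is usually all that is needed, which is why emulating `-print0` is often unnecessary.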

1 Answer


find ... -exec sh -c 'printf "%s\0" "$@"' - {} +

Simply `find ... -exec printf '%s\0' {} +` may work too, though that will use the standalone `printf` executable instead of the shell's builtin. I'm not sure whether that has other implications.
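A rough way to check that this really emits one NUL per path, including for a name containing a newline, is sketched below. The helper names `print0` and `count_nul` and the scratch directory are assumptions made up for the demo; `tr` and `wc` are used for counting because they accept non-text input.

```shell
# Hypothetical helper: NUL-terminated paths using only find and sh.
print0() {
    find "$1" -type f -exec sh -c 'printf "%s\0" "$@"' sh {} +
}

# Hypothetical helper: count the NUL bytes arriving on stdin.
count_nul() {
    tr -cd '\000' | wc -c | tr -d ' '
}

# Demo in a scratch directory (the name is arbitrary).
demo_dir="${TMPDIR:-/tmp}/print0-demo.$$"
mkdir "$demo_dir"
touch "$demo_dir/plain" "$demo_dir/two
lines"
print0 "$demo_dir" | count_nul   # prints 2: one NUL per file
rm -rf "$demo_dir"
```

Counting newlines instead would report three "lines" for these two files, which is the ambiguity the NUL terminator avoids.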

  • 2
    Given that the GNU "extension" `-print0` is usually used together with the non-portable and non-POSIX `xargs -0`, there is no need to emulate `-print0`. But your idea is correct, `-exec +` is the solution to avoid the GNU xargs feature. – schily Feb 05 '21 at 19:31
  • Why is there a `-` before the `{}`? – Shane Bishop Feb 05 '21 at 19:35
  • @ShaneBishop it's for the `$0` variable -- you can set it to anything you want. –  Feb 05 '21 at 19:37
  • @user414777 The `$0` variable to which command? To `find`? To `sh`? To `printf`? – Shane Bishop Feb 05 '21 at 19:38
  • @user414777 In general, usually code only answers aren't helpful to readers who are trying to learn. In my case, I already understand `sh -c`, `-exec ... {} +`, `$@`, but others may not. It would be helpful for them if you provide some further explanation in your answer. – Shane Bishop Feb 05 '21 at 19:40
  • @ShaneBishop the `$0` variable inside the shell run with `sh -c 'commands ...'`. Check with `sh -c 'printf "%s\n" "$@"' - 1 2 3` vs `sh -c 'printf "%s\n" "$@"' 1 2 3`. With all due respect, you don't seem to already understand `sh -c` ;-). I'm sorry if you don't find my answer helpful, but there's no harm in it either. –  Feb 05 '21 at 19:43
  • 1
    Oh I see now - the `-` will prevent the first item going to `sh` from being treated (in a C sense) as `argv[0]`, which means all of the output of `find` will go to `$@` instead of the first one being lost. – Shane Bishop Feb 05 '21 at 19:51
  • 1
    Using things like `-` or `_` is bad practice though, as what goes in there is also used by most shells when reporting error messages, for instance. It's better to use `sh`, so you get an error message like `sh: line 0: printf: write error: Bad file descriptor` instead of `-: line 0: printf: write error: Bad file descriptor` – Stéphane Chazelas Feb 05 '21 at 20:46
  • I can't think of what benefit using that extra `sh` would bring. Note that some `sh` implementations (ksh88-based ones and some pdksh-based ones) still don't have a `printf` builtin. – Stéphane Chazelas Feb 05 '21 at 20:51
  • If your `sh` is `yash`, that would choke on file paths not made of valid text in the locale for instance. – Stéphane Chazelas Feb 05 '21 at 20:54
  • @StéphaneChazelas I disagree that that is a "bad practice". For me it's very obvious, and it kind of mimics the errors of `perl -e`. –  Feb 05 '21 at 20:54
  • If you get a `Died at - line 1.` in `perl`, that tells you the `-` (stdin) script died (as in `perl <<< 'open "" or die'`). Not obvious, but makes sense. A `-: line 0: printf: write error: Bad file descriptor` error here at best misleads you into thinking a script fed on stdin failed. – Stéphane Chazelas Feb 05 '21 at 20:57
  • @StéphaneChazelas I was thinking of `syntax error at -e line 1`. Using `sh -c '...' -c args ...` would better mimic that, though that would be even *more* perplexing: "Why the two `-c`, are you trying to, etc". But feel free to change the answer as you see fit, I don't care about promoting my style here -- I'll make it community wiki. –  Feb 05 '21 at 21:03
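The `$0` point discussed in the comments above can be checked with a short sketch. The helper name `count_args` and the label `my-label` are arbitrary assumptions for illustration; the first operand after the `sh -c` script becomes `$0`, and only the remaining operands land in `"$@"`.

```shell
# Hypothetical helper: report how many arguments end up in "$@" when the
# given operands follow an `sh -c` script.
count_args() {
    sh -c 'printf "%s\n" "$#"' "$@"
}

count_args sh 1 2 3   # prints 3: "sh" fills $0, so 1 2 3 all reach "$@"
count_args 1 2 3      # prints 2: "1" is consumed as $0 and is lost

# $0 is only a label; the inner shell can read it back.
sh -c 'printf "%s\n" "$0"' my-label a b   # prints my-label
```

Since that label is what many shells prepend to their error messages, a name like `sh` tends to read more clearly there than `-` or `_`.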