25

From C, what's the easiest way to run a standard utility (e.g., ps) and no other?

Does POSIX guarantee that, for example, a standard ps is in /bin/ps or should I reset the PATH environment variable to what I get with confstr(_CS_PATH, pathbuf, n); and then run the utility through PATH-search?

Jeff Schaller
Petr Skocik
  • 1
    I have in the back of my head that POSIX says, for a number of commands, among them [ed(1)](http://www.mirbsd.org/man1/ed) (which is important for [mksh](http://www.mirbsd.org/mksh.htm)), that, _if_ they are available, they also _must_ be reachable under `/bin`, i.e. `/bin/ed` must be usable if ed is installed. I can’t find it right now, but I know LSB depends on it, and I’ve successfully defended bugreports using that as rationale, so it must at least have been true at some point. (Or it was something other than POSuX and I misremember, but the rest is true.) – mirabilos Sep 04 '19 at 21:28

2 Answers

35

No, it doesn't, mainly because it doesn't require systems to conform by default, or to comply with the POSIX standard alone (to the exclusion of any other standard).

For instance, Solaris (a certified compliant system) chose backward compatibility for its utilities in /bin, which explains why those behave in arcane ways, and provides POSIX-compliant utilities in separate locations (/usr/xpg4/bin, /usr/xpg6/bin... for different versions of the XPG standard, now merged into POSIX; those are actually part of optional components in Solaris).

Even sh is not guaranteed to be in /bin. On Solaris, /bin/sh used to be the Bourne shell (so not POSIX compliant) until Solaris 10, while it's now ksh93 in Solaris 11 (still not fully POSIX compliant, but in practice more so than /usr/xpg4/bin/sh).

From C, you could use exec*p() and assume you're in a POSIX environment (in particular regarding the PATH environment variable).

You could also set the PATH environment variable:

#define _POSIX_C_SOURCE 200809L /* before any #include */
...
confstr(_CS_PATH, buf, sizeof(buf)); /* maybe append the original
                                      * PATH if need be */
setenv("PATH", buf, 1);
exec*p("ps"...);
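A fuller sketch of the same idea, as one possible way to fill in the elided parts. The helper names use_standard_path and run_standard_utility are made up for illustration and come from no library. It assumes the original PATH should be discarded rather than appended, and it resets PATH only in the child so that the parent's environment is left alone.

```c
#define _POSIX_C_SOURCE 200809L /* must precede every #include */
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: replace PATH with the value of confstr(_CS_PATH).
 * Returns 0 on success, -1 on failure. */
int use_standard_path(void)
{
    size_t n = confstr(_CS_PATH, NULL, 0); /* size needed, including NUL */
    if (n == 0)
        return -1;

    char *buf = malloc(n);
    if (buf == NULL)
        return -1;

    int rc = -1;
    if (confstr(_CS_PATH, buf, n) != 0)
        rc = setenv("PATH", buf, 1);
    free(buf); /* setenv() copies the string */
    return rc;
}

/* Hypothetical helper: run argv[0] through a PATH search in a child
 * process whose PATH has been reset, and wait for it to finish.
 * Returns the child's exit status (127 if PATH setup or exec failed),
 * or -1 if the child could not be created or did not exit normally. */
int run_standard_utility(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (use_standard_path() == 0)
            execvp(argv[0], argv);
        _exit(127); /* only reached on failure */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_standard_utility with an argument vector such as {"ps", "-A", NULL} would then run whichever ps is found first in the directories reported by _CS_PATH.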

Or you could determine at build time the path of the POSIX utilities you want to run (bearing in mind that on some systems like GNU ones, you need more steps like setting a POSIXLY_CORRECT variable to ensure compliance).

You could also try things like:

execlp("sh", "sh", "-c", "PATH=`getconf PATH`${PATH+:$PATH};export PATH;"
                         "unset IFS;shift \"$1\";"
                         "exec ${1+\"$@\"}", "2", "1", "ps", "-A"...);

This is in the hope that there's a sh in $PATH, that it is Bourne-like, that there's also a getconf, and that it's the one for the version of POSIX you're interested in.

Stéphane Chazelas
  • So what do you do for #!? – Joshua Sep 03 '19 at 20:58
  • 15
    @Joshua: You pray that `/usr/bin/env` exists and is mostly POSIX-compliant. – Kevin Sep 03 '19 at 21:04
  • *`#define _POSIX_C_SOURCE=200809L /* before any #include */`* Only if you keep it that simple on Solaris. Solaris quite strictly follows the POSIX standards in its use of various `#define` values, meaning if you set any of `_XOPEN_SOURCE`, `_XOPEN_VERSION`, or `_XOPEN_SOURCE_EXTENDED`, you can wind up with an earlier version of the POSIX API exposed no matter what you define `_POSIX_C_SOURCE` to. The best explanation is in [the Illumos `sys/feature_tests.h` file](https://github.com/illumos/illumos-gate/blob/4e0c5eff9af325c80994e9527b7cb8b3a1ffd1d4/usr/src/uts/common/sys/feature_tests.h#L264) – Andrew Henle Sep 03 '19 at 22:30
  • (cont) and the follow-on code itself. A cursory examination of a Solaris 11.4 `sys/feature_tests.h` shows what looks like identical code. For example, see how Mono mucks it up: https://unix.stackexchange.com/questions/507655/how-to-install-net-mono-on-solaris-11-source-code-compile – Andrew Henle Sep 03 '19 at 22:32
  • 3
    @Kevin or you familiarise yourself with the quirks of your palaeo-unix and adjust the #! line to use the correct path. – cas Sep 04 '19 at 01:43
  • @Joshua, shebangs are not POSIX, but if you execute a shebang-less script with a POSIX `exec*p()` or a POSIX `env`/`awk`/`vi`/`system()`... in a POSIX environment, it will be interpreted by a POSIX shell (in theory). Or you could add Bourne-compatible code at the start of your `#!/bin/sh -` script that reexecutes itself with a POSIX sh if it detects it's running under a Bourne shell, or you could write your script in the ancient Bourne syntax... – Stéphane Chazelas Sep 04 '19 at 08:59
  • 5
    @Kevin: No. `/usr/bin/env` is an **even less portable** (in practice) hack than `/bin/sh`. Per POSIX, the portable way to write a shell script is **with no `#!` at all**. If a file is executable but `ENOEXEC` (not a valid binary), `execvp` is to execute it via the standard shell. :-) Of course in practice this is a bad idea and you should just use `#!/bin/sh`. – R.. GitHub STOP HELPING ICE Sep 04 '19 at 13:52
  • @StéphaneChazelas You're not generally wrong, however, per the standard with an `extern char **environ` set properly (with a compliant `_CS_PATH`) passed to `execve` and friends, one needn't "hope that there's a sh in $PATH, that it is Bourne-like, that there's also a getconf and that it's the one for the version of POSIX you're interested in"; it is mandated that if you call `sh` in such circumstances, that it will call a POSIX compliant `sh`, and that therein `getconf` is compliant as well. Otherwise, it doesn't adhere to the standard. See answer below. – Geoff Nixon Sep 04 '19 at 14:32
  • 2
    @GeoffNixon, that part you're referring to is an alternative for when you don't, can't or don't want to use _POSIX_C_SOURCE. It does the setting of `$PATH` from the shell instead of from C. – Stéphane Chazelas Sep 04 '19 at 15:04
  • @StéphaneChazelas Yep. That's generally what I do these days. – Geoff Nixon Sep 04 '19 at 15:20
  • Note that `#!` was never meant to provide portability; it's a hack to allow an executable to be written in something other than raw machine code. It's the job of the *installer* to set it correctly for the host on which the executable is installed. – chepner Sep 05 '19 at 12:59
4

Actually, I would largely answer yes. POSIX does guarantee:

  1. That there is an absolute path to a standards-compliant version of each specified utility,
  2. And, that you must be able to find this absolute path, and be able to execute this utility.

Although there is no guarantee that each utility is in the same directory on every system (/bin/ps, say), it is always guaranteed to be findable, as an executable file, in the system default PATH.

Indeed, the only way the standard specifies to do this is via unistd.h's _CS_PATH in C, or via a combination of the command and getconf utilities in the shell. That is, PATH="$(command -p getconf PATH)" command -v ps must always return the unique absolute path of the POSIX-compliant ps supplied on a particular system. While it is implementation-defined which paths are included in the system default PATH variable, these utilities must always be available, unique, and compliant in one of the paths specified therein.
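A rough C counterpart to that shell lookup might look like the sketch below. The function name find_standard_utility is hypothetical, and the sketch assumes the utility exists as a regular executable file in one of the _CS_PATH directories; a utility that is only available as a shell built-in would not be found this way.

```c
#define _POSIX_C_SOURCE 200809L /* must precede every #include */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: look for `name` in each directory listed by
 * confstr(_CS_PATH). On success, stores the absolute pathname in `out`
 * and returns 0. Returns -1 if no executable regular file was found or
 * if `out` is too small; in that case the contents of `out` are
 * unspecified. */
int find_standard_utility(const char *name, char *out, size_t outsz)
{
    size_t n = confstr(_CS_PATH, NULL, 0); /* size needed, including NUL */
    if (n == 0)
        return -1;

    char *path = malloc(n);
    if (path == NULL)
        return -1;

    int rc = -1;
    if (confstr(_CS_PATH, path, n) != 0) {
        char *save = NULL;
        for (char *dir = strtok_r(path, ":", &save); dir != NULL;
             dir = strtok_r(NULL, ":", &save)) {
            if (dir[0] != '/')
                continue; /* only absolute directories are of interest */

            int len = snprintf(out, outsz, "%s/%s", dir, name);
            if (len < 0 || (size_t)len >= outsz)
                continue; /* candidate does not fit in the buffer */

            struct stat st;
            if (stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
                access(out, X_OK) == 0) {
                rc = 0;
                break;
            }
        }
    }
    free(path);
    return rc;
}
```

The resulting pathname can be passed straight to execv(), which avoids touching the PATH environment variable at all.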

See: <unistd.h>, command.

Geoff Nixon
  • 1
    But for sh, there's a chicken and egg problem. That `PATH=$(command -p getconf PATH)` will only work from a POSIX shell in a POSIX environment. POSIX doesn't specify how you get into that environment, just that it be documented. For instance, on Solaris, you have a `/usr/xpg4/bin/getconf` and a `/usr/xpg6/bin/getconf` which would return different values for `_CS_PATH` for the two different versions of the standard and neither `/usr/xpg4/bin` nor `/usr/xpg6/bin` are in the default value of `$PATH`. There is a `/usr/bin/getconf` which IIRC gives you XPG4 conformance. – Stéphane Chazelas Sep 04 '19 at 14:43
  • Is that true even for Solaris 11+ (UNIX 03+ certified) versions? I've always read ``` Applications... should be determined by interrogation of the PATH returned by getconf PATH, _ensuring_ that the returned pathname is an absolute pathname and not a shell built-in. For example, to determine the location of the standard sh utility: command -v sh On some implementations this might return: /usr/xpg4/bin/sh ``` to mean this must be an entry to a POSIX compliant `sh` from any default shell. – Geoff Nixon Sep 04 '19 at 15:05
  • 1
    There's nothing in POSIX that says that there should be a `getconf` command in the default `$PATH` of a given system. For instance, getting a POSIX environment may involve starting an emulation layer, without which you wouldn't run any Unix-like command at all (think Windows for instance). Once you are in a compliant environment, `getconf PATH` will get you a `$PATH` to get to compliant utilities, but if you were in a POSIX environment, that was probably already the case. Note that `getconf ps` may return `ps`. Having `ps` builtin is allowed. – Stéphane Chazelas Sep 04 '19 at 15:24
  • @StéphaneChazelas I really don't follow your argument; I mean its logic is sound, but tautologically meaningless. No, POSIX does not guarantee the existence of `getconf` in a default `$PATH` of ANY given system. And forget Windows... Mac OS is Certified UNIX, and it requires opening a Terminal [Emulator] for these. But that's not a limitation on POSIX. POSIX doesn't guarantee the existence of anything in any other "given system" but within POSIX... life, spacetime, God, MSDOS. It specifies behavior of systems that conform to POSIX. Yes, only once in a compliant environment, but that's no limitation. – Geoff Nixon Oct 06 '20 at 13:52