
Why is pgrep needed? If we can use ps and grep together, why do we need pgrep? It would be odd to have commands like lsgrep or curlgrep.

But one difference I noticed was, if we first start a tmux session with

tmux new -s foo

then

ps aux | grep tmux

won't be able to find the tmux server process, but

pgrep -l tmux

can. But still, why don't we have a flag with ps so that we can grep like pgrep does to be able to see the tmux server process? What are the differences between ps with grep and pgrep?

Kusalananda
nonopolarity
  • I've already told you [why](https://unix.stackexchange.com/questions/578736/what-is-the-name-of-the-tmux-server-daemon#comment1077056_578736) `ps aux | grep` won't find your process. You have to adjust `ps`'s output format with `-o ....comm,...` if you want it to show the process name. –  Apr 08 '20 at 16:56
  • And no, you can't use `ps | grep` together because the output format of ps is NOT standardized. In addition to the fact that the grep may find _itself_. It's the same thing as with `ls | grep`: you don't use `ls | grep`, you use `find` (call it `lsgrep` if you want). –  Apr 08 '20 at 16:59
  • how can you make `ps -o` show the tmux server? How about `curlgrep`? In a way `ls` is more like showing directory content and `find` is to find something deep down (and report the exact path)... – nonopolarity Apr 08 '20 at 17:13
  • `ps -eo pid,comm | grep 'tmux.*server'` –  Apr 08 '20 at 17:14
  • thank you! I found `ps -e | grep tmux` works well too – nonopolarity Apr 08 '20 at 17:16

1 Answer


The ps command has two fields that one generally searches in this way: args and comm. The first is the program argument string, NUL-delimited. The second is a "name" for the program. These are stored separately, and (on various operating systems) both can be altered by the program itself at runtime. Programs such as tmux do exactly that.

The output of ps is not machine parseable. Several fields can contain unencoded whitespace which makes it impossible to determine field boundaries reliably, because arbitrary length whitespace is also the field separator. args and comm are indeed two such fields. The output of ps is only human readable.

When you grep the output of ps you therefore are pattern matching entire lines, with no reliable way to anchor that pattern to the specific field concerned, except by eliminating pretty much everything else that is of any use, and that you might be trying to find by this method in the first place.

For example:

% ps -a -x -e -o sid,comm,args |
  grep dbus-daemon |
  head -n 4
   25 nosh                cyclog dbus-daemon/ (nosh)
   25 dbus-daemon         dbus-daemon --config-file ./system-wide.conf --nofork --address=unix:path=/run/dbus/system_bus_socket
  989 dbus-daemon         dbus-daemon --config-file ./per-user.conf --nofork --address=unix:path=/run/user/JdeBP/bus
15107 grep                grep dbus-daemon
% 
% clearenv --keep-path \
  setenv WIBBLE tmux \
  ps -a -x -e -o sid,comm,command |
  grep tmux
15107 ps                  PATH=/usr/local/bin:/usr/bin:/bin WIBBLE=tmux ps -a -x -e -o sid,pid,comm,command
%

Put another way: grep is for operating upon text files comprising lines. The process table is not a text file, and treating it as if it were a text file (by translating it with the ps command) loses information about fields.
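To make the loss of field information concrete, here is a minimal sketch that runs against a canned listing rather than live ps output. The sample lines are abridged from the output shown earlier, and the function names are hypothetical. Note that the awk comparison on the second column works here only because no field in this sample contains whitespace, which real ps output does not guarantee.

```shell
# Canned "sid comm args" listing standing in for real ps output (abridged).
sample_ps() {
  cat <<'EOF'
   25 nosh                cyclog dbus-daemon/ (nosh)
   25 dbus-daemon         dbus-daemon --config-file ./system-wide.conf --nofork
  989 dbus-daemon         dbus-daemon --config-file ./per-user.conf --nofork
15107 grep                grep dbus-daemon
EOF
}

# Whole-line matching: counts every line that mentions the string anywhere.
count_line_matches() {
  sample_ps | grep -c 'dbus-daemon'
}

# Field matching: counts only lines whose second column is exactly the name.
# This is reliable only for this whitespace-free sample.
count_comm_matches() {
  sample_ps | awk '$2 == "dbus-daemon" { n++ } END { print n + 0 }'
}
```

Comparing `count_line_matches` with `count_comm_matches` shows how many of the grep hits are processes that merely mention the name in their arguments.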

The way to perform such a search is to look at the process table with something other than ps. On Linux, one can look directly at /proc/${PID}/comm and the similar pseudo-files for the argument strings, environment strings, and so forth.
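As a rough, Linux-only sketch of that approach, the following walks the comm pseudo-files and compares each name exactly. The function name `match_comm` is hypothetical, and the sketch assumes a mounted /proc.

```shell
# Print "PID NAME" for every process whose comm equals $1 exactly.
# Linux-only: relies on /proc/<pid>/comm being available.
match_comm() {
  want=$1
  for f in /proc/[0-9]*/comm; do
    # A process may exit between glob expansion and the read; skip it.
    name=$(cat "$f" 2>/dev/null) || continue
    if [ "$name" = "$want" ]; then
      pid=${f#/proc/}      # strip the leading "/proc/"
      pid=${pid%/comm}     # strip the trailing "/comm"
      printf '%s %s\n' "$pid" "$name"
    fi
  done
}
```

Because the comparison is made against the one field alone, text in the arguments or environment of other processes cannot produce a false match.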

Or one can write a tool that fishes out the specific data to be matched from the process table, and that runs pattern matching on just that field alone. This tool is not for text files, but is for process tables. One can call it pgrep.
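A brief sketch of how that looks in practice, assuming a pgrep that supports the common `-x` (exact name match) and `-f` (match against the full argument string) options. The wrapper function names are hypothetical and exist only to label the two kinds of search.

```shell
# Match against the process name only, and require the whole name to match.
find_by_name() {
  pgrep -x "$1"
}

# Match the pattern against the full argument string instead of the name.
find_by_args() {
  pgrep -f "$1"
}
```

For instance, after starting `sleep 300 &`, both `find_by_name sleep` and `find_by_args 'sleep 300'` should list its process ID, and neither reports pgrep itself.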

Of course, on the gripping hand one could write a ps whose output one can process with (say) awk, because it is machine readable, encoding whitespace with vis() and thus providing fields that awk can properly recognize. The downside is that then it is less human-readable and not quite what a conformant ps is supposed to be. I pass its output through console-flat-table-viewer to read it. ☺

% system-control ps -p 740 -o sid,comm,args
SID COMMAND COMMAND
25  dbus-daemon dbus-daemon\040--config-file\040./system-wide.conf\040--nofork\040'--address=unix:path=/run/dbus/system_bus_socket'
% 
% system-control ps -A -o sid,comm,args,envs,tree |
  awk '{ if ("dbus-daemon"==$2) print $3; }'
dbus-daemon\040--config-file\040./system-wide.conf\040--nofork\040'--address=unix:path=/run/dbus/system_bus_socket'
dbus-daemon\040--config-file\040./per-user.conf\040--nofork\040'--address=unix:path=/run/user/JdeBP/bus'
/usr/local/bin/dbus-daemon\040--fork\040--print-pid\0405\040--print-address\0407\040--session
% 
% system-control ps -A -o sid,comm,args,envs,tree |
  awk '{ if ("dbus-daemon"==$2) print $3; }' |
  unvis
dbus-daemon --config-file ./system-wide.conf --nofork '--address=unix:path=/run/dbus/system_bus_socket'
dbus-daemon --config-file ./per-user.conf --nofork '--address=unix:path=/run/user/JdeBP/bus'
/usr/local/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
%
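For readers without that toolset, the idea can be tried on canned text. This is only a sketch: the listing is a hypothetical, abridged copy of the encoded output above, the function names are made up, and `decode` is a stand-in for unvis that handles only octal escapes of the `\040` form via the shell's `printf '%b'`.

```shell
# Canned listing: whitespace inside a field is encoded as \040,
# so blanks only ever separate fields.
encoded_ps() {
  cat <<'EOF'
25 dbus-daemon dbus-daemon\040--config-file\040./system-wide.conf\040--nofork
989 dbus-daemon dbus-daemon\040--config-file\040./per-user.conf\040--nofork
25 nosh cyclog\040dbus-daemon/
EOF
}

# Select rows on field 2 alone and print field 3, still encoded.
args_of() {
  encoded_ps | awk -v want="$1" '$2 == want { print $3 }'
}

# Decode octal escapes line by line (limited stand-in for unvis).
decode() {
  while IFS= read -r line; do
    printf '%b\n' "$line"
  done
}
```

Running `args_of dbus-daemon | decode` prints the argument strings of the two matching rows with their spaces restored, while the nosh row is skipped even though its arguments mention dbus-daemon.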

Further reading

JdeBP
  • There are no `system-control` and `clearenv` commands on any of my systems, what are those? Of course the argv+envp block is simple memory that the process can write to (I can run a program like `int main(int ac, char **av){ memcpy(av[0], "FOO=bar", 8); putenv(av[0]); pause(); }` and have `ps` show its "environment"), but it needs more explanation why this is poignant and how it could happen with regular programs. –  Apr 08 '20 at 22:30
  • It's not that that is poignant. There's an explanation that `tmux` is one such program in the first paragraph. And you need to read the further reading, although I didn't originally put the manual page for `clearenv` there as it is fairly tangential. – JdeBP Apr 09 '20 at 08:14