Is there a command that combines `tee` and `grep` such that in a pipeline the tee part can direct matches to a file?

Question

I'm looking to pull multiple pieces from a file, or at least watch for pieces in a log file if they should occur, but I don't want to set up multiple tail | grep sessions. Instead I'd like to just tail the output of each grep.

I suppose I could do this with awk, but perhaps there is something closer to this idea already.

For example the command might look something like this:

tail -f /var/log/syslog | teegrep cron1 -f CRON | teegrep cloud-init2 -f CLOUD-INIT

The output would be the files CRON and CLOUD-INIT, while every line would still reach stdout at the end of the pipeline. Only the lines matching 'cron1' and 'cloud-init2' would wind up in their respective files.

Maybe it's already part of a command and I have just not known about it. Or possibly just some bash/zsh trickery that would do the same thing. Either way: Is there a command that combines tee and grep such that in a pipeline the tee part can direct matches to a file?

Possibly `tail -f /var/log/syslog | pee 'grep cron1 > CRON' 'grep cloud-init2 > CLOUD-INIT'` ? See ['tee' for commands](https://unix.stackexchange.com/a/371730/65304) — steeldriver, Apr 13 '20 at 16:35
Perhaps you are asking for something similar to what I asked some months ago [here](https://unix.stackexchange.com/questions/522914/can-i-duplicate-output-of-a-pipe) — schweik, Apr 13 '20 at 17:23
"I don't want to setup multiple `tail | grep`" That needs some rationale. A `teegrep` as you describe (which could be implemented with process substitutions as in @Kusalananda's answer or with an awk script) would not use fewer resources. If anything, it will be slower, because the `tail | grep`s will run separately, without having to wait for each other as the "teegrep"s in a single pipeline do. — , Apr 13 '20 at 19:20

Kusalananda · Answer 1 · 2020-04-13T20:23:52.440

Assuming a shell with process substitutions, >(...):

tail -f /var/log/syslog | 
tee >(grep cron1 >CRON) | 
tee >(grep cloud-init2 >CLOUD-INIT)

This would cause tail to produce data for the first tee in the pipeline. The tee would duplicate the data and send one copy across to the next tee and the other into a process substitution. That process substitution would run grep on the data arriving from tee and the result would go to the file CRON.

The last stage of the pipeline would work in a similar way.
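
The hypothetical `teegrep PATTERN -f FILE` from the question can be sketched as a small bash function built on the same `tee >(...)` construct. This is only a sketch: the function name and argument layout are taken from the question's example, not from any existing tool, and `--line-buffered` is a GNU grep option (BSD grep accepts it too), added on the assumption that matches should reach the file promptly while following a log.

```bash
# teegrep PATTERN -f FILE   (hypothetical interface, copied from the question)
# Passes stdin through to stdout unchanged and writes matching lines to FILE.
# Requires a shell with process substitution, such as bash or zsh.
teegrep() {
    if [ "$#" -ne 3 ] || [ "$2" != "-f" ]; then
        echo "usage: teegrep PATTERN -f FILE" >&2
        return 2
    fi
    local pattern=$1 file=$3
    # --line-buffered is an assumption about the installed grep (GNU or BSD);
    # without it, grep may hold matches in a buffer before writing them to FILE.
    tee >(grep --line-buffered -e "$pattern" >"$file")
}
```

With that defined, the pipeline from the question should run as written: `tail -f /var/log/syslog | teegrep cron1 -f CRON | teegrep cloud-init2 -f CLOUD-INIT`. Note that the grep processes run asynchronously, so on finite input the files may be completed slightly after the pipeline itself returns.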

In a shell where process substitutions are not available, e.g. /bin/sh, awk or sed could be used.

With awk:

tail -f /var/log/syslog | 
awk '{ print }
     /cron1/       { print >"CRON" }
     /cloud-init2/ { print >"CLOUD-INIT" }'
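
When following a log, awk typically buffers what it writes to files and pipes, so matches may show up in CRON and CLOUD-INIT only after a delay. A minimal sketch that flushes after every line is given here, wrapped in a function whose name `split_log` is made up for illustration. It assumes an awk that supports `fflush()` (gawk, mawk and BSD awk do).

```bash
# split_log: hypothetical helper name, not part of the original answer.
# Reads log lines on stdin, prints every line to stdout, and copies
# matching lines to the files CRON and CLOUD-INIT.
split_log() {
    awk '{ print }
         /cron1/       { print >"CRON";       fflush("CRON") }
         /cloud-init2/ { print >"CLOUD-INIT"; fflush("CLOUD-INIT") }
         { fflush() }   # also flush stdout so later pipeline stages see the line
    '
}
```

It would be used as `tail -f /var/log/syslog | split_log`. Each output file is created only once its first match arrives.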

With sed (note that we have to make sure that the output files don't exist, since the w command in sed always appends data to a file):

rm -f CRON CLOUD-INIT

tail -f /var/log/syslog | 
sed -e '/cron1/w CRON' \
    -e '/cloud-init2/w CLOUD-INIT'
