Bash Reuse Process Substitution File

Question

I have a big script which takes a file as input and does various stuff with it. Here is a test version:

echo "cat: $1"
cat $1
echo "grep: $1"
grep hello $1
echo "sed: $1"
sed 's/hello/world/g' $1

I want my script to work with process substitution, but only the first command (cat) works, while the rest don't. I think this is because it is a pipe.

$ myscript.sh <(echo hello)

should print:

cat: /dev/fd/63
hello
grep: /dev/fd/63
hello
sed: /dev/fd/63
world

Is this possible?

Why don't you redirect `$1` to a temp file? Run `cat $1 >/tmp/tempfile` and use the temp file for the rest of the work. – Prince John Wesley, Aug 10 '11 at 09:11

score 10 · Accepted Answer · edited Apr 13 '17 at 12:36

The <(…) construct creates a pipe. The pipe is passed via a file name like /dev/fd/63, but this is a special kind of file: opening it really means duplicating file descriptor 63. (See the end of this answer for more explanations.)

Reading from a pipe is a destructive operation: once you've caught a byte, you can't throw it back. So your script needs to save the output from the pipe. You can use a temporary file (preferable if the input is large) or a variable (preferable if the input is small). With a temporary file:

tmp=$(mktemp)
cat <"$1" >"$tmp"
cat <"$tmp"
grep hello <"$tmp"
sed 's/hello/world/g' <"$tmp"
rm -f "$tmp"

(You can combine the two calls to cat as tee <"$1" -- "$tmp".) With a variable:

tmp=$(cat <"$1")
printf "%s\n" "$tmp"
printf "%s\n" "$tmp" | grep hello
printf "%s\n" "$tmp" | sed 's/hello/world/g'

Note that command substitution $(…) strips all trailing newlines from the command's output. To avoid that, add an extra character and strip it afterwards.

tmp=$(cat <"$1"; echo a); tmp=${tmp%a}
# "$tmp" now keeps its trailing newlines, so printf does not add one
printf "%s" "$tmp"
printf "%s" "$tmp" | grep hello
printf "%s" "$tmp" | sed 's/hello/world/g'
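
The effect of trailing-newline stripping can be checked by comparing string lengths. This is a minimal sketch, assuming a sample input of hello followed by three newlines; the helper `sample` and the variables `plain` and `kept` are illustrative names, not part of the original script.

```bash
# Sample input: "hello" followed by three newlines (8 characters in total).
sample() { printf 'hello\n\n\n'; }

# Plain command substitution drops every trailing newline.
plain=$(sample)

# Sentinel trick: append "a" so the newlines are no longer trailing,
# then strip the "a" again with a suffix-removal expansion.
kept=$(sample; echo a)
kept=${kept%a}

echo "plain: ${#plain} characters"   # prints "plain: 5 characters"
echo "kept: ${#kept} characters"     # prints "kept: 8 characters"
```

The final newline written by `echo a` is itself stripped by the command substitution, which is why only the single character a has to be removed afterwards.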

By the way, don't forget the double quotes around variable substitutions.
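
The temporary-file approach can be wrapped so that the file is removed even if one of the commands fails. This is a sketch under assumptions rather than a definitive version: `process_input` is a hypothetical function name, and running its body in a subshell is a choice made here so that the `EXIT` trap stays local to the function.

```bash
# process_input is a hypothetical wrapper around the three commands
# from the question. The body is a subshell, so the EXIT trap fires
# as soon as the function finishes, whether it succeeded or not.
process_input() (
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp"' EXIT

    # Read the (possibly read-once) input exactly one time.
    cat <"$1" >"$tmp"

    # Every later command reads the saved copy instead of "$1".
    echo "cat: $1"
    cat <"$tmp"
    echo "grep: $1"
    grep hello <"$tmp"
    echo "sed: $1"
    sed 's/hello/world/g' <"$tmp"
)
```

In bash, calling `process_input <(echo hello)` should then produce all three sections; the path printed after each label depends on the system.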

answered Aug 10 '11 at 22:41

Gilles 'SO- stop being evil'

storing unbounded data in a variable? – Stéphane Gimenez Aug 10 '11 at 22:51
@StéphaneGimenez I don't understand your comment. – Gilles 'SO- stop being evil' Aug 10 '11 at 23:47
sorry, I read the 2nd/3rd solutions but missed the condition "preferable if the input is small" which was actually written, but too far above. – Stéphane Gimenez Aug 11 '11 at 00:27
Thanks @Gilles. Any reason why you use <"$tmp" in your commands instead of just "$tmp" e.g. `grep hello "$tmp"`? Can I `cp "$1" "$tmp"` to create the tmp file instead of `cat`? – dogbane Aug 11 '11 at 07:53
@dogbane Mostly it's a matter of style. `<"$tmp"` makes it visually obvious that you're reading from the file; it's less clear with `cat "$tmp"` (which reads, whereas `tee "$tmp"` writes, and `cp "$a" "$b"` reads from `$a` and writes to `$b`). For `grep` there's a difference: `grep hello "$tmp"` shows the name of the temporary file (which is useless). – Gilles 'SO- stop being evil' Aug 11 '11 at 08:50
@Gilles `grep hello "$tmp"` would not show the name of the temp file. It would print out matching lines. – dogbane Aug 11 '11 at 09:56

score 4 · Answer 2 · edited Apr 13 '17 at 12:36

When you use a file, you can read its data many times. When you use a pipe (which is what process substitution actually creates), you can only read it once. So the grep and sed commands receive empty input.
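
The difference can be observed without process substitution. This is a minimal sketch, assuming a Linux-like system where /dev/stdin names the current standard input and an ordinary pipeline behaves like the pipe behind <(…); `read_twice` is a hypothetical helper name.

```bash
# read_twice is a hypothetical helper: it opens the same path two times
# and reports what each read returned.
read_twice() {
    first=$(cat "$1")
    second=$(cat "$1")
    printf 'first read: [%s]\n' "$first"
    printf 'second read: [%s]\n' "$second"
}

# A pipe: the first cat drains it, so the second cat sees end-of-file.
echo hello | read_twice /dev/stdin

# A regular file: every open starts again at the beginning.
tmp=$(mktemp)
echo hello >"$tmp"
read_twice "$tmp"
rm -f "$tmp"
```

With the pipe, the second read is expected to come back empty; with the regular file, both reads should return hello.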

(How to understand pipes might be good reading.)

To do what you want with process substitution, you could write something like:

cat "$1" | tee >(echo "cat: $1"; cat) | tee >(echo "grep: $1"; grep hello) | (echo "sed: $1"; sed 's/hello/world/g')

But in this case, the second cat, grep and sed would run in parallel, so their output may be interleaved. This might be more useful:

cat "$1" | tee >(cat > cat.txt) | tee >(grep hello > grep.txt) | sed 's/hello/world/g' > sed.txt

Stéphane Gimenez · Answer 3 · 2011-08-10T09:50:26.163

2

The usual way to do this is to make the $1 parameter optional. Then one can define FILE=${1-/dev/stdin} and use FILE several times. However, successive reads from a pipe continue where the previous one stopped; the data is not duplicated.
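
The default-to-stdin idiom and the sequential behaviour can be seen together in a small sketch. It assumes a Linux-like system where /dev/stdin can be opened by name; `first_two_lines` is a hypothetical function name.

```bash
# first_two_lines is a hypothetical function: it opens the input by name
# two times and reads one line on each occasion.
first_two_lines() {
    file=${1-/dev/stdin}          # use $1 if given, otherwise standard input
    IFS= read -r first <"$file"
    IFS= read -r second <"$file"
    printf 'first: %s\nsecond: %s\n' "$first" "$second"
}

# No argument: the function falls back to /dev/stdin, which is a pipe here.
printf 'a\nb\n' | first_two_lines
```

On a pipe, the second read is expected to pick up the line after the one already consumed, whereas on a regular file passed as `$1` both reads would return the first line.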

The easiest solution to this issue would be to use some temporary file.

if [ -z "$1" ] ; then FILE=$(mktemp); cat >"$FILE"; else FILE=$1; fi

If you wish to explicitly pass some filename (possibly /dev/fd/x), the same temporary file trick can be used:

FILE=$(mktemp); cat "$1" >"$FILE"

You could also make complex use of tee to duplicate input from the stdin file descriptor to several other file descriptors, but this last method would be quite heavy.

edited Aug 10 '11 at 09:50

answered Aug 10 '11 at 09:27

Stéphane Gimenez

$1 isn't empty. It is `<(echo hello)` which evaluates to a file in `/dev/fd`. – dogbane Aug 10 '11 at 09:37
Yes, I overlooked. This construct is not something like `<< – Stéphane Gimenez Aug 10 '11 at 09:54

score 0 · Answer 4 · answered Aug 10 '11 at 09:17

A file obtained by a process substitution is not seekable, depending on the underlying implementation, so you cannot read it more than once.

enzotib
