I am trying to split the stdout of a command into two "branches" using tee so that each branch can be processed separately. At the end I need to merge the results of both branches using paste. I came up with the following code for the producer:
mkfifo a.fifo b.fifo
python -c 'print(("0\t"+"1"*100+"\n")*10000)' > sample.txt
cat sample.txt | tee >(cut -f 1 > a.fifo) >(cut -f 2 > b.fifo) | awk '{printf "\r%lu", NR}'
# the counter reaches ~200 lines instantly
# and advances by ~200 more once I read from the FIFOs
Then, in a separate terminal, I start the consumer:
paste a.fifo b.fifo | awk '{printf "\r%lu", NR}'
# outputs ~200 once the producer is stopped with Ctrl-C
The problem is that the pipeline hangs. This behaviour seems to depend on the input length:
- If the input lines are shorter (e.g. if the second column contains 30 characters instead of 100), it works fine.
- If a.fifo and b.fifo are fed with the same input (or input of similar length), it also seems to work fine.
The problem seemingly arises when I feed short chunks into, say, a.fifo and long ones into b.fifo. The behaviour does not depend on the order in which I list the FIFOs in paste.
I am not very familiar with Linux and its piping logic, but it seems that the pipeline somehow deadlocks. Can this be implemented reliably, and if so, how? Or are there other ways to do it without tee and paste?
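For reference, here is a self-contained variant that might be worth testing. It is only a sketch, and it rests on an assumption that nothing above establishes: that the stall comes from cut block-buffering its output when writing to a FIFO. Under that assumption, each cut is wrapped in stdbuf -oL (from GNU coreutils) so that it flushes after every line. The temporary directory, the merged.txt file and the yes/head sample generator are stand-ins chosen for this sketch. The sample has the same shape as above, except that the second column holds 100 zeros rather than ones.

```shell
#!/usr/bin/env bash
# Sketch only: the same tee/paste pipeline, with line-buffered cut output.
# Assumes bash (process substitution) and GNU coreutils (stdbuf).
set -e

workdir=$(mktemp -d)
cd "$workdir"
mkfifo a.fifo b.fifo

# 10000 lines of "0<TAB><100 zeros>" (stand-in for the python one-liner).
yes "$(printf '0\t%0100d' 0)" | head -n 10000 > sample.txt

# Consumer: runs in the background here instead of in a second terminal.
paste a.fifo b.fifo > merged.txt &
consumer=$!

# Producer: stdbuf -oL makes each cut flush its output after every line.
tee >(stdbuf -oL cut -f 1 > a.fifo) \
    >(stdbuf -oL cut -f 2 > b.fifo) < sample.txt > /dev/null

# paste exits once both FIFOs reach end-of-file.
wait "$consumer"

merged_lines=$(wc -l < merged.txt)
echo "merged lines: $merged_lines"
```

If this runs to completion and merged.txt matches sample.txt, that would point towards output buffering as the cause. If it still hangs, the assumption is wrong and the cause lies elsewhere.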