
I have a bash script containing a group of commands in curly braces `{ ... }`. The group starts with a few `echo` commands and then runs one loop. Each iteration of the loop executes several slow commands (basically `curl` plus some extra parsing) and prints one line (of Python code). Each iteration is slow because of the network interaction, but as far as I can see there should be no buffering issue coming from the commands themselves, since they finish their job and exit.

The whole group of commands is piped to `python -u` (I also tried `tail -f` as a check), and the whole loop is clearly executed before anything is read by `python -u` or `tail -f`.

I know how to unbuffer a single command (when possible) with tools like `stdbuf`, but I don't think that can help here, because the issue seems to come from the command grouping rather than from any particular command.
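For the record, this is the kind of per-command unbuffering I mean — a sketch assuming GNU coreutils `stdbuf`, with `tr` standing in for one of my slow commands:

```shell
# force line buffering on the stdout of a single (dynamically linked) command
printf 'hello\n' | stdbuf -oL tr a-z A-Z
```

This works per command, but my problem seems to sit at the level of the `{ ... }` group as a whole.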

Any hint?

Thomas Baruchel

2 Answers


(Note to future readers: the tone of exasperation here is not for the question, but for the mistakes I made trying to answer it and the multiple edits they entailed.)

Oh, for pity's sake. The problem is in tail -f. This works just fine:

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
        sleep 0.5
        /bin/echo $i
    done;
} | cat
printf 'bye\n'

It's not the pipe, it's not the group. It's tail. As in, chasing our own tails!

So, tail -f failed because it doesn't output right away for some reason. Not sure why python -u is failing, but I don't think it's anything in the script. Maybe try unbuffer with it. Try your script with cat, at least, and verify that it's unbuffered in that case.
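A quick way to check the Python side of things — a sketch assuming Python 3, where iterating over `sys.stdin` yields each line as soon as it is complete; the `sleep` stands in for the slow `curl` calls:

```shell
# pipe a slow-producing group into a line-by-line Python reader
{
    for i in 1 2 3; do
        sleep 0.5
        echo "line $i"
    done
} | python3 -u -c 'import sys
for line in sys.stdin:
    print("got:", line.rstrip())'
```

If the `got:` lines appear one every half second rather than all at the end, the group's output is reaching the reader unbuffered.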


Earlier failed attempt intentionally left here so future readers can make sense of the comments.

This script exhibits the same kind of buffering problem you're getting:

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
    sleep 0.5
    printf '%s\n' $i
    done;
} | tail -f
printf 'bye\n'

This one does not. Output inside the group is redirected to stderr, and then stderr from the whole group is piped to the command; since it's stderr, it's unbuffered.

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
    sleep 0.5
    printf '%s\n' $i 1>&2
    done;
} |& tail -f
printf 'bye\n'

Adapted from Wang HongQin's answer in this question. The difficulty was in finding a way to unbuffer the pipe with braces rather than an explicit command. Had to fiddle around a while to get the redirection working properly.
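For readers unfamiliar with `|&`: in bash it is shorthand for `2>&1 |`, so both streams of the group go down the pipe. A minimal check (the strings are arbitrary; any reader command will do in place of `cat`):

```shell
# |& pipes stdout and stderr together; equivalent to: { ... } 2>&1 | cat
{ echo to-stdout; echo to-stderr 1>&2; } |& cat
```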

Tom Zych
  • Hmm... you actually see the issue very well. I see the idea (and I also read the original post by Wang HongQin), but after trying your very own example with `printf 'print %s\n' $i` and replacing `tail -f` with `python -u`, it looks like the `print` statements are getting displayed without being read by python. I don't think the output is really piped, though it _looks_ like it is (since `tail -f` is supposed to print something on the screen, which happens to be the case here). I am pretty sure that `tail` in your example doesn't get anything. – Thomas Baruchel Nov 28 '15 at 16:47
  • Yes, sorry, stuff was going out via stderr. I realized it after I posted and had to delete while I worked it out. Revision should work. – Tom Zych Nov 28 '15 at 16:48
  • Don't be sorry; I learn much here ;-) But I still don't think that the new version actually works :-( Could you try with `python -u` and by replacing your four numbers by a valid python `print` statement? – Thomas Baruchel Nov 28 '15 at 16:50
  • Sigh, no, you're right. I had turned off the `sleep` for testing and forgot to make sure that output was unbuffered. I have RL stuff now and can't work on this. I'll leave it up, maybe someone else can start from here and work out the bugs. – Tom Zych Nov 28 '15 at 16:56
  • these examples are not at all pertinent to the question, unless your `printf` is `/bin/printf` *(though i would doubt it even in that case)*. the shell doesn't do output buffering in the same way most programs do, and `sleep` doesn't buffer writes at all because it doesn't do any. – mikeserv Nov 28 '15 at 20:33
  • @mikeserv: The `sleep` was just to delay things so I could tell whether it was buffering or not. The `printf` issue occurred to me too, so I changed it to `/bin/echo` for future testing. Haven't solved it yet, though. I don't see a way to pipe `stderr` directly and I suspect this whole approach is unworkable. Trying something else now. Using `stdbuf -o0 /bin/echo` failed too. – Tom Zych Nov 28 '15 at 23:40
  • the problem is not in `tail -f` *exactly* - it does what you'd expect for a program that has to read the same file over time even though it's already reached end of file over and over - it loops over it. obviously it's not going to check the file for new information *all of the time* - that would be terribly wasteful - the typical `tail -f` implementation checks every 60 seconds. – mikeserv Nov 29 '15 at 02:47
  • You are perfectly right with your `cat` proof; thus my issue obviously comes from the behaviour of `python -u`, which doesn't seem to read its stdin unbuffered. – Thomas Baruchel Nov 29 '15 at 10:39
  • I am not sure that the code with `cat` "works" for the reason you think. This code `{ echo foo >> mylog; echo bar; echo foo >> mylog } | cat >> mylog` creates `mylog` with lines foo, foo, bar, indicating that what was piped to `cat` was buffered. On the other hand, `{ echo foo >> mylog; echo bar; sleep 0.5; echo foo >> mylog } | cat >> mylog` produces `mylog` with lines foo, bar, foo. This suggests that somehow `sleep` flushes the buffer. – user102008 Nov 22 '22 at 21:45

you just have to do:

{   stdbuf -o0 curl ...
    stdbuf -o0 whatever ...
}|  tail -f

...which will work for dynamically linked applications, though curl also includes its own unbuffer switch (`-N`/`--no-buffer`).
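A self-contained sketch of the pattern above, with `/bin/echo` standing in for the real `curl` invocations (note that `stdbuf` only affects dynamically linked programs that buffer through stdio):

```shell
# run each slow command under stdbuf -o0 so its stdout is unbuffered,
# then pipe the whole group to the consumer
{   stdbuf -o0 /bin/echo 'first line'
    stdbuf -o0 /bin/echo 'second line'
} | cat
```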

mikeserv