The following shell command was expected to print only the odd-numbered lines of the input stream:
echo -e "aaa\nbbb\nccc\nddd\n" | (while true; do head -n 1; head -n 1 >/dev/null; done)
But instead it just prints the first line: aaa.
The same doesn't happen with the -c (--bytes) option:
echo 12345678901234567890 | (while true; do head -c 5; head -c 5 >/dev/null; done)
This command outputs 1234512345 as expected. But this works only in the coreutils implementation of the head utility. The busybox implementation still eats extra characters, so the output is just 12345.
I guess head is implemented this way for performance. It can't know in advance where a line ends, so it can't know how many bytes it needs to read. The only way not to consume extra characters from the input stream is to read the stream byte by byte, and reading one byte at a time may be slow. So I guess head reads the input stream into a big enough buffer and then counts lines in that buffer.
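For comparison, here is a minimal sketch of the byte-by-byte approach, using the shell's read builtin instead of head. It assumes that read does not consume input past the end of the line it returns, which as far as I know holds for bash and dash when reading from a pipe. The function name odd_lines is just something I made up for illustration.

```shell
# Print only the odd-numbered lines of standard input.
# Assumption: the read builtin stops consuming input right after the newline.
odd_lines() {
    while IFS= read -r line; do
        printf '%s\n' "$line"          # keep the odd-numbered line
        IFS= read -r _skipped || break # discard the even-numbered line, stop at EOF
    done
}

printf 'aaa\nbbb\nccc\nddd\n' | odd_lines
```

This version also stops by itself at EOF, because read returns a non-zero status when there is nothing left to read.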
The same can't be said for the case when the --bytes option is used. In this case you know how many bytes you need to read, so you can read exactly that many and no more. The coreutils implementation takes advantage of this, but the busybox one does not: it still reads more bytes than required into a buffer, probably to simplify the implementation.
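To illustrate what reading exactly the requested number of bytes looks like, here is a sketch that uses dd with bs=1 instead of head -c. It assumes that dd issues one single-byte read per block, so it should never take more than the requested count from the pipe. The helper name take_bytes is hypothetical.

```shell
# Read exactly $1 bytes from standard input and write them to standard output.
# Assumption: with bs=1, dd performs one 1-byte read per block and no read-ahead.
take_bytes() {
    dd bs=1 count="$1" 2>/dev/null
}

# Keep the first 5 bytes, skip the next 5, keep the next 5.
printf '12345678901234567890' | { take_bytes 5; take_bytes 5 >/dev/null; take_bytes 5; }
```

Reading one byte per system call is slow for large counts, so this is only meant to show the behavior, not to replace head -c.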
So the question is: is it correct for the head utility to consume more characters from the input stream than it was asked to? Is there some kind of standard for Unix utilities? And if there is, does it specify this behavior?
PS
You have to press Ctrl+C to stop the commands above, because the Unix utilities do not fail when they read at EOF, so the loop never ends. If you don't want to press Ctrl+C, you may use a more complex command:
echo 12345678901234567890 | (while true; do head -c 5; head -c 5 | [ `wc -c` -eq 0 ] && break >/dev/null; done)
which I didn't use above, for simplicity.