18

I have a command, <streaming ls> | wc -l. It works fine, but <streaming ls> takes a while, so I don't get the final line count until a few minutes later.

Is there a way to have the output of wc -l update in real time?

chicks
Foobar

3 Answers

32

You can’t use wc -l for this, but you can produce a running count of lines seen using other tools, for example AWK:

<streaming ls> | awk '{ printf "%d\r", NR } END { print NR }'

This will update the count of lines seen every time a line is seen, and finish with the total number of lines at the end of the process.
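If you want to try this without waiting on a real slow command, here is a minimal sketch. The names slow_lines and running_count are made up for illustration, and the fractional sleep is an assumption (GNU and BSD sleep accept it, POSIX does not require it).

```shell
# Hypothetical stand-in for a slow producer: emits one line every 0.1 s.
slow_lines() {
    i=1
    while [ "$i" -le "${1:-20}" ]; do
        echo "line $i"
        sleep 0.1   # assumption: sleep accepts fractional seconds
        i=$((i + 1))
    done
}

# The AWK one-liner from above, wrapped in a function for reuse:
# overwrite the count in place on every line, then print the total.
running_count() {
    awk '{ printf "%d\r", NR } END { print NR }'
}

slow_lines 20 | running_count
```

The \r returns the cursor to the start of the line without advancing, so each new count overwrites the previous one on the terminal.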

For commands producing lots of output, the overhead can be reduced by printing every n lines:

… | awk 'NR % 10 == 0 { printf "%d\r", NR } END { print NR }'

(for n = 10) or by printing every second:

… | awk 'systime() > lasttime { lasttime = systime(); printf "%d\r", NR } END { print NR }'

(or every n seconds by changing the condition to >= lasttime + n).
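Spelled out, the every-n-seconds variant could look like the sketch below. Two things here are my own choices rather than part of the commands above: the function name count_every is hypothetical, and the progress goes to stderr so that stdout carries only the final total (handy if you want to capture it in a variable). Note that systime() is not in POSIX; this assumes an AWK that provides it, such as gawk or a recent mawk.

```shell
# Hypothetical helper: running line count, refreshed at most every n seconds.
# Usage: some_command | count_every 5
count_every() {
    awk -v n="${1:-5}" '
        # lasttime starts out as 0, so the first line always triggers an update.
        systime() >= lasttime + n {
            lasttime = systime()
            printf "%d\r", NR > "/dev/stderr"   # progress only, not part of the result
        }
        END { print NR }                        # final total on stdout
    '
}

# Progress appears on the terminal; only the total lands in $total.
total=$(seq 1 200000 | count_every 1)
echo "total: $total"
```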

Stephen Kitt
  • 6
    If your input has a huge number of lines that come in fast, you can speed this up by only updating the count every 10 lines (`NR % 10 == 0 { printf ...}`), and printing the exact count at the end. Even more fancy would be to print an update when a line comes in only if it's been 100 ms since the last print, maybe with an `if()` inside the rule. But +1, this is a good simple starting point that's sufficient for some use-cases, like commands that produce lines somewhat slowly, or if terminal updates aren't a bottleneck. – Peter Cordes Nov 02 '22 at 19:19
  • 1
    I’ve added those variants to the answer, thanks. AWK (even GNU) doesn’t deal with sub-second intervals AFAICT, but every second or even every *n* seconds should be sufficient for a long-running job. – Stephen Kitt Nov 04 '22 at 10:57
  • Or print a result only when the current line count has increased by say 20%, so it will be fast to begin with and slow down over time. This would be useful for potentially huge inputs where one can't even estimate a reasonable size. Marking each line of output with a timestamp might be useful too. – Ray Butterworth Nov 04 '22 at 20:09
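The percentage-growth idea from the last comment can be sketched in plain AWK, with no time functions needed. This is only one possible reading of that suggestion; the function name count_growth and the threshold variable are hypothetical, and sending progress to stderr is my own choice.

```shell
# Hypothetical helper: refresh the count only once it has grown by pct percent
# since the last refresh. Usage: some_command | count_growth 20
count_growth() {
    awk -v pct="${1:-20}" '
        # threshold starts out as 0, so the first line always triggers an update.
        NR >= threshold {
            printf "%d\r", NR > "/dev/stderr"
            threshold = NR * (1 + pct / 100)   # next refresh once the count is pct% larger
        }
        END { print NR }
    '
}

seq 1 100000 | count_growth 20
```

With small counts this refreshes on nearly every line, and the refreshes then thin out geometrically as the count grows.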
25

You could use pv to give you a progress report:

cmd | pv -lbtr | wc -l
  • -l for line-based (reports the number of lines instead of bytes).
  • -b to report the number of bytes (well, lines here because of -l)
  • -t to report the time spent
  • -r to report the current rate (number of lines per second; see also -a for the average rate).

Beware that file names can contain newline characters, so wc -l on the output of ls is not guaranteed to give you a file count unless you use options like -b or -q, which render newline characters in file names as \n or ?, respectively.
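To see the difference for yourself, here is a small sketch using a scratch directory (the variable names are made up for illustration). It creates two files, one of which has a newline in its name, and counts the listing both ways.

```shell
demo_dir=$(mktemp -d)

# Two files; the second name contains a literal newline.
touch "$demo_dir/plain" "$demo_dir/two
lines"

# When ls writes to a pipe, it typically emits the newline as-is,
# so the second file is counted twice.
raw_count=$(ls "$demo_dir" | wc -l)

# With -q the newline is shown as ?, giving one line per file.
safe_count=$(ls -q "$demo_dir" | wc -l)

echo "without -q: $raw_count, with -q: $safe_count"
rm -rf "$demo_dir"
```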

Stéphane Chazelas
  • 4
    While your final warning is technically correct, it’s a vanishingly small edge case because it’s exceedingly difficult for a regular person to accidentally create a file with such a name, and most people have no need for multi-line filenames. It’s important to keep such cases in mind when coding, but quite often they’re just not worth worrying about when working with a shell interactively. – Austin Hemmelgarn Nov 02 '22 at 22:09
  • 5
    @AustinHemmelgarn, no need for an accident; it's very easy to create those files voluntarily if you're inclined to exploit bugs in code that incorrectly assumes file names can't contain newline characters. – Stéphane Chazelas Nov 03 '22 at 11:05
  • 3
    @AustinHemmelgarn Not really difficult, happens to me all the time. Example: I open a PDF from the web, want to save it under a recognizable name, so I just copy the title from the PDF and paste it into the Save As dialog. Unfortunately, if the title in the PDF is split across multiple lines, the embedded newlines get copied into the filename. – TooTea Nov 04 '22 at 15:31
  • 2
    `pv` is a really powerful tool. – Thorbjørn Ravn Andersen Nov 05 '22 at 13:31
3

Well, I used to use something like watch -n 1 <your command>. I'm not sure if that is of any use in your case; I am not a guru, it's just the first thing that came to mind.

https://man7.org/linux/man-pages/man1/watch.1.html

watch - execute a program periodically, showing output fullscreen

-n, --interval seconds Specify update interval. The command will not allow quicker than 0.1 second interval, in which the smaller values are converted. Both '.' and ',' work for any locales. The WATCH_INTERVAL environment can be used to persistently set a non-default interval (following the same rules and formatting).

hocikto