
My goal is to calculate how long it takes to output text to audio when using the say command.

For example, say will speak in real time:

$ say -v Alex "Hello there"

I can then combine say with time to answer the question in the text, although we have to wait until the end of the actual audio output:

$ time say -v Alex "Hello there. How long will this take?"

real    0m2.993s
user    0m0.006s
sys     0m0.009s
  1. Is there a way to calculate how long it will take to output any say command without actually executing it? How?
  2. If not, how can I use grep to pull out the real line?

I'm trying something like this:

$ time say -v Alex "Hello there. How long will this take?" | grep "^real   .*$"

But of course there is no result.

Is the output not being passed to grep, does grep not work for this multi-line output, or did I use the wrong pattern matching?

If grep won't work, what will?

UPDATE #1

Actually what I think I'm really looking for is the duration of the generated audio file that results from say.

slm
kraftydevil

1 Answer


Timing a run of say

  1. Is there a way to calculate how long it will take to output any say command without actually executing it? How?

I see no way to accomplish this using any switches provided by the say command.
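That said, a rough estimate is possible from the word count alone. The following is only a sketch: the function name estimate_say_seconds is made up for illustration, and the default of 175 words per minute is an assumption (the real default depends on the voice; say's -r option sets the rate explicitly). It ignores pauses at punctuation, so expect it to come in under the real duration.

```shell
# estimate_say_seconds TEXT [WPM]
# Hypothetical helper: estimates spoken duration as words * 60 / rate.
# The 175 wpm default is an assumption, not a value reported by say.
estimate_say_seconds() (
    text=$1
    wpm=${2:-175}
    # Count whitespace-separated words in the text.
    words=$(printf '%s\n' "$text" | wc -w)
    # Do the floating-point arithmetic in awk and print one decimal place.
    awk -v w="$words" -v r="$wpm" 'BEGIN { printf "%.1f\n", w * 60 / r }'
)
```

For the seven-word example sentence this gives about 2.4 seconds at the assumed rate, so treat it as a ballpark figure only.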

  2. If not, how can I use grep to pull out the real line?

To parse the time output you can do the following:

$ ( time say -v Alex "Hello there. How long will this take?" ) |& grep real
real    0m2.987s

Alternatively:

$ ( time say -v Alex "Hello there. How long will this take?" ) 2>&1 | grep real
real    0m2.987s

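In the above we've wrapped the time ... command in a subshell and then redirected both STDOUT and STDERR (|&) to grep. This is needed because time is a shell keyword that writes its report to STDERR, so a plain pipe never carries that report to grep. The 2>&1 form does the same thing in situations where |& doesn't work, such as versions of Bash older than 4.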
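If you want the value as plain seconds rather than the 0m2.987s form, awk can replace grep. This is a minimal sketch; real_to_seconds is a made-up name, and it assumes the Bash-style "real 0m2.987s" layout shown above.

```shell
# real_to_seconds: reads `time` output on stdin and prints the "real"
# value in seconds. Assumes the Bash layout: real<whitespace>NmS.SSSs
real_to_seconds() {
    awk '$1 == "real" {
        # Split "0m2.987s" on "m" and "s": t[1] = minutes, t[2] = seconds.
        split($2, t, "[ms]")
        printf "%.3f\n", t[1] * 60 + t[2]
    }'
}
```

It slots into the same pipeline as grep:

$ ( time say -v Alex "Hello there. How long will this take?" ) 2>&1 | real_to_seconds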

/dev/null

Incidentally, if you use the -o <file> argument, say writes the audio to a file instead of playing it, which is much faster than real-time playback. Since we don't actually want the audio file here, we're directing it to /dev/null instead:

$ ( time say -v Alex "Hello there. How long will this take?" -o /dev/null ) |& grep real
real    0m0.310s

Alternatively:

$ ( time say -v Alex "Hello there. How long will this take?" -o /dev/null ) 2>&1 | grep real
real    0m0.283s

Notice how much faster this is when say doesn't have to play through the speakers; the difference is the time spent on real-time audio output. Keep in mind that this measures how long the text-to-speech conversion takes, not how long the spoken audio lasts.

Calculating the audio's duration

To determine the duration of the audio file that say produces, you can do the following:

$ say -v Alex "Hello there. How long will this take?" -o a.aiff && \
    ffmpeg -i a.aiff 2>&1 | grep Duration && rm a.aiff
  Duration: 00:00:02.85, start: 0.000000, bitrate: 364 kb/s

Here we can see that the duration of the resulting audio is 2.85 seconds.
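To use that figure in a script, the Duration line can be reduced to a number of seconds. This is a sketch only; duration_seconds is a made-up name, and it assumes ffmpeg prints the HH:MM:SS.ss layout shown above.

```shell
# duration_seconds: reads ffmpeg's banner on stdin and prints the
# duration in seconds. Assumes a line like:
#   Duration: 00:00:02.85, start: 0.000000, bitrate: 364 kb/s
duration_seconds() {
    awk '$1 == "Duration:" {
        sub(/,$/, "", $2)            # drop the trailing comma
        n = split($2, t, ":")        # t[1]=hours, t[2]=minutes, t[3]=seconds
        # Print nothing for values such as N/A that lack three fields.
        if (n == 3) printf "%.2f\n", t[1] * 3600 + t[2] * 60 + t[3]
    }'
}
```

It replaces grep Duration in the pipeline:

$ ffmpeg -i a.aiff 2>&1 | duration_seconds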

Further improvements?

I looked into piping the output from say directly into ffmpeg, but say apparently cannot do this. Others have come to the same conclusion in the Ask Q&A titled "How to pipe output of 'say' to another command".


slm
  • The problem with outputting to `/dev/null` is that since we no longer have to wait for the audio to be output in real time we are welcome to generate it as quickly as possible. – Ignacio Vazquez-Abrams Jul 19 '18 at 04:46
  • @IgnacioVazquez-Abrams - that's true, I wasn't sure if the OP wanted the time it took for `say` to actually do the translation of text to speech and utter it, or just the time to do the translation. – slm Jul 19 '18 at 04:48