This behaviour baffles me. I send a terminate signal to my script while an aws s3 upload is running, and even though I handle SIGTERM with its own trap, the ERR trap is also triggered by the aws command, and I don't understand why. To add to the confusion, the script reports the error and then carries on:

Script:

#! /bin/bash

trap 'echo GOT ERROR, exiting' ERR
trap 'echo GOT SIGTERM!' SIGTERM
while true; do
    date +%F_%T
    aws s3 cp /vagrant/audio/ s3://testarchive/tester/ --recursive
    sleep 1
done

Command to run script:

timeout 5s ./tester.sh

Output:

upload: ../../vagrant/audio/2019-09-16/3/35322118-8264-406B-961B-EAF1FE7A34EF.wav to s3://testarchive/tester/2019-09-16/3/35322118-8264-406B-961B-EAF1FE7A34EF.wav
upload: ../../vagrant/audio/2019-09-16/1/165BD3D0-773A-4591-A43E-D67810716066.wav to s3://testarchive/tester/2019-09-16/1/165BD3D0-773A-4591-A43E-D67810716066.wav
upload: ../../vagrant/audio/2019-09-16/2/2A9559BB-168A-47D2-943A-A51B7885233B.wav to s3://testarchive/tester/2019-09-16/2/2A9559BB-168A-47D2-943A-A51B7885233B.wav
Terminated6.8 MiB/123.1 MiB (1.5 MiB/s) with 422 file(s) remaining
GOT ERROR, exiting
GOT SIGTERM!
2020-01-17_21:05:40
upload: ../../vagrant/audio/2019-09-16/0/07502A17-9304-4995-94E1-A1B0D439EEE7.wav to s3://testarchive/tester/2019-09-16/0/07502A17-9304-4995-94E1-A1B0D439EEE7.wav
upload: ../../vagrant/audio/2019-09-16/0/05E4C765-C2FA-4EC0-9803-8FF02C0FEDDE.wav to s3://testarchive/tester/2019-09-16/0/05E4C765-C2FA-4EC0-9803-8FF02C0FEDDE.wav
upload: ../../vagrant/audio/2019-09-

EDIT #2:

Cron entry:

29   1   *   *   *   root   strace -e trace=kill timeout --foreground 6 /home/vagrant/tester.sh &> /home/vagrant/tester.log

Script:

#! /bin/bash

trap 'echo GOT ERROR..' ERR
trap 'echo GOT SIGTERM! && set_terminate_flag' SIGTERM

terminate_flag=false

function set_terminate_flag {
  terminate_flag=true
}

while true; do
  if [ "$terminate_flag" = true ]; then
    echo OMG IT WORKS!
    exit 0
  fi
  date +%F_%T
  aws s3 cp /vagrant/audio/ s3://testarchive/tester/ --recursive
  echo LOOP IS Done, begin sleep
done

Output:

...(skip output, 6 seconds have passed!!!)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=14831664, ptr=0xe25030}} ---
kill(9432, SIGTERM)                     = 0
kill(9432, SIGCONT)                     = 0
...(skip output)
upload: ../vagrant/audio/2020-01-01/E7914F83-8A89-4679-ABBC-8DB261D13349-01.wav to s3://testarchive/tester/2020-01-01/E7914F83-8A89-4679-ABBC-8DB261D13349-01.wav
GOT SIGTERM!
LOOP IS Done, begin sleep
OMG IT WORKS!
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=9432, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 124 +++
alexfvolk

1 Answer

trap 'echo GOT ERROR, exiting' ERR

Simply saying "exiting" doesn't mean it's also true ;-)

An ERR trap is executed whenever a command fails, whether or not the script exits immediately afterwards (e.g. because of set -e):

$ bash -c 'trap "echo error, not exiting yet" ERR; false; echo DONE'
error, not exiting yet
DONE

In your case, the failing command could be date or aws, but it is most probably sleep (which is an external command, not a builtin). sleep exits with a non-zero status (a failure, hence the ERR trap) because it is also killed by the signal from timeout: timeout first sends the signal to its child, and then to the entire process group it is part of:

$ strace -e trace=kill timeout 1s bash -c 'echo $$; while :; do sleep 3600; done'
4851
...
kill(4851, SIGTERM)                     = 0
kill(0, SIGTERM)                        = 0
...
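
The non-zero status described above can be observed directly. The following is a minimal sketch, not taken from the original script: a child shell sends SIGTERM to itself (standing in for sleep or aws being signalled by timeout), and the parent records the status it sees. The variable names status and name are illustrative.

```bash
#!/bin/bash
# Sketch: what exit status does the shell report for a command killed by SIGTERM?

status=0
# The child shell terminates itself with SIGTERM. The `|| status=$?` captures the
# failure; as a side effect, a command on the left of `||` does not fire an ERR trap.
sh -c 'kill -TERM $$' || status=$?

# For a command killed by a signal, the shell reports 128 + signal number
# (SIGTERM is 15, so 143).
echo "exit status: $status"

# kill -l maps such a status back to a signal name.
name=$(kill -l "$status")
echo "signal: $name"
```

Without the `||` guard, the same status would count as a failed command and trigger the ERR trap, which matches the "GOT ERROR" line in the question's output.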

The shell does not run a trap until the foreground command it is waiting for has exited. If timeout does not signal the whole process group (which is the effect of the --foreground option), that foreground command may not exit, and the trap will not run until it does:

$ timeout 1s bash -c 'trap "echo TERM caught" TERM; sleep 36000; echo DONE'
Terminated
TERM caught
DONE
$ timeout --foreground 1s bash -c 'trap "echo TERM caught" TERM; sleep 36000; echo DONE'
<wait and wait>
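
The deferred handling can be reproduced without timeout. This is a sketch under assumptions: a background helper signals only the main shell (roughly what timeout --foreground does), sleep 2 stands in for a long-running upload, and the events variable and the durations are made up for the illustration.

```bash
#!/bin/bash
# Sketch: a trapped signal is handled only after the foreground command exits.
events=""
trap 'events="${events}trap;"' TERM

# Signal only this shell, not its process group.
# Inside the subshell, $$ still expands to the main shell's PID.
( sleep 1; kill -TERM "$$" ) &

sleep 2                          # not signalled itself, so it runs to completion
events="${events}after-sleep;"   # runs only after the pending trap has been handled
wait                             # reap the background helper

echo "$events"                   # prints: trap;after-sleep;
```

The signal arrives after about one second, but the handler only runs once sleep has finished. This is the behaviour that lets the current command complete when the shell alone is signalled.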
  • Haha, i did not even notice the exiting string, lol'ed. So is there no possible way to not have the currently running command in my script fail anytime timeout sends a terminate signal? That is really unfortunate, as I would have liked to have the s3 finish the current upload. – alexfvolk Jan 17 '20 at 23:52
  • Of course there is, run `timeout` with the `--foreground` option, as in the last example. –  Jan 17 '20 at 23:54
  • Thank you so much! – alexfvolk Jan 18 '20 at 00:16
  • Is there a way to do it through a cron job? – alexfvolk Jan 18 '20 at 00:33
  • Trap does not run when doing timeout --foreground in cron :( – alexfvolk Jan 18 '20 at 01:01
  • @alexfvolk Works for me. Watch out for cron's quirks and pitfalls (different `PATH`, special handling of the `%` character) –  Jan 18 '20 at 01:11
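
The cron pitfalls named in the last comment can be illustrated with a crontab fragment. This is only a sketch: the PATH value and the second entry (including its log file name) are hypothetical, and the redirection is written as `> file 2>&1` on the assumption that cron runs the line with a plain /bin/sh, which may not understand the bash-only `&>`.

```
# /etc/crontab-style fragment (note the user field); script path taken from the question.
# cron usually starts jobs with a minimal PATH, so set one that contains aws.
# The value below is an assumption; adjust it to where aws is installed.
PATH=/usr/local/bin:/usr/bin:/bin

29 1 * * * root timeout --foreground 6 /home/vagrant/tester.sh > /home/vagrant/tester.log 2>&1

# A literal % in the command field must be escaped as \%, otherwise cron treats
# it as a newline. The log file name here is hypothetical.
30 1 * * * root /home/vagrant/tester.sh > "/home/vagrant/tester-$(date +\%F).log" 2>&1
```

The `%` characters inside tester.sh itself (as in date +%F_%T) need no escaping, since cron only interprets the crontab line.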