7

I'm trying to figure out a way to list all the programs that a script will use when it is run, without actually running it.

I've written these quick and dirty one-liners:

# fill an array with all the useful words except variables, options, brackets, quotes
readarray -t list <<<$( grep -v '^#' script.sh | sed 's/[0-9a-zA-Z_\-]*=//g ; s/\${.*}//g ; s/\$(//g ; s/[)'\"\'\`']//g ; s/ --*.//g ' )

# for every word in array show info with `type' and clean the output again
for p in "${list[@]}" ; do type "${p}" ; done 2>&1 | grep -v -e '^bash:' -e 'shell keyword' -e 'shell builtin' | sort | uniq | sed 's/^.* //g ; s/[\(\)]//g'

I think the problems are:

  1. If the program is not installed, `type' will fail
  2. Here documents can contain keywords that could be programs...
  3. If the script is not well written, the difficulty could increase (`shellcheck' could be useful)
  4. External configuration files and function libraries are not tracked (see ilkkachu's comment)
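
Problem 1 can be softened by treating "not found" as a result in its own right instead of an error. The following is only a minimal sketch: it swaps `type' for the POSIX `command -v`, and the function name `classify` and the three labels are made up for illustration. It assumes that a result starting with `/` is an external program, which ignores aliases and anything that changes `PATH` later.

```shell
#!/bin/sh
# classify NAME...: label each name as "external", "shell" or "missing".
# (classify and its labels are hypothetical names used for this sketch.)
classify() (
    for name in "$@"; do
        # command -v prints a path for external programs, the bare name
        # for builtins, keywords and functions, and fails for unknown names.
        if where=$(command -v "$name" 2>/dev/null); then
            case $where in
                /*) printf '%s\t%s\n' "$name" external ;;
                *)  printf '%s\t%s\n' "$name" shell ;;
            esac
        else
            printf '%s\t%s\n' "$name" missing
        fi
    done
)
```

Calling `classify "${list[@]}"` on the array built above would then report uninstalled programs as `missing` rather than dropping them with the error messages.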

Any better solution?

baselab
  • 4
    Run it under `strace` and take note of all `exec()` calls? And then try to make sure you handle all possible code paths and all possible inputs... I don't think this can be done in general, since whatever the script does might depend on external configuration files and function libraries etc. – ilkkachu Dec 18 '17 at 10:37
  • 1
    @ilkkachu or `strace -fe execve` – Stéphane Chazelas Dec 18 '17 at 10:50
  • @ilkkachu but that means actually running the script and the OP needs to do it "without actually running it". – terdon Dec 18 '17 at 10:57
  • 3
    @baselab: This type of problem is, in general, undecidable, so there is no way to solve it exactly. The best you can do is a heuristic approach, for instance collecting the first word of every line and pretending it is an external command, but of course it is trivial to construct cases where you would miss a command, or take a word for a command when it is not one. And even if you actually run the program (and use `strace`, as has been proposed), it shows only what was used in that particular run. Other input data might cause other programs to be called. – user1934428 Dec 18 '17 at 11:07
  • Also there would be ways to trick the check with aliases and changing some env vars... I don't think it would be reliable to audit based on used words heuristics... – Zip Dec 18 '17 at 12:05
  • @baselab, actually, I think an important question is also "why?", as in "for what purpose do you want to do this?" Are you trying to determine the software dependencies of a script, or trying to make sure it doesn't run anything dangerous/unwanted, or something else? – ilkkachu Dec 18 '17 at 12:05
  • @ilkkachu, actually both precautionary screening and check of requirements by exotic scripts (even my own). And of course, fun! :) – baselab Dec 18 '17 at 13:25
  • Can you change how `script.sh` is written, and enforce some local conventions? For example, always have two blank lines before a line which calls an external program? One blank line for functions, none for aliases, ... – jalanb Jun 04 '18 at 23:22
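
The first-word heuristic mentioned in the comments above could look roughly like this. It is a sketch only: `first_words` is a made-up name, and the function deliberately shows the weakness already pointed out, since it misses commands after `|`, `;`, `&&` or inside `$(...)`, and it will pick up words from here documents.

```shell
#!/bin/sh
# first_words FILE: print the first word of every non-comment,
# non-assignment line of FILE. The file is only read, never executed.
# (first_words is a hypothetical helper name.)
first_words() {
    awk '
        /^[ \t]*#/ { next }                        # skip comment lines
        NF == 0 { next }                           # skip blank lines
        $1 ~ /^[A-Za-z_][A-Za-z0-9_]*=/ { next }   # skip plain assignments
        { print $1 }
    ' "$1" | sort -u
}
```

The resulting words would still have to be checked with `type` or `command -v` to separate external programs from builtins and keywords.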

2 Answers

1

In post #16 of thread https://www.unix.com/shell-programming-and-scripting/268856-how-pre-check-scrutinize-all-my-shell-scripts.html I posted a 150-line perl script, p1.txt, that may be a useful starting point. I also added a link to a far more complete, and more complex, shell parser.

It may be best to look at the entire thread -- perhaps some other viewpoints may also be of interest.

Best wishes ... cheers, drl

drl
  • 1
    Thanks for the link and perl script! Yesterday I started to write a more complete parser based on my 2 oneliners, your links will be useful :) – baselab Dec 19 '17 at 09:04
  • 1
    You are welcome. I'll be interested in what you come up with ... cheers, – drl Dec 19 '17 at 14:44
  • @baselab I would also be interested in what you came up with – jalanb Jun 04 '18 at 23:20
0

This is no easy task. You have already identified several of the difficulties. Text-based parsing is very error-prone, very easy to bypass (if anyone wants to) and almost guaranteed to be incomplete.

Bash's `set` builtin has a `-n` option that makes the shell read (parse) a script without actually executing it. This could help you, but even that is limited: it reports syntax errors and does not list the commands the script would run.

  [set] -n      Read commands but do not execute them.  This may be used
                  to  check  a  shell  script  for syntax errors.  This is
                  ignored by interactive shells.

To test it on a script, add `set -n` at the beginning of the script and then run it, or run `bash -n script.sh` to leave the script unmodified.
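
As a small illustration of the no-execute mode, here is a sketch that wraps `bash -n` in a function (the name `syntax_only` is made up). The exit status says whether the script parses; nothing in the script is run, so this is a syntax check and not a list of the programs used.

```shell
#!/bin/sh
# syntax_only FILE: parse FILE with bash's no-execute mode.
# Returns 0 if the script parses, non-zero on a syntax error.
# (syntax_only is a hypothetical wrapper name.)
syntax_only() {
    bash -n "$1"
}
```

For example, `syntax_only script.sh && echo "parses cleanly"` prints the message only when bash finds no syntax errors.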

`strace` will be very useful, but it requires the script to actually run, which is something you want to do only after being reasonably assured that the script is safe.
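
Once a script is trusted enough to run, the `strace` log can be reduced to a list of executed programs. This is a rough sketch: `exec_paths` and `trace.log` are made-up names, and the pattern assumes the usual `execve("PATH", [...], ...) = 0` line layout of strace output. As noted in the comments on the question, it only covers the code paths taken in that one run.

```shell
#!/bin/sh
# Record every execve() made by the script and its children, e.g.:
#   strace -f -e trace=execve -o trace.log ./script.sh
# then summarise the log with:
#   exec_paths < trace.log

# exec_paths: read strace output on stdin and print the unique paths
# of successful execve() calls. (exec_paths is a hypothetical name.)
exec_paths() {
    sed -n 's/^.*execve("\([^"]*\)".*) = 0$/\1/p' | sort -u
}
```

Failed lookups (lines ending in `= -1 ENOENT ...`) are skipped on purpose, since they are usually just the `PATH` search trying each directory in turn.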

Pedro