3

I have a list of URLs stored in a text file named myurls:

http://www.examples.com/1
http://www.examples.com/2
http://www.examples.com/3

How can I pass these URLs to wkhtmltopdf as inputs?

The direct way without using a file to store the URLs is

wkhtmltopdf http://www.examples.com/1 http://www.examples.com/2 http://www.examples.com/3 all.pdf

Maybe wkhtmltopdf has special requirements for its arguments, but I think my question is more general than wkhtmltopdf: how can I provide a list of newline-separated strings stored in a file as a list of arguments to a command?

Gilles 'SO- stop being evil'
Tim

4 Answers

4

Try:

# disable shell filename generation (globbing) 
# and temporarily save applicable shell state
set -f -- "-${-:--}" "${IFS+IFS=\$2;}" "$IFS" "$@"

# explicitly set the shell's Internal
# Field Separator to only a newline 
eval "IFS='$(printf \\n\')"

# split command substitution into an
# arg array at $IFS boundaries while
# eliding all blank lines in myurls
wkhtmltopdf $(cat <myurls) allurl.pdf

# restore current shell to precmd state
unset IFS; set +f "$@"; eval "$1 shift 2"

That is extra cautious about restoring all shell state after altering settings that apply shell-wide. The basic idea, though, is simple: set the shell's field splitter in $IFS to a newline, disable globbing in case the command substitution's expansion contains any of the characters [, ? or *, and then expand it unquoted into a list of arguments.

The same can be done robustly and much more simply in a subshell, because nothing you change there persists afterward:

(   set -f; IFS='
';  wkhtmltopdf $(cat) allurl.pdf
)   <myurls
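
To see what that subshell does without running wkhtmltopdf, here is a minimal sketch. The `show_args` function is a hypothetical stand-in for the real command, and the temporary file with its deliberately awkward lines (a blank line, a space, a `*`) is made up for the demonstration:

```shell
# Hypothetical stand-in for wkhtmltopdf: print each argument in brackets.
show_args() { printf '[%s]\n' "$@"; }

# Made-up input: includes a blank line, a space, and a glob character.
tmp=$(mktemp)
printf '%s\n' 'http://www.examples.com/1' '' \
    'http://www.examples.com/a b' 'http://www.examples.com/*' > "$tmp"

# The command substitution is itself a subshell, so the changes to
# the globbing option and to IFS do not leak into the calling shell.
result=$(
    set -f          # no filename generation on the expanded lines
    IFS='
'                   # split on newline only
    show_args $(cat < "$tmp") all.pdf
)
printf '%s\n' "$result"
rm -f "$tmp"
```

Each non-empty line arrives as exactly one argument; the blank line is dropped because consecutive newlines count as a single separator.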
Stephen Kitt
cuonglm
  • Ah, yeah, my mistake, need literal newline here. What do you mean about using tab? – cuonglm Jun 18 '15 at 15:26
  • Ah, of course. But the OP wants a general solution (*how to provide a list of (new line separated) strings stored in a file as a list of arguments to a command?*), so I'm sticking with limiting it to newline. – cuonglm Jun 18 '15 at 15:34
  • @mikeserv: Feel free to edit it for a better approach; I'm not at my PC now. – cuonglm Jun 18 '15 at 15:44
2

Unfortunately it's not the case with wkhtmltopdf, but many commands provide an option to read arguments from a file (wget -i for example); that's the preferred approach where possible.

If whitespace in your file isn't important, command substitution works:

wkhtmltopdf $(cat myurls) all.pdf
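
Why whitespace matters here can be checked with a small sketch. The `count_args` helper and the two-line sample file are made up for illustration, and the first count assumes IFS still has its default value:

```shell
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

# Made-up sample: the second line contains a space.
tmp=$(mktemp)
printf '%s\n' 'one' 'two words' > "$tmp"

# Default IFS (space, tab, newline): the space splits the second line.
default_count=$(count_args $(cat "$tmp"))

# IFS limited to a newline: each line stays in one piece.
newline_count=$(IFS='
'; count_args $(cat "$tmp"))

echo "default IFS: $default_count arguments"   # 3
echo "newline IFS: $newline_count arguments"   # 2
rm -f "$tmp"
```

URLs normally contain no literal spaces, so the plain form is fine for the question's file; the difference only shows up for more general input.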

Using xargs would also work with your example, but in general you need to ensure that it runs wkhtmltopdf only once. If xargs splits the list across several invocations, all.pdf will contain only the pages from the last run of wkhtmltopdf:

xargs -a myurls sh -c 'wkhtmltopdf "$@" all.pdf' sh
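
One detail of the `sh -c` pattern is worth checking: the first argument after the command string is assigned to `$0`, not to `"$@"`, so a placeholder word is usually passed first. The sketch below uses `echo` as a stand-in for wkhtmltopdf and a made-up three-line file, and it feeds xargs through standard input rather than `-a` so that it also runs where xargs lacks that GNU option:

```shell
# Made-up input file with three items.
tmp=$(mktemp)
printf '%s\n' a b c > "$tmp"

# No placeholder: "a" becomes $0 and is missing from "$@".
without=$(xargs sh -c 'echo "$@" all.pdf' < "$tmp")

# Placeholder "sh" takes the $0 slot, so every item reaches "$@".
with=$(xargs sh -c 'echo "$@" all.pdf' sh < "$tmp")

echo "without placeholder: $without"   # b c all.pdf
echo "with placeholder:    $with"      # a b c all.pdf
rm -f "$tmp"
```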

wkhtmltopdf does support an option to read arguments from standard input, --read-args-from-stdin, but that repeats executions, merging each line of standard input with the rest of the command-line arguments; so

wkhtmltopdf --read-args-from-stdin all.pdf < myurls

would be equivalent to

wkhtmltopdf http://www.examples.com/1 all.pdf
wkhtmltopdf http://www.examples.com/2 all.pdf
wkhtmltopdf http://www.examples.com/3 all.pdf

which isn't what you want (all.pdf will contain only the last site).

mikeserv
Stephen Kitt
  • Thanks. In the `myurls` file, the URLs are separated by newline characters. When you pass `$(cat myurls)` to `wkhtmltopdf`, will the newline characters still be present in the arguments? If not, why? – Tim Jun 18 '15 at 10:58
  • Newlines are considered word separators by default (that is, unless you've changed `IFS`), and they get removed during word splitting after the command substitution. Trailing newlines are removed entirely; see for example [the `bash` documentation on the topic](https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html). – Stephen Kitt Jun 18 '15 at 11:40
  • Thanks. In the command line `wkhtmltopdf $(cat myurls) all.pdf`, I find that the newlines used as separators are replaced with whitespace after the command substitution `$(cat myurls)`. Why are they replaced with whitespace, while the bash manual only says they are removed during word splitting? – Tim Mar 06 '16 at 08:18
1

With xargs:

xargs -a myurls sh -c 'wkhtmltopdf "$@" all.pdf' sh
FloHimself
0

Another approach:

wkhtmltopdf $(printf '%s ' $(<myurls)) all.pdf
jimmij