
I need to get a list of the PDFs in the various sub-folders of a directory structure, together with the number of pages in each PDF file. The result should be saved to a CSV file with one line per PDF, i.e. <filename>,<number_of_pages>.

I have used fd and qpdf to extract the number of pages recursively:

fd ".pdf" --type f -x qpdf --show-npages {/}

I tried to incorporate echo or printf in the command line to generate the CSV but without success.

Rui F Ribeiro
Marco

1 Answer


Bear in mind that I just installed fd so this is my first experience with it.

Since it did not work as I first expected, I picked a different approach: I piped the output from fd into a while read loop and assigned each line to variables.

fd -e pdf -x echo {} | while IFS= read -r line; do
    var1="$line" && var2=$(qpdf --show-npages "$line")
    # Expand the variables with $ and append with >> so earlier rows are kept.
    echo "$var1,$var2" >> myfile.csv
done

I should also note that I have not RTFM-ed. ;)

Kurt Pfeifle
fragamemnon
  • Thank you so much!!! Apart from the odd `done` at the end of the loop and the usual double quotes to handle PDF filenames with spaces, your solution was spot on! My updated version: `fd -e pdf -x echo {} | while read -r line; do var1=$line && var2=$(qpdf --show-npages "$line"); echo "$var1,$var2" >> myfile.csv; done` – Marco Nov 19 '18 at 11:46
  • Yes, sorry, I always forget to finish the loop. ;) Would the downvoters also elaborate on why they are in disagreement? I would like to learn the reason. – fragamemnon Nov 20 '18 at 12:05
  • @fragamemnon: I haven't downvoted your answer, but @Marco has already improved it a bit and fixed a glaring error (though not yet optimal). Here's my suggestion: `fd -e pdf -x echo {} | while read -r line; do var=$(qpdf --show-npages "$line"); echo "\"$line\",\"$var\"" >> myfile.csv; done`. This will put each field into quotes inside the CSV. – Kurt Pfeifle Dec 17 '18 at 22:40
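
For anyone landing here later, the loop and the quoting idea from the comments above can be combined into a small script. This is only a sketch under a few assumptions: it expects bash (for `read -d ''`), and the function names `csv_quote` and `pdf_page_csv` are made up for illustration, they are not part of fd or qpdf. It uses fd's `-0` option so that paths are NUL-separated, and it doubles any double quote inside a filename, which is the usual CSV escaping convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; only fd's -0/-e/--type and qpdf's --show-npages
# are real tool options.

# Quote one CSV field: wrap it in double quotes and double any embedded quotes.
csv_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# Print one CSV row (quoted path, page count) per PDF below $1 (default: .).
pdf_page_csv() {
    local dir=${1:-.} file pages
    # -0 makes fd separate paths with NUL, so unusual filenames survive the pipe.
    fd -0 -e pdf --type f . "$dir" | while IFS= read -r -d '' file; do
        # Leave the page count empty if qpdf cannot read the file.
        pages=$(qpdf --show-npages "$file" 2>/dev/null) || pages=""
        printf '%s,%s\n' "$(csv_quote "$file")" "$pages"
    done
}
```

After sourcing the script it could be called as `pdf_page_csv . > myfile.csv`. Redirecting once, outside the loop, means the file is rewritten on each run, so rows from an earlier run do not pile up the way they can with `>>`.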