Questions tagged [pdfgrep]
12 questions
3
votes
2 answers
Regex search in PDF reader
I am using zathura, as I enjoy its minimalist approach, but I would also switch to mupdf or anything else if this would solve my problem.
I need to highlight every word (in PDF and epub documents) one by one from start to finish because I can…
luca
- 142
- 1
- 9
3
votes
1 answer
Is there a tool for searching keywords super fast in many PDF files?
I have a bunch of technical books,
and I have been using pdfgrep for a while,
but it takes a substantial amount of time to search them all.
Can somebody recommend a CLI tool for searching in PDF files super fast?
it should have an underline…
JammingThebBits
- 426
- 4
- 13
3
votes
2 answers
How can I get the page numbers only of a pattern in a pdf file, regardless if the pattern is multiline?
I find the page numbers of a multiline pattern in a pdf file, by How shall I grep a multi-line pattern in a pdf file and in a text file? and How can I search a string in a pdf file, and find the physical page number of each page where the string…
Tim
- 98,580
- 191
- 570
- 977
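A minimal sketch of one way to get page numbers for a multi-line pattern, assuming the text has already been extracted with a tool that separates pages by a form feed character (as `pdftotext` does). The function name `pages_matching` is hypothetical. Line breaks inside each page are collapsed to single spaces so that a phrase split across lines can still match.

```shell
# Read form-feed-separated page text on stdin and print the number of
# every page whose whitespace-normalised text matches the pattern in $1.
pages_matching() {
    awk -v pat="$1" '
        BEGIN { RS = "\f" }            # one record per page
        {
            gsub(/[ \t\r\n]+/, " ")    # join wrapped lines with single spaces
            if ($0 ~ pat) print NR     # NR is the physical page number
        }'
}
```

Something like `pdftotext my.pdf - | pages_matching 'image not available'` would then print one page number per line. The pattern is treated as an awk regular expression, so special characters need escaping.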
2
votes
1 answer
Is there any ligature-aware alternative for "pdfgrep" in command line?
I always use "pdfgrep" to search inside of multiple PDF files from the command line. But I met a problem: This ligature character "ﬁ" (see https://www.compart.com/en/unicode/U+FB01).
"ﬁ" is in the word "ﬁxed", so I could not search the term "fixed…
la la
- 21
- 2
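One possible workaround, sketched here under the assumption that the text is extracted first (for example with `pdftotext`) and searched with plain `grep` afterwards: replace the Latin ligature code points U+FB00 to U+FB04 with their plain-letter spellings before searching. The function name `fold_ligatures` is hypothetical, and this is not a feature of pdfgrep itself.

```shell
# Replace the ligatures U+FB00..U+FB04 on stdin with plain ASCII letters.
# The body runs in a subshell so the helper variables stay local.
fold_ligatures() (
    # UTF-8 byte sequences written as octal escapes to keep the script ASCII-only.
    lig_ff=$(printf '\357\254\200')    # U+FB00
    lig_fi=$(printf '\357\254\201')    # U+FB01
    lig_fl=$(printf '\357\254\202')    # U+FB02
    lig_ffi=$(printf '\357\254\203')   # U+FB03
    lig_ffl=$(printf '\357\254\204')   # U+FB04
    sed -e "s/$lig_ffi/ffi/g" -e "s/$lig_ffl/ffl/g" \
        -e "s/$lig_ff/ff/g" -e "s/$lig_fi/fi/g" -e "s/$lig_fl/fl/g"
)
```

A pipeline such as `pdftotext book.pdf - | fold_ligatures | grep -n 'fixed'` would then match the word regardless of how the PDF encodes it.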
2
votes
2 answers
Is there a way to search (grep/find) a specific word within multiple pdf files located on a specific drive?
I am trying to locate a client's pdf file that was saved on an external backup drive, which contains a little over 8000 pdf files and hundreds of folders.
For example, if I want to search all pdf files on drive X: that contain my client's name…
DiFrag
- 21
- 1
- 4
1
vote
1 answer
pdfgrep doesn't work with Arabic language strings
I want to use pdfgrep, but when I search for an Arabic text or string, it shows nothing. However, it works properly when I search for an English string. Does anyone have a solution or even an alternative? Thank you
This is the code I…
VANMEN
- 11
- 1
1
vote
1 answer
Deep search of several pdf files with pdfgrep, ignoring counts less than
I am doing a "deep search" within several pdf files with "pdfgrep", trying to find a word and get a count on the documents like this:
# pdfgrep -ric PATTERN
./Example1.pdf:0
./Example2.pdf:10
Any idea how I can ignore the printout for files with…
Nils
- 113
- 3
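A small sketch of one way to drop low counts, assuming the `FILE:COUNT` output format shown in the question. The helper name `filter_counts` and its threshold argument are hypothetical; the count is taken from the last colon-separated field, so file names that contain colons are still handled.

```shell
# Keep only "FILE:COUNT" lines whose COUNT is at least the threshold
# given in $1 (defaults to 1, which hides files with zero matches).
filter_counts() {
    awk -F: -v min="${1:-1}" '$NF + 0 >= min + 0'
}
```

Used as `pdfgrep -ric PATTERN | filter_counts 1`, this would print `./Example2.pdf:10` and suppress `./Example1.pdf:0`.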
1
vote
1 answer
How do I pdfgrep using a specific pattern (Syntax?)
I'm trying to use pdfgrep to search for each occurrence of a specific pattern (it MUST start with E OR S, followed by exactly 5 digits) and THEN execute a command afterward (which is likely to be a mv command).
So far, I have the following command :
pdfgrep…
ATragicEnding
- 11
- 1
- 3
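The pattern described (E or S, then exactly five digits) can be written as the extended regular expression `[ES][0-9]{5}`. The sketch below tests it with plain `grep` on already-extracted text; the function name `extract_codes` is hypothetical, and the `-w` flag is what rejects longer digit runs such as `S123456`.

```shell
# Print every standalone token on stdin that is E or S followed by
# exactly five digits, one per line.
extract_codes() {
    grep -owE '[ES][0-9]{5}'
}
```

For example, `pdftotext invoice.pdf - | extract_codes` would list the matching codes from a single file (`invoice.pdf` is a placeholder name).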
1
vote
0 answers
How shall I grep a multi-line pattern in a pdf file and in a text file?
In the output of less my.pdf, a string image
not
available appears multiple times, for example:
... Lastly, what remains to
^L image
not
available
^L Implementations and Systems
I would like to grep the string in the pdf file, for…
Tim
- 98,580
- 191
- 570
- 977
0
votes
1 answer
Is it possible to integrate pdfgrep into nemo search?
I often find myself looking for PDF documents. Luckily, I found pdfgrep that really does a great job at finding PDF documents by content.
The following command lets me search for documents that have my search word on the first page:
pdfgrep -irl…
0
votes
0 answers
Split pdf based on keyword
Is there a utility that would split a PDF file based on a keyword? I can only find split by pages (e.g. QPDF). I can also see pdfgrep, but I don't know whether this has been already combined in some other utility or not. I can write the bash script but…
Tomas Greif
- 349
- 1
- 4
- 12
-4
votes
2 answers
Can we search in a pdf file for pages containing several words in no particular order?
I would like to search in a pdf file for all the pages, each containing several given words in no particular order. For example, I want to find all the pages which contain both "hello" and "world" in no particular order.
I am not sure if pdfgrep …
Tim
- 98,580
- 191
- 570
- 977
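One way to approach this, sketched under the assumption that the PDF text is extracted first with pages separated by form feeds (as `pdftotext` produces): treat each page as one awk record and print the record number when both words occur anywhere in it. The function name `pages_with_all` is hypothetical, and the words are matched as literal substrings, not as regular expressions.

```shell
# Read form-feed-separated page text on stdin and print the number of
# every page that contains both literal strings $1 and $2, in any order.
pages_with_all() {
    awk -v a="$1" -v b="$2" '
        BEGIN { RS = "\f" }                          # one record per page
        index($0, a) && index($0, b) { print NR }'
}
```

With the example from the question, `pdftotext my.pdf - | pages_with_all hello world` would list the pages that contain both words.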