After a year of waiting, pdfgrep 2.1.0 has finally been released. The tarball can be downloaded from the download page. As always, thanks to everyone who helped with this release.
This release is packed with new features that bring pdfgrep closer to parity with GNU grep:
New options: --files-with-matches/-l and --files-without-match/-L
These two related options open up new possibilities in scripting.
Since they return only file names and not page numbers or matched text,
their output can be used as input for other programs or even pdfgrep
itself. As such, they are especially useful in combination with -Z,
which terminates each file name with a NUL byte so that tools like
xargs -0 can handle names containing spaces or newlines.
For example, to search for PDFs in the current directory that don’t contain “foo” but contain “bar”, run:
pdfgrep -Z --files-without-match "foo" *.pdf | xargs -0 pdfgrep -H bar
To search for PDFs containing “rilz”, interactively select one with fzf, and open it in the PDF viewer evince, run:
pdfgrep -RilZ rilz | fzf --read0 --print0 | xargs -0 evince
New option: --page-range
This option limits the search to certain pages. For example, to find PDFs that contain “foo” on their title page, run:
pdfgrep --page-range 1 foo *.pdf
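The option is not restricted to a single page. The following is a minimal sketch, assuming that --page-range accepts comma-separated page numbers and dash-separated intervals (check the manpage of your version to confirm); the range value and the search term are made up for illustration:

```shell
# Assumed range syntax: single pages and dash-separated intervals,
# joined by commas. Here: pages 1 to 3 and page 10.
range="1-3,10"

# Search only those pages of every PDF in the current directory.
# pdfgrep exits non-zero when nothing matches (or on error);
# "|| true" keeps a "set -e" script running in that case.
pdfgrep --page-range "$range" foo ./*.pdf || true
```

Keeping the range in a variable makes it easy to reuse the same page selection across several searches in a script.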
New options: --regexp/-e and --file/-f
Since its first release, pdfgrep could only search for a single
pattern. And while it’s possible to combine multiple search strings
into a single regular expression using the | operator, this is
fiddly to do in scripts. Now there are better options (pun intended).
The new --regexp argument can be specified multiple times, and
--file reads a list of patterns directly from a file. Both
can be mixed, and all patterns are combined implicitly with OR.
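As a rough sketch of how the two options combine, the following writes a pattern file and mixes it with an extra -e pattern. The file name patterns.txt and the patterns themselves are hypothetical, and the one-pattern-per-line layout is assumed to follow grep's convention:

```shell
# Hypothetical pattern file, one pattern per line (as with grep -f).
printf '%s\n' 'foo' 'ba[rz]' > patterns.txt

# Patterns from -f and -e are ORed together, so this reports text
# matching "foo", "ba[rz]", or "qux" in any PDF in the current directory.
# pdfgrep exits non-zero when nothing matches (or on error);
# "|| true" keeps a "set -e" script running in that case.
pdfgrep -f patterns.txt -e 'qux' ./*.pdf || true
```

The single-pattern equivalent would be pdfgrep 'foo|ba[rz]|qux' ./*.pdf, which gets harder to build and quote correctly as the list of patterns grows.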
Restructured Documentation
With more and more command line options, the manpage had become a little unwieldy, so we split it into multiple sections, modeled on grep’s own manpage.