News

pdfgrep 2.1.0 "Should Have Been Christmas" released

by Hans-Peter Deifel on April 29, 2018

After a year of waiting, pdfgrep 2.1.0 has finally been released. The tarball can be downloaded from the download page. As always, thanks to everyone who helped with this release.

pdfgrep contributors

This release is packed with new features that bring pdfgrep closer to parity with GNU grep:

New options: --files-with-matches/-l and --files-without-match/-L

These two related options open up new possibilities in scripting. Since they return only file names, not page numbers or matched text, their output can be used as input for other programs or even for pdfgrep itself. As such, they are especially useful in combination with -Z.

For example, to search for PDFs in the current directory that don’t contain “foo” but contain “bar”, run:

pdfgrep -Z --files-without-match "foo" *.pdf | xargs -0 pdfgrep -H bar

To search for PDFs containing “rilz”, interactively select one with fzf and open it in the PDF viewer evince, do:

pdfgrep -RilZ rilz | fzf --read0 --print0 | xargs -0 evince

New option: --page-range

This option limits the search to certain pages. For example, to search for PDFs that contain “foo” on their title page, run:

pdfgrep --page-range 1 foo *.pdf
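The same option can also cover more than one page. The sketch below assumes the range syntax accepts dash ranges and comma-separated entries (please check the manpage of your version); the variable name and the range itself are only illustrative.

```shell
#!/bin/sh
# Hypothetical range: pages 1 through 3, plus page 10.
# Assumption: --page-range accepts dash ranges and comma-separated entries.
pages="1-3,10"

# Only run the search where pdfgrep is actually installed.
if command -v pdfgrep >/dev/null 2>&1; then
    # "|| true" keeps a script going when nothing matched.
    pdfgrep --page-range "$pages" foo ./*.pdf || true
fi
```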

New options: --regexp/-e and --file/-f

Until now, pdfgrep only allowed searching for a single pattern. And while it’s possible to combine multiple search strings into a single regular expression using the | operator, this is fiddly to do in scripts. Now there are better options (pun intended).

The new --regexp argument can be specified multiple times, and --file reads a list of patterns from a file. Both can be mixed, and all patterns are combined implicitly with OR.
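As a minimal sketch of how the two options might be used in a script: the pattern file name patterns.txt and the search strings are made up for illustration, and the file is assumed to hold one pattern per line, as with grep.

```shell
#!/bin/sh
# Hypothetical pattern file with one pattern per line.
patterns_file="patterns.txt"
printf '%s\n' 'foo' 'bar' > "$patterns_file"

# Only run the searches where pdfgrep is actually installed.
if command -v pdfgrep >/dev/null 2>&1; then
    # Two patterns given directly: matches "foo" OR "bar".
    pdfgrep -e 'foo' -e 'bar' ./*.pdf || true

    # The same two patterns, read from the pattern file.
    pdfgrep -f "$patterns_file" ./*.pdf || true

    # Both options mixed: matches "baz" OR anything listed in the file.
    pdfgrep -e 'baz' -f "$patterns_file" ./*.pdf || true
fi
```

The "|| true" only keeps a script going when a search finds nothing; drop it if the exit status matters to the caller.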

Restructured Documentation

With more and more command line options, the manpage got a little unwieldy, so we split it up into multiple sections based on grep's own manpage.