Showing posts with label ocr. Show all posts

Sunday, October 29, 2017

No more ExactImage

I've grown tired of trying to package ExactImage for Fedora, as the Enlightenment bindings it uses no longer seem to be packaged. As of 0.30.0, pdfocr uses the easier-to-package hocr-tools library for wrangling hOCR files. The Copr repository has been updated accordingly.

Wednesday, March 18, 2015

Quick and dirty PDF OCR in Fedora 21

Every so often you wind up with a scanned PDF of mostly text. Performing OCR allows you to both search and select text in the document. For one-off scans, a quick OCR pass is usually sufficient (no training, no skew correction, etc.). For those cases, I usually recommend pdfocr. Unfortunately for Fedora users, pdfocr uses PDFtk (which Fedora does not package due to its dependence on gcj) and ExactImage, which no longer has a maintainer. Read on for a quick workaround.