Heyo! Can anyone recommend a free (as in beer) option for converting image-only PDFs into OCR'd PDFs [1]? French support + macOS required, FLOSS preferred.

[1]: I'm not sure I'm being very clear 😕. Here's my use case: I have an app on my phone that scans documents to PDFs, but it doesn't do any OCR. I also have a bunch of digital documents for which I don't have a paper version anymore. I'd like to OCR these documents to make them searchable and allow copy/paste.

@Crocmagnon I have used PDFSandwich

http://www.tobias-elze.de/pdfsandwich/

OCR is done by Tesseract, which isn't top grade but works for me.
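For a whole folder of scans, something like this rough Python sketch could wrap pdfsandwich. It assumes `pdfsandwich` is on your PATH, that the French Tesseract language data (`fra`) is installed, and that pdfsandwich writes its result next to the input as `<name>_ocr.pdf` (its default, as far as I know). The `build_command` and `ocr_folder` helpers are made up for illustration, not part of pdfsandwich.

```python
import subprocess
from pathlib import Path


def build_command(pdf: Path, lang: str = "fra") -> list[str]:
    """Build the pdfsandwich call for one file (-lang picks the Tesseract language)."""
    return ["pdfsandwich", "-lang", lang, str(pdf)]


def ocr_folder(folder: Path, lang: str = "fra", dry_run: bool = False) -> list[list[str]]:
    """Run pdfsandwich on every PDF in `folder`; return the commands used.

    Files whose name already ends in `_ocr` are skipped, on the assumption
    that they are output from an earlier run.
    """
    commands = []
    for pdf in sorted(folder.glob("*.pdf")):
        if pdf.stem.endswith("_ocr"):
            continue
        cmd = build_command(pdf, lang)
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError if pdfsandwich fails on a file.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `ocr_folder(Path("scans"), dry_run=True)` first just lists what would be run, which is handy for checking the file selection before letting Tesseract chew through everything.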

@Crocmagnon @mike I recommend this service: doxisafe.me/#/safe/start
They have their own AI named Deeper (it's not Google's Tesseract), plus an extra cloud service for free. Best I've found so far.

@Crocmagnon if you don't need them to be stored directly on your phone, you can self-host paperless-ng! It's a great app I self-host at home, and it has mobile apps that let you upload scans directly from your phone. The machine hosting it then does the OCR, and the web interface lets you search through tags or OCR'd content.

@Crocmagnon This might be quite a bit of overkill, but I had paperless running for a couple of years and it did its job wonderfully. It's a server setup that ingests everything in a folder, OCRs it, and files it for you. While the original isn't maintained anymore, there is github.com/jonaswinkler/paperl nowadays, though I have not tried this fork.

Fosstodon

Fosstodon is an English speaking Mastodon instance that is open to anyone who is interested in technology; particularly free & open source software.