Open Source OCR on macOS

macos sucks

April 15, 2020

macOS Catalina (10.15) broke many things on my computer. The one that has wasted the most of my time so far, while I try to make old PDFs available to vision-impaired students, is that Apple broke Adobe Acrobat Pro and the nasty Creative Cloud license manager that goes with it. I lost most of a day trying to get all that closed-source nastiness working again before deciding to solve the problem the way I always should have: with open source software. This little script uses Ghostscript and Tesseract to turn an image-only PDF into an OCR'd version of that PDF.
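
For readers who want to see the general shape of such a pipeline, here is a rough Python sketch. It is not the script from this post, just one plausible way to wire the two tools together. It assumes `gs` and `tesseract` are installed and on the PATH; the function names, the 300 dpi resolution, and the `eng` language are illustrative choices of mine. The idea is to have Ghostscript render each page to a PNG, have Tesseract turn each PNG into a one-page PDF with a hidden text layer, and then have Ghostscript stitch those pages back together.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_cmd(pdf, out_dir, dpi=300):
    """Ghostscript command that renders every page of `pdf` to a numbered PNG."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=png16m", f"-r{dpi}",
        f"-sOutputFile={Path(out_dir) / 'page-%04d.png'}",
        str(pdf),
    ]


def ocr_cmd(image, lang="eng"):
    """Tesseract command that writes <image stem>.pdf with a searchable text layer."""
    image = Path(image)
    # Tesseract appends ".pdf" to the output base itself.
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang, "pdf"]


def merge_cmd(pages, out_pdf):
    """Ghostscript command that concatenates the per-page PDFs into one file."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
        f"-sOutputFile={out_pdf}", *[str(p) for p in pages],
    ]


def ocr_pdf(pdf, out_pdf, dpi=300, lang="eng"):
    """Run the three steps in a temporary directory (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(rasterize_cmd(pdf, tmp, dpi), check=True)
        # Zero-padded names keep the pages in order when sorted.
        images = sorted(Path(tmp).glob("page-*.png"))
        for image in images:
            subprocess.run(ocr_cmd(image, lang), check=True)
        pages = [image.with_suffix(".pdf") for image in images]
        subprocess.run(merge_cmd(pages, out_pdf), check=True)
```

Calling `ocr_pdf("scan.pdf", "scan-ocr.pdf")` would then produce the searchable copy. Building the commands in separate functions keeps them easy to inspect or tweak (a different resolution or language, say) without touching the part that actually runs them.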