How to Convert an Image to Editable Text with Tesseract OCR on Ubuntu

First, watch this video for the full meal deal.

If you just want the quick and dirty version of how to turn a pretty bad image or photo (or a pretty good one) into a .txt file, here we go:

1. Install Tesseract

sudo apt install tesseract-ocr
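
If you want to double check that the install worked before moving on, something like this should do it. It's only a sketch: have_cmd is a helper name I made up, and it assumes the tools go by their usual names (tesseract and convert) on your system.

```shell
#!/bin/sh
# Report whether a command is available on PATH (have_cmd is a made-up helper name).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Check both tools used in this post.
for tool in tesseract convert; do
    if have_cmd "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

# If Tesseract is there, show its version and the languages it has installed.
if have_cmd tesseract; then
    tesseract --version
    tesseract --list-langs
fi
```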

2. Convert the image (e.g. .jpg, .pdf, etc.) into a .tiff file with ImageMagick to make it ready for Tesseract

Ubuntu (I think) comes with ImageMagick, so the ‘convert’ command should already be there. If it isn’t, sudo apt install imagemagick will add it. Anyways, here is what I did:

Navigate to the folder with the source images in a terminal.
Convert the image into a TIFF file with this command from the video (adjust the file names accordingly):
convert -density 300 IMG_input_image_1234.jpg -depth 8 -strip -background white -alpha off IMG_output_image_1234.tiff

Now you should have IMG_output_image_1234.tiff in your directory.
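
If you have a whole folder of photos, the same convert command can be wrapped in a loop. This is a rough sketch, assuming the images are .jpg files in the current directory. tiff_name is just a helper name I made up to swap the extension.

```shell
#!/bin/sh
# Derive the output name by swapping the extension: photo.jpg -> photo.tiff
tiff_name() {
    printf '%s.tiff\n' "${1%.*}"
}

# Convert every .jpg in the current directory with the same flags as the single-file command.
for img in *.jpg; do
    [ -e "$img" ] || continue   # nothing matched, so skip the literal "*.jpg"
    convert -density 300 "$img" -depth 8 -strip -background white -alpha off "$(tiff_name "$img")"
done
```

Each TIFF ends up next to its source image with the same base name.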

3. Convert the TIFF to TEXT

tesseract IMG_output_image_1234.tiff IMG_output_text_1234

Now you should have IMG_output_text_1234.txt in your directory.

Note that I didn’t add .txt to the output name in the command. Tesseract adds the .txt extension itself.
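
The same idea works for a batch of TIFFs. This is a sketch, assuming the .tiff files sit in the current directory. out_base is a made-up helper that strips the extension, since Tesseract wants the output name without .txt.

```shell
#!/bin/sh
# Strip the extension to get the output base name: scan.tiff -> scan
# (Tesseract appends .txt to this on its own.)
out_base() {
    printf '%s\n' "${1%.*}"
}

# Run OCR on every .tiff in the current directory.
for tif in *.tiff; do
    [ -e "$tif" ] || continue   # nothing matched, so skip the literal "*.tiff"
    tesseract "$tif" "$(out_base "$tif")"
done
```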

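Note also that the default language is English. For another language, install its language pack (for example, sudo apt install tesseract-ocr-deu for German) and pass the language code with the -l flag, as in -l deu. You can also combine languages in one run by joining the codes with a plus sign, as in -l eng+deu.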
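
For what it's worth, here is roughly what the language part looks like on the command line. As far as I can tell, Tesseract's -l flag takes a language code and also accepts several codes joined with a plus sign. Treat this as a sketch: join_langs is a made-up helper, eng and deu are only example codes, and it assumes the matching language packs (such as tesseract-ocr-deu) are installed.

```shell
#!/bin/sh
# Join language codes with "+" for the -l flag: eng deu -> eng+deu
# (join_langs is a made-up helper name.)
join_langs() {
    ( IFS=+; printf '%s\n' "$*" )
}

langs=$(join_langs eng deu)   # example codes, swap in the ones you need

# Only run if Tesseract and the example file from this post are actually there.
if command -v tesseract >/dev/null 2>&1 && [ -e IMG_output_image_1234.tiff ]; then
    tesseract IMG_output_image_1234.tiff IMG_output_text_1234 -l "$langs"
fi
```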

Hope this helps

Tagged convert, free, image, ocr, tesseract, text

Wayne Out There

Stuff that matters to Wayne…

About justadminnit