
Easy OCR

14-10-2019 by Andrew Parrott


Google Drive makes it easy to extract text from scanned documents.

On several of the projects I work on, I often need to extract a portion of text from a PDF document and paste it into another document or web page.

If the PDF was originally exported from a word processor, this is no problem: the text can simply be selected and copied. However, if the PDF came from a scanner, each page is just an image, with no selectable text.

Some scanners include OCR (Optical Character Recognition) software, which will extract the text from a scanned document. If you don't have any software to do this, you can use Google Drive instead.

If you have a Google account, you already have Google Drive. If not, you will first need to create a free Google account.

To extract text from any PDF, proceed as follows:

Log in to your Google Drive account and click the New button at the top left of the screen.


On the menu that appears, click File Upload.

Select the file to upload and then click Open.


You'll see a message in the bottom left-hand corner of the screen showing the progress of the upload. Once the upload is complete, click on the filename to view it.


Select Open with Google Docs from the dropdown list at the top centre of the viewer screen.

The file opens as a Google Doc containing the recognised text, so you will now be able to select and copy any text in the document.
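If you have a lot of scans to process, the same steps can in principle be scripted against the Google Drive REST API instead of clicking through the web interface. The sketch below is a rough illustration rather than a finished tool: it only builds the upload request and does not send it. It assumes you already have an OAuth access token with Drive access (getting one is not covered here), and the function names `build_ocr_upload_request` and `export_text_url` are my own, not part of any Google library. The key idea, as I understand the Drive documentation, is that asking Drive to store the upload as a Google Docs document is what triggers the conversion and OCR.

```python
import json
import urllib.request
import uuid

# Drive API v3 multipart upload endpoint.
UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart"
# Requesting this target type asks Drive to convert the upload to a Google Doc.
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def build_ocr_upload_request(pdf_bytes, name, access_token):
    """Build (but do not send) a request that uploads a PDF for conversion.

    access_token is assumed to be a valid OAuth 2.0 token for the Drive API.
    """
    boundary = uuid.uuid4().hex
    metadata = json.dumps({"name": name, "mimeType": GOOGLE_DOC_MIME})
    # multipart/related body: part 1 is the JSON metadata, part 2 is the PDF.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b"Content-Type: application/json; charset=UTF-8\r\n\r\n",
        metadata.encode("utf-8"),
        f"\r\n--{boundary}\r\n".encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": f"multipart/related; boundary={boundary}",
        },
    )


def export_text_url(file_id):
    """URL for exporting the converted Google Doc as plain text."""
    return (
        "https://www.googleapis.com/drive/v3/files/"
        f"{file_id}/export?mimeType=text/plain"
    )
```

To actually run the upload you would pass the request to `urllib.request.urlopen`, read the new file's `id` from the JSON response, and then fetch `export_text_url(file_id)` with the same `Authorization` header. For a one-off document the web interface described above is much quicker.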

The Optical Character Recognition is not 100% accurate, and its accuracy depends on the quality of the original scan. The best results come from higher-resolution scans of typed text; handwritten text gives mixed results.
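Because the recognised text usually keeps the line breaks of the scanned page, it often needs a little tidying before it can be pasted elsewhere. The following is a minimal sketch of that kind of clean-up, using a hypothetical helper called `tidy_ocr_text`. It does not correct misread characters; it only rejoins words split across lines and removes stray line breaks and spaces.

```python
import re


def tidy_ocr_text(text):
    """Light clean-up of OCR output before pasting it elsewhere."""
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Replace single line breaks with spaces, keeping blank lines
    # (paragraph breaks) intact.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()
```

Note that the first rule will also remove a genuine hyphen that happens to fall at the end of a line (for example "well-" followed by "known"), so the result is still worth a quick read through.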
