Wednesday, April 23, 2008

Use OCR software to copy-paste from printed books

Copy-paste is the most used and misused function among all computer users. Thanks to Apple, which popularized this functionality on personal computers, copying and pasting has saved us hours of typing and editing. But the copy-paste we are familiar with works only in the digital medium. What if you want to copy a paragraph from a textbook, or digitize an entire book so that you can read it on your cell phone or laptop? How do you copy from a printed textbook?

The answer is OCR (optical character recognition) software. OCR software translates scanned images of typewritten, printed or even handwritten text into machine-readable text. It uses a combination of technologies such as pattern recognition, artificial intelligence and machine vision to "guess" the characters in the image, and then outputs a text version of the scanned image. Today's OCR software can be up to 99% accurate on clean scans.
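To make the "guessing" step concrete, here is a minimal sketch of the simplest form of pattern recognition: comparing each scanned glyph against stored templates and picking the closest one. This is a toy illustration, not how any of the products mentioned here actually work. The 3x5 glyph shapes and the names TEMPLATES, mismatches, guess_character and recognize are made up for the example.

```python
# Toy template-matching OCR. Each glyph is a 3x5 bitmap,
# where '#' is ink and '.' is blank paper.
# The glyph shapes below are invented for illustration only.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def mismatches(glyph, template):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def guess_character(glyph):
    """Return the template letter with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda letter: mismatches(glyph, TEMPLATES[letter]))


def recognize(glyphs):
    """Guess one letter per glyph and join the guesses into a string."""
    return "".join(guess_character(g) for g in glyphs)


# A "T" with one pixel lost in scanning (row 4 is blank).
noisy_t = ["###", ".#.", ".#.", "...", ".#."]

print(guess_character(noisy_t))  # prints T
print(recognize([TEMPLATES["L"], TEMPLATES["O"], noisy_t]))  # prints LOT
```

The damaged glyph is still read as "T" because it differs from the "T" template by one pixel and from every other template by more. Real OCR engines add many more steps, such as finding lines and characters on the page and checking the result against a dictionary.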

Only a handful of the available OCR applications are freeware. SimpleOCR, TopOCR and FreeOCR are some of the free ones, while ABBYY FineReader and OmniPage are among the best commercial products available.

SimpleOCR converts your scanned images to text files or Word documents. It also comes with a developer toolkit that can be used to add OCR to custom software applications. SimpleOCR lets you manually specify image zones for scanning and supports English, French and Dutch dictionaries.


FreeOCR is a complete scan-and-OCR program that includes a Windows-compiled build of Tesseract, the free OCR engine maintained by Google. FreeOCR supports multi-page TIFFs and fax documents, as well as most image types, including compressed TIFFs. It also supports manual zoning, which lets you select an area to process. This increases accuracy by eliminating borders, pictures and so on, and makes the software usable for documents that contain columns.
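The idea behind manual zoning can be sketched in a few lines: cut the chosen rectangle out of the page and hand only that region to the OCR engine. This is an illustrative sketch, not FreeOCR's own code. The page is modelled as a small grid of characters, and the function name crop_zone and its parameters are hypothetical.

```python
def crop_zone(image, left, top, right, bottom):
    """Return the part of a 2-D pixel grid that lies inside a zone.

    The zone covers columns left..right-1 and rows top..bottom-1.
    """
    width, height = len(image[0]), len(image)
    if not (0 <= left < right <= width and 0 <= top < bottom <= height):
        raise ValueError("zone must lie inside the image")
    return [row[left:right] for row in image[top:bottom]]


# A tiny "scanned page": '#' is a dark border, 'H' and 'I' stand in for text.
page = [
    "########",
    "#......#",
    "#.HI...#",
    "#......#",
    "########",
]

# Select everything inside the border, so the border never reaches the OCR step.
zone = crop_zone(page, left=1, top=1, right=7, bottom=4)
for row in zone:
    print(row)
```

For a page with two columns, the same function could be called once per column, so that each column is read from top to bottom instead of line by line across the whole page.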


TopOCR supports digital cameras, camera phones and scanners, and can recognize 11 different languages. Unlike OCR applications designed for scanners, TopOCR takes into account that images from digital cameras and smartphones vary more in quality. As a result, its OCR processing is more robust and uses sophisticated image processing algorithms that older OCR programs lack.


There is also an online OCR service called Qipit. Take a photo of a document with a digital camera or camera phone and upload it to Qipit. You'll then receive a link to the online digital copy of your document.

1 comment:

  1. Is there OCR software that lets you paste an image into it and extract the text?

