An Introduction to Digitisation Part 2 of 2: Processing the Scanned Image

In Part 1 of this series I looked at the basics of scanning a book or article. In this part I look at how to process the scanned images into OCRed PDF files.

1) Rotating and cropping
Processing can be done straight after scanning or in a batch later. Using Advanced TIFF Editor, rotate the images (using Select All [CONTROL + A] first). Resize the image so that you can see the whole page on the screen, then use the crop tool to cut it down to the required size.
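If you ever want to script the cropping step instead of doing it by hand, the arithmetic is simple. The following is only a rough Python sketch, not part of the workflow described here: the function name `crop_box`, the page size and the margin values are all hypothetical, and it assumes a scan resolution of 300 dpi. It converts the margins you want to trim (in inches) into a pixel box in the (left, upper, right, lower) order that image libraries such as Pillow expect.

```python
def crop_box(width_px, height_px, trim_in, dpi=300):
    """Return (left, upper, right, lower) in pixels after trimming margins.

    trim_in is (left, top, right, bottom) in inches. The 300 dpi default
    and the example values below are assumptions for illustration only.
    """
    # Convert each margin from inches to whole pixels.
    left, top, right, bottom = (round(t * dpi) for t in trim_in)
    box = (left, top, width_px - right, height_px - bottom)
    # Refuse margins that would remove the whole page.
    if box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError("trim margins leave nothing of the page")
    return box


# A page scanned at 300 dpi as 2550 x 3300 pixels (8.5 x 11 inches),
# trimming half an inch from each side and a quarter inch top and bottom.
print(crop_box(2550, 3300, (0.5, 0.25, 0.5, 0.25)))  # (150, 75, 2400, 3225)
```

The same box can then be reused for every page of a book scanned at the same settings, which is the main attraction of doing the calculation once.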

2) Cleaning up gutters, spots and shadows
Using the cut tool, cut out gutter-shadows, dark edges, spots or other marks. Once all the pages are completed, save the file and exit the program.

3) Converting to an OCRed PDF

Open Adobe Acrobat and use the “Recognise Text in Multiple Files” tool to select all the finished files. Select a 300 dpi overlay and run the tool.

4) Final optimisation and output

The output PDF file is saved automatically to your disk. Check the file size and the quality. If the image quality is poor or the file seems excessively large (experience will tell you what to expect), run Adobe’s file optimisation option.
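When a batch produces many files, checking sizes one at a time gets tedious. As a minimal sketch (assuming Python is available; the function name `oversized_pdfs` and the 20 MB threshold are hypothetical, so choose a limit that matches your own experience), the following lists the PDFs in a folder that exceed a size limit, largest first.

```python
from pathlib import Path


def oversized_pdfs(folder, max_mb=20.0):
    """Return [(name, size_mb)] for PDFs in folder larger than max_mb.

    The 20 MB default is an arbitrary assumption, not a recommendation.
    Note that "*.pdf" is matched case-sensitively on some systems.
    """
    results = []
    for pdf in Path(folder).glob("*.pdf"):
        size_mb = pdf.stat().st_size / (1024 * 1024)
        if size_mb > max_mb:
            results.append((pdf.name, round(size_mb, 1)))
    # Largest files first, so the worst offenders are at the top.
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Check the current folder and print any files worth optimising.
    for name, size_mb in oversized_pdfs("."):
        print(f"{name}: {size_mb} MB")
```

This only reports file size; image quality still has to be judged by eye.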

I make no claim that this is the definitive introduction, but it is based on several years’ experience acquired using the hardware and software described. Can anyone contribute any insights they have gained in doing similar work? If so, please leave a comment.

