by anandasim » Thu Dec 10, 2009 7:13 pm
Hi Joanne,
1. If you have a paper document and you are going to scan it, use an OCR (Optical Character Recognition) program. Many scanners nowadays, even the combo printer/scanner units under AUD 100, come with OCR as well as Scan to PDF. For example, my HP 5510, a few years old, came with a French freebie OCR program which I have never used, and my new low-end Kodak combo unit, the ESP-3 costing AUD 70+, has Scan to RTF as well as Scan to PDF. It doesn't easily strike you, but RTF is actually Rich Text Format, readable by Word and other word processing programs, so when you scan to RTF, OCR is being carried out.
2. If you scan to PDF, the issue is that PDF is a page layout format, not a word processing format. The document may or may not store text at all, and where it does, the text sits in strips positioned on the page, not word-wrapped into nice paragraphs. For page layout and printing this is fine, and you can do things like redaction, but for pulling the content out as text that you can edit in a word processor, no, it's not good.
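To illustrate the point about strips: text copied out of a PDF usually arrives as one short line per strip, so it has to be rejoined into paragraphs before it is comfortable to edit. Here is a minimal Python sketch of that rejoining step. It assumes a blank line marks a paragraph break and that a hyphen at the end of a line means a word was split across lines. The function name reflow is made up for this example, and a real PDF will likely need more care than this.

```python
def reflow(text):
    """Join hard-wrapped lines into paragraphs; blank lines separate paragraphs."""
    paragraphs = []
    current = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line: close the paragraph in progress, if any.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if not current:
            current = line
        elif current.endswith("-"):
            # Assumption: a trailing hyphen marks a word split across two lines.
            current = current[:-1] + line
        else:
            current = current + " " + line
    if current:
        paragraphs.append(current)
    return "\n\n".join(paragraphs)


sample = (
    "PDF keeps text in strips\n"
    "on the page, not in para-\n"
    "graphs.\n"
    "\n"
    "A second paragraph\n"
    "starts here."
)
print(reflow(sample))
```

The hyphen rule is deliberately crude: it would also glue together a genuinely hyphenated word such as "low-end" if the line happened to break at the hyphen, so check the result by eye.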
3. If your scanner software has been lost, or it does not appear that you have OCR capability, you might try scanning the pages as black and white TIFF files (they can be quite large), then collect all those 100 TIFF files and take them to a machine with OCR facilities at a university library or a public library. Alternatively, there may be some businesses that offer this kind of scanning direct from paper.
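Before carrying a pile of TIFF files anywhere, it helps to check that every page is there, in order, and how much space they take. This is a small Python sketch of that check, using only the standard library. The helper names page_key and list_scans are made up for the example, and it assumes the scans sit in one folder with the page number somewhere in each file name (page1.tif, page2.tif and so on).

```python
import re
from pathlib import Path


def page_key(path):
    """Sort key that compares the digits in a file name as numbers."""
    parts = re.split(r"(\d+)", path.name.lower())
    return [int(p) if p.isdigit() else p for p in parts]


def list_scans(folder):
    """Return the .tif/.tiff files in folder, in page order."""
    files = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in (".tif", ".tiff")
    ]
    return sorted(files, key=page_key)


if __name__ == "__main__":
    # Assumption: the script is run from the folder that holds the scans.
    scans = list_scans(".")
    total_mb = sum(p.stat().st_size for p in scans) / 1_000_000
    for p in scans:
        print(p.name)
    print(f"{len(scans)} files, about {total_mb:.1f} MB in total")
```

Sorting by the number in the name keeps page10.tif after page2.tif, which a plain alphabetical listing would get wrong.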