#1
Omnipage Pro 14 - How do I save OCR data in TIFF so I can search document with Home XP Documents search?
I want to scan documents and keep the image for legal reasons, but still be able to search the documents with the Start/Search/Documents utility. I have Omnipage Pro 14 on an XP Home system. Omnipage OCR is much more accurate than MS Document Imaging, so I'd like to use it. However, when I scan a document, OCR it, and save it in TIFF format, XP does not show the file as having been OCR'd, and documents scanned with Omnipage don't show up in the Start/Search/Documents search. Documents scanned and OCR'd with Office Document Imaging are found correctly by Start/Search/Documents.

How do I save my OCR'd Omnipage documents in a TIFF so the data is available to MS Document search? I realize I could save both a TIFF and a searchable Word document, but I'd rather have a single file holding both the picture and the OCR data. I could save as PDF, but I believe XP cannot search a PDF file.

I'd appreciate any advice.

thanks
rj
#3
"RJ" wrote in message
...
> How do I save my OCR'd Omnipage documents in a TIFF so the data is available to MS Document search?

TIFF is an image (picture) format. It is not searchable; it's just dots. I think you mean .RTF, which is a text format that allows different fonts and is searchable. You could also open the files in Word and save them as .DOC. It would be easier to search for the text using *.rtf from the Start button.

Having the original image (picture) and being able to search it requires two files: the text/doc file and the image file.

HTH
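The two-file approach described above can be sketched in Python. This is only an illustration under stated assumptions, not something Omnipage or XP provides: it assumes every scan NAME.tif has its OCR output saved beside it as NAME.txt, and the function name `find_scans` is hypothetical.

```python
from pathlib import Path


def find_scans(folder, phrase):
    """Return the TIFF files whose sidecar .txt file contains the phrase.

    Assumes (hypothetically) that every scan NAME.tif has its OCR text
    saved next to it as NAME.txt.
    """
    needle = phrase.lower()
    matches = []
    for text_file in sorted(Path(folder).glob("*.txt")):
        text = text_file.read_text(encoding="utf-8", errors="ignore")
        image = text_file.with_suffix(".tif")
        # Report the image only if it exists and its OCR text matches.
        if needle in text.lower() and image.exists():
            matches.append(image)
    return matches
```

Calling `find_scans("C:/scans", "lease")` (the folder path is a placeholder) would list the images whose OCR text mentions "lease", so the legal image and the searchable text stay paired by file name.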
#4
RJ wrote:
> How do I save my OCR'd Omnipage documents in a TIFF so the data is available to MS Document search?

You can't. OCR takes you from an image file (like TIFF) to a text file or word processor file. That kind of file can be word-indexed for searches. Going back to an image file loses that benefit: you convert words to pixels and lose the words, and you need to OCR it all over again to get the words back.

If you want to keep the original image, but with the text for searches, PDF (as you note) or perhaps DjVu would be the right choice.

> I could save as PDF, but I believe XP cannot search a PDF file.

Adobe Reader can, and you're going to need it to access the PDF file anyway. If you have very many PDF files, talk to Adobe. I'm almost sure they have a solution. It may not fit your budget, though.

I believe DjVu is now capable of something similar, but I don't think many OCR programs can save to that format directly.

--
Anders Thulin ath*algonet.se http://www.algonet.se/~ath
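To illustrate what "word-indexed" means here, the following is a minimal sketch of an inverted index over OCR text. It is not how XP's search or Adobe's products work internally; the file names, the sample text, and the functions `build_index` and `search` are all made up for the example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical OCR output keyed by the scanned file it came from.
ocr_text = {
    "invoice-001.tif": "Invoice for scanner repair",
    "letter-002.tif": "Letter about the scanner warranty",
}
index = build_index(ocr_text)
print(search(index, "scanner warranty"))  # prints {'letter-002.tif'}
```

The index only exists because the words survive as text. Once a page is stored purely as pixels there is nothing for `build_index` to read, which is the point made above.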
#5
Anders Thulin writes:
>> I could save as PDF, but I believe XP cannot search a PDF file.
> Adobe Reader can, and you're going to need it to access the PDF file anyway. If you have very many PDF files, talk to Adobe. I'm almost sure they have a solution. It may not fit your budget, though.

For a cheap way to search PDF files, you could try SearchWithin from http://www.software995.com/

--
Graham Jones
http://www.visiv.co.uk
#6
"RJ" wrote in message
...
> How do I save my OCR'd Omnipage documents in a TIFF so the data is available to MS Document search?
> I could save as PDF, but I believe XP cannot search a PDF file.

Well, in the first place, TIFF is an image-only format (a picture). To have a searchable document, it must be OCR'ed to a text format such as Microsoft Word or a plain TXT file.

You can also save a searchable PDF from Omnipage Pro 14. To search a PDF you need Adobe Acrobat Reader:
http://www.adobe.com/products/acrobat/readstep2.html

--
CSM1
http://www.carlmcmillan.com
#7
RJ wrote:
> How do I save my OCR'd Omnipage documents in a TIFF so the data is available to MS Document search? I realize I could save both a TIFF and a searchable Word document, but I'd rather have a single file holding both the picture and the OCR data. I could save as PDF, but I believe XP cannot search a PDF file.

Hello rj

There is a product called DocSmart (free download from www.versis.co.uk) which we use to do what it sounds like you are trying to do. DocSmart can do instant full-text search and retrieval on the text content of TIFF, DjVu, and PDF files (and all electronic files as well). You can preview and/or open the files and/or use the built-in Windows Explorer-type functions on them (e.g. Print, Copy, Send, etc.). Bonus: the workstation version is free.

NOTE regarding the DjVu file format: when you create a DjVu file, its OCR is done by the IRIS engine, so it is really accurate. DocSmart searches this OCR'd text content.

We used to use TIFF but had the same problem you are having. Then we needed colour scanning for some documents, so TIFF was no good and we used PDF for a while. However, the files were still too large most of the time, so we now use DjVu for everything and have never looked back. A 300 dpi scanned A4 page in colour is about 50 KB and looks exactly the same as the TIFF or PDF version (24 MB and 5 MB respectively). You view DjVu docs in Internet Explorer.

I can't understand why more people are not going crazy over the DjVu format, as it is a lifesaver for anyone who needs to scan paper or send really small, non-editable electronic files (e.g. a Word, PowerPoint, or CAD document).

I hope this helps.

Cheers
Barry
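As a rough check on the file sizes quoted above, the uncompressed size of a scanned page can be worked out from the paper size and resolution. This sketch assumes 24-bit colour (3 bytes per pixel) and no compression; the function name `raw_page_bytes` is just for illustration, and actual TIFF, PDF, or DjVu sizes depend on the compression used.

```python
MM_PER_INCH = 25.4


def raw_page_bytes(width_mm, height_mm, dpi, bytes_per_pixel):
    """Uncompressed size in bytes of a scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


# A4 is 210 x 297 mm; 24-bit colour uses 3 bytes per pixel.
size = raw_page_bytes(210, 297, 300, 3)
print(f"{size / 2**20:.1f} MiB uncompressed")  # prints 24.9 MiB uncompressed
```

That is in line with the figure of about 24 MB given for the TIFF version, which suggests the TIFF was saved without compression. Taking the 50 KB DjVu figure at face value, the reduction would be on the order of 500 to 1.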