Anybody having success converting a scanned PDF to a usable Word document?
I have Acrobat 9 Pro and a scanned PDF. I ran optical character recognition on it (an Acrobat function) and then did a Save As to a Word document (also tried an export). It makes a Word document all right, just with countless text boxes instead of paragraphs as they would be in Word. I tried some online conversion software, but those are even worse. One of them turns each PDF page into a picture on a Word page. Interesting, but not useful.
Looking at upgrading to Acrobat X Pro, but cannot determine if it does this job any better than 9 Pro.
I hate the text box conversions. I've had much better luck using Nuance PDF Creator when going from PDF to Word.
I've never had any luck with this. If anyone has a good way to do it I would love to know!
After doing the OCR, have you tried to just copy/paste the text into Word? It won't really have formatting to it, but it's more useful than dealing with text boxes. I copy/paste text from policy forms into emails/documents on an occasional basis with decent luck. Granted, it's usually a couple of paragraphs or less.
Quote from: Jim Jensen on October 11, 2011, 09:45:19 AM
After doing the OCR, have you tried to just copy/paste the text into Word? It won't really have formatting to it, but it's more useful than dealing with text boxes. I copy/paste text from policy forms into emails/documents on an occasional basis with decent luck. Granted, it's usually a couple of paragraphs or less.
Hmmm, never thought about doing that. Will try that next time.
Quote from: Jim Jensen on October 11, 2011, 09:45:19 AM
After doing the OCR, have you tried to just copy/paste the text into Word? It won't really have formatting to it, but it's more useful than dealing with text boxes. I copy/paste text from policy forms into emails/documents on an occasional basis with decent luck. Granted, it's usually a couple of paragraphs or less.
That is indeed what I ended up doing. Just a lot of formatting to do. I could find & replace the carriage returns/linefeeds, but that would make a really interesting mess. This is one of those A-1-b-iii kind of docs. Quite a few pages too. Good exercise for someone like me who knows Excel but has despised Word since AmiPro. At least this one is not for the TAM/Word interface.
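For what it's worth, the cleanup after a copy/paste can be partly scripted instead of doing a blind find & replace on line breaks. Below is a rough Python sketch, not anything built into Acrobat or Word: it joins hard-wrapped lines back into paragraphs but starts a new paragraph whenever a line begins with an outline marker. The function name unwrap_ocr_text and the OUTLINE_MARKER pattern are made up for illustration, and the pattern is only a guess at what A-1-b-iii style numbering looks like after OCR.

```python
import re

# Hypothetical pattern for outline markers such as "A.", "1.", "b)", "iii." or "(iv)".
# This is an assumption; adjust it to the numbering the document actually uses.
OUTLINE_MARKER = re.compile(r"^\(?([A-Za-z]|\d{1,3}|[ivxlc]{1,5})[.)]\s+")


def unwrap_ocr_text(text):
    """Join hard-wrapped lines into paragraphs, keeping outline items separate."""
    paragraphs = []
    current = []  # lines collected for the paragraph being built

    for raw in text.splitlines():
        line = raw.strip()

        if not line:
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue

        if OUTLINE_MARKER.match(line) and current:
            # A new outline item starts here, so close the previous paragraph.
            paragraphs.append(" ".join(current))
            current = []

        current.append(line)

    if current:
        paragraphs.append(" ".join(current))

    # One paragraph per line, ready to paste into Word.
    return "\n".join(paragraphs)
```

Save the OCR text to a file, run it through the function, then paste the result into Word. It does not repair words hyphenated across line breaks, and a wrapped line that happens to start with something like "a." would be mistaken for a new item, so a proofread is still needed.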
Yesterday the TAM/Word connection blew up while I was setting up a 44-page document with fields from customer, policy and custom dec. The ASWord feature just went caplooey. It was then that I realized that the ASWord.dot is from 1997 (Word 6!) and how sad it was that no new model had been constructed since then.
Hans
I did a large project recently and used RTF rather than Word doc as the Export To format, and it worked out pretty well. I still had cleanup to do, but it saved me a lot of work.
You might also want to see if PaperPort is any better at it than Adobe is, and try both RTF, as Jan mentioned, and the Word formats. I would guess that since PaperPort is made by Nuance, it might have the same conversion engine as the PDF to Word product.
I'll bet if agency management systems really made the caplooey noise or actually exploded...they would get updated a LOT faster...
Quote from: Jeff Golas on October 13, 2011, 12:08:13 PM
I'll bet if agency management systems really made the caplooey noise or actually exploded...they would get updated a LOT faster...
...considering downloading an interesting sound and installing that as the default 'beep' for when TAM error messages come up. Maybe one could set a policy to enforce this sound. Maybe changing the contents of the file as need be.
I recall doing something with Windows 3.x and subliminal messages that would flash on the screen.
Quote from: HMan on October 13, 2011, 12:27:15 PM
Quote from: Jeff Golas on October 13, 2011, 12:08:13 PM
I'll bet if agency management systems really made the caplooey noise or actually exploded...they would get updated a LOT faster...
...considering downloading an interesting sound and installing that as the default 'beep' for when TAM error messages come up. Maybe one could set a policy to enforce this sound. Maybe changing the contents of the file as need be.
I recall doing something with Windows 3.x and subliminal messages that would flash on the screen.
Back on Windows 3.11, I changed a programmer's PC at the bank where my wife worked to substitute Homer Simpson's 'doh' for every Windows error sound. Keep trying to type when you can't, and it came out "doh doh doh doh..."
Here's my trick to get a scanned PDF into Word:
Use Fax@ or another fax printer to get the PDF into TIF.
Open the TIF in MS Office Document Imaging.
Go to Tools - Send text to Word
It will ask where to save the Word doc.
It will prompt to run OCR. Say OK.
It should run OCR and then open Word.
Depending on the original, you may or may not get something you can actually use. Just today I had to use this with a scanned audit to get it into Excel for the client. It kept the info in a table format that could be pasted back into Excel, and it was a lot cleaner than I expected. The main problems came from the original, which had been faxed, so OCR couldn't make out all the characters.
Just another idea to try.....
That's a great tip Debbie, thanks, didn't know about that.
I use ABBYY FineReader 10.0. It does a great job of converting PDF to Word, either from a file or from a scanner.