anyone know a good pdf to text conversion program

  • anyone know a good pdf to text conversion program

    I need to edit some scanned PDF files, so I need to run them through OCR and convert them to text first. There are lots of converters online, but I've tried a few and none of them work well enough.

    thanks
    Please donate a dollar a day at http://justadollarplease.org.
    Copy and paste this message to the bottom of your signature.

    Thanks!

  • #2
    I use PDF Converter Pro, available from Nuance. It handles a wide range of PDF conversion types, and I use it a lot at work for PDF-related tasks. It converts to MS Word and Unicode text formats. It ain't free, but you can get a 30-day trial version from cnet.com.



    • #3
      Thanks gregorious, I'll check it out. But I spent a few hours typing away like a madman, so I think I'm covered.


      • #4
        https://www.google.com/search?q=conv...hrome&ie=UTF-8

        An interesting article, especially given that there is an API. You could write a script to do it in bulk if that is what you are looking for.



        • #5
          "FreeOCR is a free Optical Character Recognition Software for Windows and supports scanning from most Twain scanners and can also open most scanned PDF's and multi page Tiff images as well as popular image file formats. FreeOCR outputs plain text and can export directly to Microsoft Word format."

          http://www.paperfile.net/download.html

          I use it. It's lightweight and simple. You'll likely need to clean up the results some, but it sure beats typing. And the price is right too. It will open and convert .pdf files.



          • #6
            For Mac I use a program called "Solid PDF to Word". It works well and is accurate even if there are images, logos, etc.
            Disclaimer: Answers, suggestions, and/or comments do not constitute medical advice expressed or implied. Please consult your attending physician for medical advice and treatment. In the event of a medical emergency please call 911.



            • #7
              http://download.cnet.com/PDF24-Creat...ml?tag=mncol;3
              This is one that I use.
              C4 incomplete since 1985



              • #8
                Thanks everybody. I only needed sections of two PDF files, so I typed them out. Took me half an hour, no biggie.


                • #9
                  Xpdf tools

                  Xpdf is a safe PDF reader, and it comes with other command-line tools too.

                  To read PDFs on a terminal I often use:

                  Code:
                  pdftotext -layout -enc Latin1 -eol unix -nopgbrk file.pdf - | less -S
                  You can also set LESSOPEN so that less reads PDFs through this command.

                  However, it does sometimes mash paragraphs together.

                  I also use GNU less with the -SR options to read groff -ms -Tlatin1 output. The -S option chops long lines instead of wrapping them, which keeps the layout intact. The -R option is required for vt100 attributes. Run it inside screen if your terminal is not a vtXXX, which is what I do on my ibm3151.

                  See man pages xpdf(1), pdftops(1), pdftotext(1), pdfinfo(1), pdffonts(1), pdftoppm(1), pdfimages(1), xpdfrc(5) and http://www.foolabs.com/xpdf/
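
                  If you have a whole directory to do, the pdftotext command above can be wrapped in a loop. This is only a rough sketch: pdf_dir_to_text is a name I made up for the example (it is not part of Xpdf), and it assumes pdftotext is on your PATH. Keep in mind that pdftotext only pulls out text that is already stored in the PDF, so a scan with no text layer still needs to go through an OCR program first.

                  ```shell
                  #!/bin/sh
                  # pdf_dir_to_text: hypothetical helper, not part of Xpdf.
                  # Converts every *.pdf in a directory to a .txt file next to it.
                  pdf_dir_to_text() {
                      dir=${1:-.}                      # default to the current directory
                      for pdf in "$dir"/*.pdf; do
                          [ -e "$pdf" ] || continue    # glob matched nothing, skip
                          txt="${pdf%.pdf}.txt"        # file.pdf -> file.txt
                          # Same flags as above, but writing to a file instead of stdout
                          pdftotext -layout -enc Latin1 -eol unix -nopgbrk "$pdf" "$txt" \
                              || echo "failed: $pdf" >&2
                      done
                  }
                  ```

                  Source the file and run pdf_dir_to_text /path/to/pdfs. Any file that fails to convert is reported on stderr and the loop carries on with the next one.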
                  http://zagam.net/



                  • #10
                    For future reference, Adobe Acrobat Pro has built-in OCR.

                    I also use ABBYY FineReader Pro (http://finereader.abbyy.com/professional/).

                    Both are a bit pricy for casual use.



                    • #11
                      One more option online is Zamzar. It converts pretty much any type of file to any other type.

                      It is free with some restrictions.

                      http://www.zamzar.com/conversionTypes.php#documents



                      • #12
                        thanks chris, have you tried it?


                        • #13
                          ABBYY FineReader is awesome, and Microsoft OneNote has built-in OCR.



                          • #14
                            Originally posted by rdf:
                            thanks chris, have you tried it?
                            Yup, Zamzar is safe. I use it when someone sends me an odd file type I can't read or open.

                            Zamzar sends you an email with a link to the converted file, so you will need to provide an email address. So far (over 4 years) no spam.
