
Author Topic: images into pdf text, helps?  (Read 1057 times)


Offline HHC

images into pdf text, helps?
« on: December 13, 2014, 01:18 PM »
Anyone know a good way of putting images (of text) into a READABLE pdf file?

I read on the internetz that you can do it by opening the images in Preview and then printing them to a PDF writer, but I can't seem to get it to work for more than 1 page, and the quality was shitty too.  :(

The text images are black and white, so that's the format for the PDF too... What's a reasonable size for a PDF file of this format when there are about 200 pages in it  ???
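
For the size question above, here is a rough back-of-the-envelope sketch in Python. It assumes A4 pages scanned at 300 dpi and stored at 1 bit per pixel. The function names and the compression ratio of 15 are illustrative assumptions, not measurements; the real ratio depends on the page content and on the codec the PDF writer uses.

```python
def page_pixels(width_in: float, height_in: float, dpi: int) -> int:
    """Number of pixels in one scanned page of the given physical size."""
    return round(width_in * dpi) * round(height_in * dpi)


def bilevel_pdf_size_mb(pages: int,
                        width_in: float = 8.27,    # A4 width in inches (assumed)
                        height_in: float = 11.69,  # A4 height in inches (assumed)
                        dpi: int = 300,
                        compression_ratio: float = 1.0) -> float:
    """Estimated size in MB of a black-and-white (1 bit per pixel) scan."""
    bits = pages * page_pixels(width_in, height_in, dpi)
    raw_bytes = bits / 8  # 8 pixels fit into one byte at 1 bit per pixel
    return raw_bytes / compression_ratio / 1_000_000


if __name__ == "__main__":
    print(f"uncompressed: {bilevel_pdf_size_mb(200):.1f} MB")
    # compression_ratio=15 is a guess for illustration only
    print(f"compressed:   {bilevel_pdf_size_mb(200, compression_ratio=15):.1f} MB")
```

Under these assumptions, 200 pages come to roughly 218 MB uncompressed and about 14.5 MB at the assumed ratio, so a file in the low tens of MB would not be unusual.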

Offline Tomi

Re: images into pdf text, helps?
« Reply #1 on: December 13, 2014, 03:24 PM »
Maybe you should convert the PDF to DOCX; in Word you can insert your images into the text and then save it as a PDF again.

Offline Casso

Re: images into pdf text, helps?
« Reply #2 on: December 13, 2014, 03:49 PM »
I usually use ABBYY FineReader 12: http://finereader.abbyy.com/
It works very well if the images are good quality.
At university I had a lot of scanned pages from some books, and this software converted hundreds of pages (just images) in a few minutes into a readable PDF file where it's possible to highlight, search, and underline text.
It uses OCR to recognize words.

Maybe you can try it and if you like Abbyy you can "buy" ( 8)) this

Offline Maciej

Re: images into pdf text, helps?
« Reply #3 on: December 13, 2014, 04:10 PM »
Do you mean something like this? http://convertonlinefree.com

Offline HHC

Re: images into pdf text, helps?
« Reply #4 on: December 13, 2014, 06:56 PM »
I have ABBYY 11. It's a good program, yeah, but for some reason it opens the images in pretty crappy resolution, and I can't seem to figure out how to improve the quality of the scan  :-[
If I could improve the quality, there would probably be very little text that needed fixing, and I could save it there as plain text (in HTML or DOC) instead of working with big-ass PDF files. But yeah... I need to improve the resolution somehow.


Left: ABBYY; upper right: PDF output; lower right: the original image.  :-[

Is it because the images are too wide maybe? ???

edit: it says 'resolution of source img is too small and the image has been resized to larger'. Then it says 'increase resolution to 300dpi or higher'. But howwwww :(

If you know how to fix this, Casso... it would solve everything.

Got that ABBYY for my book scanner btw. I don't recommend it to anyone else 'cause it's cock-expensive  :D
« Last Edit: December 13, 2014, 07:28 PM by HHC »
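
On the "increase resolution to 300dpi" message: a small sketch of the arithmetic behind it, in Python. It works out the effective resolution of a scan from its pixel width and the physical width of the page, and how much the image would have to be enlarged to reach a target. The function names are made up for illustration, and the physical page width is something you have to supply yourself. This is not how FineReader computes its warning, which may also depend on the DPI value stored in the image file.

```python
def effective_dpi(pixel_width: int, physical_width_in: float) -> float:
    """Pixels per inch of a scan, given the real width of the page in inches."""
    return pixel_width / physical_width_in


def upscale_factor(pixel_width: int, physical_width_in: float,
                   target_dpi: float = 300.0) -> float:
    """How much to enlarge the image to reach target_dpi (1.0 if already enough)."""
    current = effective_dpi(pixel_width, physical_width_in)
    return max(1.0, target_dpi / current)


if __name__ == "__main__":
    # Hypothetical scan: 1200 pixels wide, page is 8 inches wide
    print(effective_dpi(1200, 8.0))
    print(upscale_factor(1200, 8.0))
```

For example, a scan 1200 pixels wide of an 8-inch-wide page is 150 dpi, so it would have to be doubled in size to count as 300 dpi. Enlarging an image does not add detail, though, so rescanning at a higher resolution is usually the better fix when that is possible.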

Offline Casso

Re: images into pdf text, helps?
« Reply #5 on: December 13, 2014, 10:30 PM »
I guess the problem is not how ABBYY opens the images but how it saves them. By default this software converts/saves all images at a balanced quality to make the output files smaller.
I would try changing this option here: Tools - Options - Save - PDF (or PDF/A if you use that instead), and select "Best quality" under "Image settings".

Try it and see if it fixes your problem ;)

btw, if this doesn't fix it, I can try converting one of your files to check whether I get the same error. ;)

« Last Edit: December 13, 2014, 10:34 PM by Casso »

Offline HHC

Re: images into pdf text, helps?
« Reply #6 on: December 13, 2014, 10:50 PM »
Doesn't help :(



See what this does when you open it, and which errors it gives.

Offline HHC

Re: images into pdf text, helps?
« Reply #7 on: December 19, 2014, 03:37 PM »
Using Google Drive for it now..



It opens the image in a doc file and automatically OCRs it. The quality is pretty darn good, equal to FineReader's IMO (it reads italics as normal text, turns 0's into o's, and it doesn't underline words it has trouble reading.. only words that have a SPEELING ERROR), but other than that, fine.

Only downside is that you have to open each image and copy/paste the OCR'ed text (of single pages) into one big new document that comprises the entire book.

But well, it's do-able.

PDF seems to compress the images too much, which makes them hard for OCR software to read. And apparently opening images in this software has the same effect. Google Drive at least reads the high-quality image and therefore gives a better result.
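
The copy/paste step described above can be scripted once each page's OCR'ed text has been saved as its own file. A minimal sketch in Python, assuming one plain-text file per page in a single folder, with page numbers in the file names (for example page1.txt, page2.txt). The folder layout, the file names, and the output name book.txt are all assumptions for illustration.

```python
import re
from pathlib import Path


def natural_key(path: Path):
    """Sort key so that page2.txt comes before page10.txt."""
    parts = re.split(r"(\d+)", path.name)
    # re.split with a capture group alternates text, digits, text, ...
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def merge_pages(folder, output_name: str = "book.txt") -> int:
    """Join all per-page .txt files in folder into one file; return the page count."""
    folder = Path(folder)
    output = folder / output_name
    # Skip the output file itself so re-running does not merge it into the book
    pages = sorted((p for p in folder.glob("*.txt") if p != output),
                   key=natural_key)
    text = "\n\n".join(p.read_text(encoding="utf-8").strip() for p in pages)
    output.write_text(text + "\n", encoding="utf-8")
    return len(pages)


if __name__ == "__main__":
    # "ocr_pages" is a hypothetical folder name
    print(merge_pages("ocr_pages"), "pages merged")
```

The numeric sort matters because a plain alphabetical sort would put page10 before page2 and scramble the book.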