which optimal settings for a captured page?

Post by andreasm82 »

Which settings give the best compromise between file size and quality for pages captured with a digital camera or scanner?

Here's a sample file:


How small can I get it without any disturbing compression blocks?
Which file format is best (JPG, PNG, GIF or anything else)?
Post by Olivier_G »

What do you think of this one (28KB):

-> Increased Contrast & lowered Brightness (Image>Adjust>Brightness/Contrast...)
-> Convert to binary (no dither)
-> saved as PNG
(you need to test/try to find the best Contrast/Brightness/... settings)

But the best answer is probably to use an OCR (optical character recognition) application...
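
To illustrate the "convert to binary (no dither)" step above, here is a minimal sketch in plain Python, not XnView's actual implementation. The function names, the threshold of 128 and the sample grey values are all assumptions made up for the example. It shows why a bitonal image gets so small: each pixel needs only one bit, so eight pixels fit in one byte before PNG compression even starts.

```python
def to_bitonal(pixels, threshold=128):
    """Map each 8-bit grey value to 0 (black) or 1 (white), with no dithering."""
    return [1 if p >= threshold else 0 for p in pixels]


def pack_bits(bits):
    """Pack 1-bit pixels eight to a byte, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk += [0] * (8 - len(chunk))  # pad the final byte with black pixels
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


# Hypothetical row of grey values: light paper with a few dark ink pixels.
row = [250, 245, 30, 20, 240, 60, 235, 250]
bits = to_bitonal(row)
packed = pack_bits(bits)
print(bits, len(row), "grey bytes ->", len(packed), "packed byte(s)")
```

Raising the contrast first, as suggested above, pushes the paper and ink values further away from the threshold, so fewer stray pixels end up on the wrong side of it.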

Post by helmut »

Your sample file is a grey scale scan of black/white text with some formulas in it.

I would recommend the following steps with the original scan (not the JPG image):
1. Increase contrast so that the white background really is white. Instead of increasing the contrast you could set the whitepoint.
2. Convert your image to grey scale with 16 colours (grey tones)
3. Save the image as PNG, TIFF, or GIF.
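
Step 2 above can be sketched roughly as follows. This is only an illustration in plain Python, assuming 8-bit grey input values and evenly spaced output tones; the function name `quantize_grey` is made up, and XnView may well pick its 16 tones differently.

```python
def quantize_grey(pixels, levels=16):
    """Reduce 8-bit grey values (0-255) to `levels` evenly spaced grey tones."""
    out = []
    for p in pixels:
        index = p * levels // 256                # which of the `levels` bins the pixel falls in
        out.append(index * 255 // (levels - 1))  # spread the bins back over 0..255
    return out


# A full 0..255 ramp collapses to 16 distinct tones.
ramp = list(range(256))
reduced = quantize_grey(ramp)
print(sorted(set(reduced)))
```

With only 16 tones each pixel fits in 4 bits instead of 8, and small variations in the paper background collapse into the same value, which is what helps a lossless format such as PNG compress the page well.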

JPG is definitely the wrong format since your scan contains mainly text, and high JPG compression will cause bad artefacts on your text.

The FAQs "What format should I save this image in?" and "How to resize images with text (screenshots)" might be of some interest.
Post by andreasm82 »

thanks, merci, dankeschön :)

I'll try your ideas!

How can I set the whitepoint?
Post by Drahken »

Whitepoint and blackpoint are under Image->Adjust->Levels. Setting the whitepoint to a lower number will make the grey areas lighter. If the image becomes too washed out, increase the blackpoint.
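
As a rough sketch of what such a levels adjustment does to the pixel values (plain Python, assuming a simple linear stretch of 8-bit grey values; `apply_levels` is a made-up name, not an XnView function):

```python
def apply_levels(pixels, black=0, white=255):
    """Stretch grey values so `black` maps to 0 and `white` maps to 255.

    Values at or below `black` become 0, values at or above `white` become 255.
    """
    if white <= black:
        raise ValueError("white point must be above black point")
    span = white - black
    return [min(255, max(0, (p - black) * 255 // span)) for p in pixels]


# Lowering the whitepoint to 200 turns light grey paper (200 and above) pure white.
print(apply_levels([0, 100, 200, 255], black=0, white=200))
# Raising the blackpoint to 50 darkens everything at or below 50 to pure black.
print(apply_levels([30, 50, 200], black=50, white=200))
```

Everything brighter than the whitepoint ends up as one uniform white, which is why a grey, uneven paper background disappears once the whitepoint is lowered far enough.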

The best format for an image like this would be DJVU, but XnView isn't capable of saving in DJVU format (few programs are, unfortunately). It can read DJVU though.

You should always work with the original or a lossless copy of the original file, never with a JPG. Even with compression set as low as possible, JPG always introduces artifacts that can't be removed. Also, try to flatten out the paper more when you're scanning it. Keeping those lines nice & straight will not only look better, but will also make a smaller file.

http://drahken.t35.com/page-scan-1.png <-53k 4 grey scale. Very good quality.
http://drahken.t35.com/page-scan-2.png <-24k bitonal, no dither. Not so good quality, but still perfectly useable.
http://drahken.t35.com/page-scan-a.djvu <-31k "scanned" Excellent quality, and less than 1/10th the filesize of your original pic. (Quality & filesize would be even better if it were made from the original scan.)
http://drahken.t35.com/page-scan-b.djvu <-13k "clean" Good quality.
http://drahken.t35.com/page-scan-c.divu <-10k "bitonal" Not so good quality, but still useable.

Note: The first DJVU file listed will appear too large in XnView, because regular image viewers don't adjust the displayed picture based on DPI. A dedicated DJVU viewer will show it at the proper size. You can get it close to the correct size quite easily by hitting the minus key about 4 times. It should also be noted that when the image is reduced to the proper size, the letters and such are much smoother.
Here's a screenshot of what it looks like in a DJVU viewer: http://drahken.t35.com/page-scan-windjvu.png
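
The DPI point can be sketched like this. It is only an illustration of the idea, not how any particular DJVU viewer works; the function name, the 300 DPI scan resolution, the 96 DPI screen and the pixel dimensions are all assumed values for the example.

```python
def display_size(width_px, height_px, image_dpi, screen_dpi=96):
    """Return the on-screen pixel size that shows the image at its physical size."""
    width = round(width_px * screen_dpi / image_dpi)
    height = round(height_px * screen_dpi / image_dpi)
    return width, height


# Hypothetical A4 page scanned at 300 DPI, shown on an assumed 96 DPI screen.
print(display_size(2480, 3508, image_dpi=300))
```

A viewer that ignores DPI shows every image pixel as one screen pixel, so under these assumptions the page would appear roughly three times too large in each direction.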
