Wrong PDF properties in XnView?

IxenPDF
Posts: 61
Joined: Tue Aug 02, 2016 8:13 am

Wrong PDF properties in XnView?

Post by IxenPDF »

Hello,

I am not sure whether the following is a bug:

I opened the following PDF file with XnView MP:
Das_erfolgreiche_Bewerbungsgespräch-72 s-w extr 11.pdf
It is a 1-bit b/w document at 72 dpi.
But according to XnView MP (Properties) it is 300 dpi, 24-bit RGB.

Isn't this a bug?

Thank you.
cday
XnThusiast
Posts: 3976
Joined: Sun Apr 29, 2012 9:45 am
Location: Cheltenham, U.K.

Re: Bug? PDF, properties

Post by cday »

IxenPDF wrote:I open the following PDF-file with XnView MP:
Das_erfolgreiche_Bewerbungsgespräch-72 s-w extr 11.pdf
This is a 1 bit b/w document with 72dpi.
But according to XnView MP it is 300dpi, 24bit and RGB. (properties).
Opening your file in a text editor, it appears to consist of vector text (a scalable outline font, as used in a word processor) rather than an image file (as produced by a scanner)... [If you save a Word document as a PDF, it will be saved as vector text.]

The DPI value displayed in XnView MP (or other image viewer) is the DPI value at which the file contents have been rasterised when it was opened: in XnView MP that value is set in: Viewer > File > Format settings... > Read tab -- PS/PDF

But there is clearly a problem somewhere, as the text doesn't reproduce well at even 300DPI and does indeed look like text reproduced at 72 DPI...

Can you provide any background to the origin of the file?

Re: Bug? PDF, properties

Post by IxenPDF »

Thank you for your reply.

This is not vector text. It was produced with gImageReader (https://github.com/manisandro/gImageReader). I use gImageReader to make PDFs with hidden text from pictures.

And yes, the quality of this picture is rather bad. But please ignore the quality. It's just an example. I am only interested in the dpi and other property information. I want to find out the dpi of this picture (and of other pictures).

Re: Bug? PDF, properties

Post by cday »

IxenPDF wrote:This is not a vector text.
The file seems to contain font information, but I only had a quick look and a little knowledge can be dangerous... :wink:
It was produced with gImageReader (https://github.com/manisandro/gImageReader) I use this gImageReader to make PDF's with hidden text from pictures.
OCR the image to a searchable image file?
And yes, the quality of this picture is rather bad.
A settings problem, or does it really not matter?
But do not take care of the quality. It's just an example. I am only interested in the dpi and other property information. I want to find out the dpi of this picture (and of other pictures).
Images in PDF files have a fixed number of pixels, but DPI is not itself a measure of quality, for example. If the original image was scanned, then the DPI could be determined from the overall pixel dimensions and the print size shown in the file properties.
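
As a rough illustration of that calculation (a minimal sketch, not something XnView MP does itself): the DPI is simply the pixel count divided by the print size in inches. The function name and the sample numbers below are made up for the example.

```python
def dpi_from_print_size(pixels: int, size_inches: float) -> float:
    """Return the resolution implied by a pixel count and a printed length."""
    if size_inches <= 0:
        raise ValueError("print size must be positive")
    return pixels / size_inches


# Hypothetical scan: 2480 pixels wide, printed 8.27 inches (A4 width) wide.
print(round(dpi_from_print_size(2480, 8.27)))
```

The same division applies to the height; if the two results differ noticeably, the pixels are not square or the print size has been changed.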

Anyway, no reason to suspect a bug in XnView MP...

Edit:

The PDF file is indeed searchable [the word 'den', just discernible in the image, is found correctly], which presumably accounts for the font information contained in it...
helmut
Posts: 8705
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: Bug? PDF, properties

Post by helmut »

To me this looks as if someone generated this document using a special font to make the document unreadable.

Open the PDF, press Ctrl+A (Select all) and Ctrl+C (Copy), then Ctrl+V (Paste) in Notepad: this reveals the full text.

Re: Bug? PDF, properties

Post by IxenPDF »

cday wrote:OCR the image to a searchable image file?
Yes.
cday wrote: The DPI value displayed in XnView MP (or other image viewer) is the DPI value at which the file contents have been rasterised when it was opened: in XnView MP that value is set in: Viewer > File > Format settings... > Read tab -- PS/PDF
I did not find "Format settings" in this menu:
View-File.png

@helmut:
What I am interested in:
How can I find out the picture settings of a PDF? For example: width, height, number of bits, colour model, DPI, compression. I thought I might be able to get this information with the help of XnView, so I tried Menu->Edit->Properties in the hope of getting it. But if this really is not a bug in XnView, it seems that I cannot get this information there.
Is there no way to get this information with XnView?


Please take this file. I created it with XnView itself. It is 600 dpi and 1-bit b/w. How can I get this information from XnView, for this PDF and for any other PDF?
lösch.pdf

Re: Bug? PDF, properties

Post by cday »

IxenPDF wrote:
cday wrote: The DPI value displayed in XnView MP (or other image viewer) is the DPI value at which the file contents have been rasterised when it was opened: in XnView MP that value is set in: Viewer > File > Format settings... > Read tab -- PS/PDF
I did not find "Format settings" ...
It's in the Viewer: File > Format settings... > Read tab -- PS/PDF
IxenPDF wrote: @helmut:
What I am interested in:
How can I find out the picture settings of a PDF? For example: width, height, number of bits, colour model, DPI, compression. I thought I might be able to get this information with the help of XnView, so I tried Menu->Edit->Properties in the hope of getting it.
See in the Viewer Edit > Properties...

Image_1.png

Re: Bug? PDF, properties

Post by cday »

Note that the properties shown in Edit > Properties... are the properties of the image after it is opened in XnView, and so are the pixel dimensions (for example) after the image has been rasterised at the DPI value set in the program as above.
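
To make that concrete, here is a minimal sketch of how the reported pixel dimensions follow from the page size and the rasterisation DPI. It assumes the page size is given in PDF points (1/72 inch) and that the viewer rounds to the nearest pixel; the function name and the 550 x 851 point page are hypothetical values chosen for illustration, and XnView MP's exact rounding is not known to me.

```python
POINTS_PER_INCH = 72  # the default PDF user-space unit is 1/72 inch


def rasterised_size(width_pt: float, height_pt: float, render_dpi: float) -> tuple[int, int]:
    """Return the pixel dimensions of a page rendered at render_dpi."""
    scale = render_dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# Hypothetical page of 550 x 851 points, rendered at two different settings.
print(rasterised_size(550, 851, 72))   # one pixel per point
print(rasterised_size(550, 851, 300))  # the same page at a 300 DPI setting
```

The DPI and pixel counts shown in the properties therefore change with the program setting, whatever resolution the embedded image actually has.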

To obtain details of the PDF file itself it would be necessary to inspect the code in the original file opened in a text editor, and refer to the PDF standard, 756 pages long...

At a quick look (again...) the image in the file is black and white saved with CCITT 'Fax' compression, which accounts for its very reasonable file size, not 24-bit colour as shown for the file opened in XnView MP (and another freeware viewer).

Re: Bug? PDF, properties

Post by IxenPDF »

cday wrote:Note that the properties shown in Edit > Properties... are the properties of the image after it is opened in XnView, and so are the pixel dimensions (for example) after the image has been rasterised at the DPI value set in the program as above.
At a quick look (again...) the image in the file is black and white saved with CCITT 'Fax' compression, which accounts for its very reasonable file size, not 24-bit colour as shown for the file opened in XnView MP (and another freeware viewer).
Thank you. So it seems there is no possibility to get this information with XnView. It's a pity.
cday wrote:To obtain details of the PDF file itself it would be necessary to inspect the code in the original file opened in a text editor, and refer to the PDF standard, 756 pages long...
I renamed the file from name.PDF to name.TXT and opened it in Notepad (Microsoft Editor). But that way I only got a featureless string of characters.

Re: Bug? PDF, properties

Post by cday »

IxenPDF wrote:
cday wrote:Note that the properties shown in Edit > Properties... are the properties of the image after it is opened in XnView, and so are the pixel dimensions (for example) after the image has been rasterised at the DPI value set in the program as above.
At a quick look (again...) the image in the file is black and white saved with CCITT 'Fax' compression, which accounts for its very reasonable file size, not 24-bit colour as shown for the file opened in XnView MP (and another freeware viewer).
Thank you. So it seems there is no possibility to get this information with XnView. It's a pity.
There may well be utilities that show that information, but I don't currently know of any...
IxenPDF wrote:
cday wrote:To obtain details of the PDF file itself it would be necessary to inspect the code in the original file opened in a text editor, and refer to the PDF standard, 756 pages long...
I renamed the file from name.PDF to name.TXT and opened it in Notepad (Microsoft Editor). But that way I only got a featureless string of characters.
When the file is opened in a text editor much of the contents will be incomprehensible, but there will also be small amounts of text that can give important clues as to the contents, for example:

Code:

obj << /Type /XObject /Subtype /Image /Name /Im0 /Filter [ /CCITTFaxDecode ] /DecodeParms [ << /K -1 /Columns 2835 /Rows 2209 >> ] /Width  2835 /Height 2209 /ColorSpace 11 0 R /BitsPerComponent 1 /Length 10 0 R >>
That, I believe, is the overall definition of the page image, and it probably gives the information you are seeking if you can interpret it: '/Width' and '/Height' are the pixel dimensions and '/BitsPerComponent 1' means one bit per pixel. The inclusion of 'CCITTFaxDecode' also indicates to me that the image in the file is saved in black and white, as that compression method is only valid for bilevel (black and white) images.
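
For anyone who would rather not search the file by eye, the same lookup can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: it scans the raw bytes with regular expressions instead of parsing the PDF properly, so it only finds image dictionaries that are stored uncompressed (as in this file), and the function name `find_image_dicts` is my own invention.

```python
import re


def find_image_dicts(pdf_bytes: bytes) -> list[dict]:
    """Collect basic properties of image XObjects found in raw PDF bytes."""
    # latin-1 maps every byte to a character, so binary stream data cannot fail to decode.
    text = pdf_bytes.decode("latin-1")
    images = []
    # Take the text between each "obj" keyword and the following "stream" or "endobj".
    for match in re.finditer(r"\bobj\b(.*?)\b(?:stream|endobj)\b", text, re.S):
        header = match.group(1)
        if not re.search(r"/Subtype\s*/Image\b", header):
            continue  # not an image object
        info = {}
        for key in ("Width", "Height", "BitsPerComponent"):
            found = re.search(rf"/{key}\s+(\d+)", header)
            if found:
                info[key] = int(found.group(1))
        # The filter may be written as a single name or as the first entry of an array.
        flt = re.search(r"/Filter\s*\[?\s*/(\w+)", header)
        if flt:
            info["Filter"] = flt.group(1)
        images.append(info)
    return images


# The dictionary quoted above, wrapped in a dummy object so the sketch can be tried directly.
sample = (
    b"9 0 obj << /Type /XObject /Subtype /Image /Name /Im0 "
    b"/Filter [ /CCITTFaxDecode ] /DecodeParms [ << /K -1 /Columns 2835 /Rows 2209 >> ] "
    b"/Width  2835 /Height 2209 /ColorSpace 11 0 R /BitsPerComponent 1 /Length 10 0 R >>\n"
    b"stream\n\x00\x01\nendstream\nendobj\n"
)
print(find_image_dicts(sample))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes in. PDFs that pack their objects into compressed object streams will return nothing with this approach.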

Here is a link to the basic PDF spec: PDF Reference 1.7. Let me make absolutely clear that I only understand a very small subset!! :wink:

You could also no doubt find more information online.

Re: Bug? PDF, properties

Post by IxenPDF »

Thank you @cday for this. I have now found that string in my own name.txt as well.

If anybody knows of a small tool that can simply show this information, please let me know.
XnTriq
Moderator & Librarian
Posts: 6336
Joined: Sun Sep 25, 2005 3:00 am
Location: Ref Desk

Re: Wrong PDF properties in XnView?

Post by XnTriq »

There are obviously several problems with Das_erfolgreiche_Bewerbungsgespräch-72 s-w extr 11.pdf: One is the font encoding, …
  • Arial Narrow (Embedded Subset)
    • Type: TrueType (CID)
    • Encoding: Identity-H
    • Object Number: 4
… and the other is the fact that the embedded low-res image…
Extracted with an older version of http://www.tracker-software.com/product/pdf-tools
p138823.png
… was placed on top of the selectable text…

Code:

10 Das erfolgreiche Bewerbungsgespräch
wörtlich?Was steckte wirklich hinter der Frage?Wie hätte sie antworten sol-
len, um in die engereWahl zu kommen? Klare Antworten darauf finden Sie
in diesem Buch.
Das Bewerbungsgespräch hat viele Ähnlichkeiten mit einer gepflegten
Konversation. Letztlich erhält der Bewerber ein Vertragsangebot, dem es
gelingt, eine einseitige Prüfung seiner KompetenZen in einen dynamischen
Gedankenaustau sch zwischen zwei Profis zu verwandeln. Hier lernen Sie die
Techniken, mit denen Sie das Interesse Ihres Gesprächspartners wecken,
dauerhaft fesseln und sich als bester Bewerber für die Zu besetZende Position
empfehlen können.
Das efiblgreiche Bewerbunggesprcich ist Ihr Schlüssel zur erfolgreichen
Bewältigung selbst der schwierigsten Einstellungsgespräche. Das Buch
besteht aus vierTeilen.]ederTeil bereitet Sie in unterschiedlicherWeise auf
das Bewerbungsgespräch und den AuswahlproZess vor. »Umfassende Vorbe-
reitung, komplette Unterlageme macht Sie kampfbereit.Anschließend wer-
den Sie lernen, einen überZeugenden Lebenslaufzu schreiben, der Sie aus der
Masse der Bewerber herausragen lässt. Weiterhin werden Sie lernen, fi:eie
Stellen, die in keiner Zeitung erscheinen, ausfindig zu machen.
Wir leben in einer Gesellschaft, die von Berufsarbeit geprägt ist. Oft wird
beklagt, es gebe zu wenig Stellenangebote.Tatsächlich gibt esMenschen, die
es aus Furcht vor der Mühe und den Anstrengungen, die mit einer aktiven
Stellensuche verbunden sind, vorziehen, in Passivität zu verharren und zu
klagen. Wenn Sie wissen, wo und wie Sie vorgehen müssen, werden Sie
offene Stellen aufspüren, die nie in den Zeitungen auftauchen. Sie werden
lernen, den verborgenen Stellenmarkt zu erschließen.
Nachdem Sie sich umfassend vorbereitet haben, folgt »Das Startsignalcr.
Dieser Buchabschnitt beschreitiorgehensweisen, mit deren Hilfe Sie Ein-
ladungen zu Bewerbungsgesprächen erhalten, und bringt Ihnen einfache
und wirkungsvolle Methoden bei, um möglichst viele Bewerbungsgespräche
zu vereinbaren. Der Abschnitt endet mitTechniken, die Sie dabei unterstüt-
Zen, auch eine telefonischeVorselektion —die eine zunehmendeVerbreitung
findet —glänZend zu bestehen.
Das egfialgreicheBewerbungsgesprächvermittelt Ihnen das Wissen sowie die
Techniken und Methoden, die Sie für eine Stellensuche brauchen. Darüber
hinaus lehrt es Sie einige wertvolle Lektionen aus dem Berufsleben, die zu
Ihrem zukünftigen Erfolg beitragen werden. Alle erfolgreichen Unterneh-
men legen bei ihren Mitarbeitern auf die gleichen Dinge Wert.Aber auch
… by the OCR software.

Re: Bug? PDF, properties

Post by XnTriq »

IxenPDF wrote:If anybody knows a little tool which can simply show this information: Please let me know.
The version of pdfimages that comes with the precompiled Windows binaries of Xpdf unfortunately doesn't support the -list option for reporting the resolution of embedded images :-|

Update:
pdfimages.exe -list "Das_erfolgreiche_Bewerbungsgespräch-72 s-w extr 11.pdf" wrote:

Code:

page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image     550   851  gray    1   1  ccitt  no         3  0    72    72 12.9K  23%
pdfimages.exe -list lösch.pdf (http://newsgroup.xnview.com/download/file.php?id=3687) wrote:

Code:

page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2835  2209  gray    1   1  ccitt  no         9  0   300   300 71.9K 9.4%
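
If the listing needs to be processed for many files, the table can be turned into records with a short script. The following is a minimal sketch that assumes the column layout shown above (a header row, a row of dashes, then one whitespace-separated row per image); `parse_pdfimages_list` is a hypothetical helper name, not part of pdfimages.

```python
def parse_pdfimages_list(output: str) -> list[dict]:
    """Turn the text printed by `pdfimages -list` into one dict per image."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    # The header splits into 16 names; "object" and "ID" together hold the
    # PDF object number and generation number.
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        if set(line.strip()) == {"-"}:
            continue  # separator row of dashes
        values = line.split()
        if len(values) != len(header):
            continue  # unexpected layout, skip instead of guessing
        rows.append(dict(zip(header, values)))
    return rows


sample = """\
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2835  2209  gray    1   1  ccitt  no         9  0   300   300 71.9K 9.4%
"""
for image in parse_pdfimages_list(sample):
    print(image["width"], image["height"], image["bpc"], image["enc"], image["x-ppi"])
```

To run it against a real file, the text could be captured with `subprocess.run(["pdfimages", "-list", path], capture_output=True, text=True).stdout`, assuming a pdfimages build that supports `-list` is on the PATH.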

Re: Wrong PDF properties in XnView?

Post by cday »

Based on what IxenPDF has written earlier in the thread he is trying to produce a searchable image PDF file using freeware software, so I think the image is supposed to be in the foreground: the problem is that something seems to have gone wrong when the file was created... :wink:

Looking at the file opened in a text editor, I think all the parameters requested are there if the code can be understood: the page size for this single-image file is given by the 'MediaBox' and 'CropBox' values, in units of 1/72 inch [I'm fairly sure], and the image pixel dimensions are given by the 'Columns' and 'Rows' values following the 'CCITTFaxDecode' parameter. That information could be checked or refined by [very] selective reference to the PDF specification. [For reference, as a cross-check, the page dimensions can be read directly in Adobe Acrobat under File > Properties... > Description tab (Ctrl + D).]

For the lösch.pdf, the second file uploaded, I calculated a DPI value of 300 rather than the 600 stated by IxenPDF, but that was only a quick calculation so I might be wrong.

Edit:

I see that XnTriq has updated his earlier post and that my 300 DPI is correct, and more usefully he seems to have found just the tool required to obtain the required information the easy way... :D

Re: Wrong PDF properties in XnView?

Post by XnTriq »

lösch.pdf converted to searchable PDF with the OCR feature of PDF-XChange Editor (free version):
Attachments
p138829.pdf