1. It is very easy to implement.
Which formats use that? OpenOffice, OK...
In our discussion: a lot of formats.
To be technically precise, we are talking about several formats: OLECF, Open Office XML, Office Open XML (yes, there are TWO different and competing (!) formats), OpenDocument, etc. Yep, it is a mess if we go into the details.
...but for us things are pretty simple, because we only want the thumbnail.
>>>> The short story of all this: (you can skip it if you want, but IMHO it is better to read it)
As programs and their files became more and more complex, the cost of the persistence engines ("save & load") grew astronomically, not only because of the research and development involved in building such engines but also because of the work of maintaining compatibility.
Due to the inherently object-oriented nature of the programs, there was a desperate need for an abstract persistence subsystem in which the developer could save the RTTI of the objects. First there were the INI files with their "[section] & name=value" architecture, which is what we have in our config.ini. However, things grew, and it became necessary to save multiple streams of data from complex objects, while these streams still had to appear to the user as a single file.
So the idea of a "file system inside a compressed file" appeared. This idea is at the base of all these formats.
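In short, almost ALL of these formats are zip files with different extensions (the exception is OLECF, which is a binary compound file with its own internal "file system", not a zip).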
Depending on the format, the files inside the archive have (very) different meanings but - thank God! - (almost?) all of them have a very-easy-and-quick-to-extract thumbnail in a standard graphic format, readily available to Windows Explorer or any other file manager on another operating system.
<<< The short story ends
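If you want to see the "zip with a different extension" point for yourself, here is a minimal sketch in Python (used only for brevity; any zip library does the same). It checks whether a document is a zip container and lists what is inside. The helper name list_container_entries is made up for this example.

```python
import zipfile


def list_container_entries(path):
    """Return the names of the files stored inside a zip-based document.

    Returns None if the file is not a zip container at all
    (for example an OLECF file or a plain text file).
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Running it on a .docx or .odt should show the internal "file system" directly, including the thumbnail entry if the document has one.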
SO, Steps to implement:
0. you need a zip library (zlib is perfectly fine) - we have it already.
1. Open the archive (let's say "myWordDoc.docx") and scan for the existence of a file named "preview.emf" (of course, the name varies with the format)
2. If the file from step 1 exists, extract it into a memory stream.
3. In very rare cases (OLECF formats) you need to skip "some data" (Name=Value properties) and do a simple sequential search for the signature - don't worry, the signature is the standard PNG signature, which cannot "happen" earlier in the data.
...and now you have in your memory stream the thumbnail in a known, standard format (PNG, BMP, JPG or EMF). It is just the extraction of a file from a zip.
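
The steps above can be sketched roughly like this, in Python with its standard zipfile module. This is only an illustration under assumptions, not our actual implementation: the function names are invented, and the candidate entry names are ones commonly seen in OpenDocument and Office Open XML files, so check them against the formats you really need to support.

```python
import io
import zipfile

# The standard 8-byte PNG signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# ASSUMPTION: commonly seen thumbnail entry names; the real name varies
# with the format, so treat this list as a starting point only.
CANDIDATE_NAMES = (
    "Thumbnails/thumbnail.png",   # OpenDocument
    "docProps/thumbnail.jpeg",    # Office Open XML
    "docProps/thumbnail.emf",
    "docProps/thumbnail.wmf",
)


def extract_thumbnail(path, candidates=CANDIDATE_NAMES):
    """Steps 1 and 2: look for a known thumbnail entry and return it
    as an in-memory stream, or None if no candidate is present."""
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
        for name in candidates:
            if name in names:
                return io.BytesIO(archive.read(name))
    return None


def skip_to_png(data):
    """Step 3: drop whatever precedes the PNG signature in a byte string.

    Returns the bytes starting at the signature, or None if not found.
    """
    offset = data.find(PNG_SIGNATURE)
    if offset < 0:
        return None
    return data[offset:]
```

For example, extract_thumbnail("myWordDoc.docx") would give back a stream that can be handed to the usual image loader, and skip_to_png is only needed for the rare case described in step 3.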