Automating website screenshot thumbnail

April

Post by April »

I need to screenshot hundreds of websites, then reduce them to a thumbnail. XnView was recommended to me for doing this, and XnView is a wonderful start, but it still seems to take more human intervention than I was hoping for. I'm hoping I'm just missing the proper way to go about doing this, since I know so little about the software. Can someone who knows the software better point me in the right direction?

So far, it seems I need to open a web browser window with the website, take a screen capture, and then crop the image to remove the parts of the screenshot that are actually the browser (the back button, etc.). Then I need to resize, sharpen, resize, and save the image. I understand that I can automate the cropping, resizing, etc., but I don't see a way to automate the interaction with the web browser. It would also be really nice if my automation didn't break whenever I changed the dimensions of my browser window (e.g., by adding a new toolbar).
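To make the crop and resize steps concrete, here is a minimal sketch of the arithmetic involved. It is not something XnView provides. The function names and the margin values are made up for illustration, and it only computes the crop rectangle and the thumbnail size, leaving the actual image work to XnView's batch tools.

```python
def content_box(shot_w, shot_h, left=0, top=0, right=0, bottom=0):
    """Return (x0, y0, x1, y1) of the page area after trimming browser chrome.

    The margins are the pixel widths of toolbars, scroll bars, status bar, etc.
    They are assumptions you would measure once for your own browser window.
    """
    x0, y0 = left, top
    x1, y1 = shot_w - right, shot_h - bottom
    if x1 <= x0 or y1 <= y0:
        raise ValueError("margins leave no page area")
    return x0, y0, x1, y1


def thumbnail_size(width, height, max_side):
    """Scale (width, height) so the longer side becomes max_side, keeping the aspect ratio."""
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical numbers: a 1280x1024 capture with 64 px of toolbars on top.
box = content_box(1280, 1024, top=64)
page_w, page_h = box[2] - box[0], box[3] - box[1]
print(box, thumbnail_size(page_w, page_h, 200))
```

Because the crop is expressed as margins rather than a fixed rectangle, adding a toolbar only means changing the `top` value.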
helmut
Posts: 8705
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: Automating website screenshot thumbnail

Post by helmut »

April wrote:I need to screenshot hundreds of websites, then reduce them to a thumbnail. XnView was recommended to me for doing this, and XnView is a wonderful start, but it still seems to take more human intervention than I was hoping for. I'm hoping I'm just missing the proper way to go about doing this, since I know so little about the software. Can someone who knows the software better point me in the right direction?

So far, it seems I need to open a web browser window with the website, then take a screen capture, then crop the image to remove the parts of the screenshot which are actually the browser (the back button, etc.). Then I need to resize, sharpen, resize, and save the image. I understand that I can automate the cropping, resizing, etc. but I don't see a way to automate the interaction with the web browser. It would also be really nice if my automation wouldn't break if I ever changed the dimensions of my browser window (ie by adding a new toolbar).
As you wrote yourself, XnView can automate the conversion part once you have an image. But XnView cannot automatically download web pages and save them as images (perhaps a good idea for a feature).

To automate the first part, macro recorders come to mind. These tools record what you do with your mouse and keyboard and can play it back as many times as you like.

Simply search the web for terms like "Macro record mouse keyboard" and you should find various programs. When evaluating those tools, I recommend checking a few things:
- Can it really play back what you record? (I once came across a recorder that could not control MS Excel.)
- Does it have a (simple) language for editing the macros?
- Does it record relative coordinates?
- Does it have a way to make sure that playback works properly (check focus on window, check focus on control, check sizes, ...)?
- All in all: Does it fit all your needs?
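To illustrate why relative coordinates matter, here is a small sketch of one possible interpretation: storing a click as a fraction of the window rectangle instead of as absolute screen pixels. This is an assumption about how a recorder might do it, not a description of any particular product, and the function names are made up.

```python
def to_relative(x, y, win):
    """Convert absolute screen coordinates to fractions of the window rectangle.

    win is (left, top, width, height) of the window at recording time.
    """
    left, top, width, height = win
    return (x - left) / width, (y - top) / height


def to_absolute(rx, ry, win):
    """Map stored fractions back to screen pixels for the window at playback time."""
    left, top, width, height = win
    return round(left + rx * width), round(top + ry * height)


# A click recorded in an 800x400 window located at (100, 100) ...
recorded = to_relative(500, 300, (100, 100, 800, 400))
# ... replayed after the window was moved to (0, 0) and resized to 1600x800.
print(to_absolute(*recorded, (0, 0, 1600, 800)))
```

Fractions only help when the layout scales with the window. For a fixed-height toolbar, an offset from the window edge would be the better thing to store.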

I have some experience with Mouse and Key Recorder, which is shareware. It records and plays back nicely and flexibly (relative coordinates). Installation is not so nice (you have to be admin). Also, the macro language has limitations: creating subroutines is possible but a bit clumsy.

Two others I tried out two years ago are Macro Magic and Aldo's Macro Recorder, but I do not remember too well what their limitations were. (I think for Macro Magic it was that all the programming was done with the mouse rather than by typing, and Aldo's could not control all the programs I wanted; see above.) This might have changed since.
helmut

Post by helmut »

April wrote:... So far, it seems I need to open a web browser window with the website, then take a screen capture, then crop the image to remove the parts of the screenshot which are actually the browser (the back button, etc.). Then I need to resize, sharpen, resize, and save the image. I understand that I can automate the cropping, resizing, etc. but I don't see a way to automate the interaction with the web browser. It would also be really nice if my automation wouldn't break if I ever changed the dimensions of my browser window (ie by adding a new toolbar).
Perhaps another approach to automating the first part: there are tools like HTTrack (freeware) that can download pages so that you can read them offline. Perhaps there is a tool that can take a long list of links and download and store all the pages one after another.
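As a rough sketch of the "take a list of links and store the pages one after another" idea, something like the following could work with nothing but the Python standard library. Note the limits: unlike HTTrack, it saves only the raw HTML of each page (no images or stylesheets), and the function names are made up for illustration.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_for(url):
    """Derive a filesystem-safe .html file name from a URL."""
    parts = urlparse(url)
    raw = (parts.netloc + parts.path).strip("/") or "page"
    return re.sub(r"[^A-Za-z0-9._-]+", "_", raw) + ".html"


def download_all(urls, out_dir, fetch=None):
    """Save the raw HTML of each URL into out_dir, one page after another.

    fetch is a function url -> bytes; by default it downloads over the network.
    Blank lines in the input are skipped.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=30) as response:
                return response.read()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        url = url.strip()
        if not url:
            continue
        target = out / filename_for(url)
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved


# Typical use, assuming a text file with one URL per line:
#     download_all(open("urls.txt", encoding="utf-8"), "pages")
```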
tinguaro

Post by tinguaro »

Hello April,
there is an HTML converter out there: Coolutils.com - Total HTML Converter. It's not free and I haven't tested it yet, but it sounds good.
Once you have converted the pages to JPG, you can batch-resize them with XnView.
tinguaro
XnTriq
Moderator & Librarian
Posts: 6402
Joined: Sun Sep 25, 2005 3:00 am
Location: Ref Desk

Post by XnTriq »

Look for a utility that lets you specify which “screen object” should be captured.
The portion of the screen your web browser displays the pages in would be such an object.
Please come back and tell us what you ended up with!
fuzzydice
Posts: 1
Joined: Sun Nov 06, 2005 10:39 am

Post by fuzzydice »

Tacking onto XnTriq's list, I would add Screenshot Captor, though it doesn't have the in-browser scroll function that Snagit has, which is the only reason I haven't switched. Snagit has a cool browser plugin for Firefox that scrolls by itself within the browser, capturing the whole page as a graphic or as text depending on your selection. Very nice. Expensive though, about $40 US, I believe.
XnTriq

Post by XnTriq »

The other day I was thinking of a solution that doesn't involve any third-party shareware.
Here's the deal:

Although I usually avoid MSIE I sometimes use it to save a web page as a “Web Archive” (a/k/a “Single File Web Page”).
Prior to version 1.80 XnView relied on Nikolay Raspopov's Xhtml plugin to generate thumbnails of such MHT/MHTML & EML files.
WhatsNew.txt wrote:XnView v1.80 (LIBFORMAT v4.47) 17/06/2005:
[...]
Added : Thumbnail for .htm/.html
[...]
ckit wrote:I can confirm this with XnView 1.80.1.
The XHTML.DLL is not present in the Plugins list in XnView, the DLL file is in "\XnView\Plugins" folder.

This plugin has no purpose anymore, XnView 1.80.1 can already display thumbnails for HTM and HTML files.
Now, what's weird about this is that I get thumbs of MHT/MHTML & EML but not of HTM/HTML files:

Image

... compared to the preview images created by Windows Explorer in thumbnail view mode:

Image

When I try to export the thumbnails by making a gallery (Tools -> Web Page...) I get an error message for all file types mentioned above: “Format of the file <?> could not be determined”.

Any ideas?
ckv
Posts: 786
Joined: Wed Feb 02, 2005 2:30 pm
Location: Glow

Post by ckv »

I believe that html & url thumbnails got axed after people started asking why XnView was trying to connect to the internet while browsing. Not very smooth really... :?

I don't know for sure, but there should be some option that enables html thumbnails. It's just not named anything like "Show thumbnail for html files", which would be great, by the way. wink wink

Edit: MHT/MHTML?? Are these standard html file types?
XnView Tweak UI - Tool to customize your XnView beyond the regular XnView options.
UI-less Settings - Documentation of all the hidden settings in XnView.
XFAM - Tool to create and customize XnView file associations.
XnTriq

Post by XnTriq »

ckv — always pointing me in the right direction 8) : How to stop XnView connecting to Internet with .url files ?
I made my Firewall take care of that right away after installing v1.80.

Okay, that explains why I don't get previews of .htm/.html/.url files.
What about the fact that XnView displays thumbs for .mht/.mhtml/.eml but refuses to export them as a gallery?
IMHO these are two separate issues, don't you think?
ckv wrote:Edit: MHT/MHTML?? Are these standard html file types?
Well, kinda sorta:
Jeff Atwood (Coding Horror: Building Mht Files from URLs revisited, http://www.codinghorror.com/blog/archives/000249.html) wrote:“I know people assume the worst of Microsoft, but the MHTML format is actually based on RFC standard 2557, compliant Multipart MIME Message (MHTML web archive). So it's an actual Internet standard! Web Archive, a.k.a. MHTML, is a remarkably simple plain text format which looks a lot like (and is in fact almost exactly identical to) an email.”
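Since the format is essentially an email, a generic MIME parser can read it. Here is a small sketch using Python's standard `email` module on a hand-written sample. The sample archive is made up for illustration (the base64 payload is just placeholder bytes), and real files saved by a browser will carry more headers.

```python
from email import message_from_string

# A tiny hand-written MHTML archive: one HTML part and one image part.
SAMPLE_MHT = """\
From: <Saved by a browser>
Subject: Example page
MIME-Version: 1.0
Content-Type: multipart/related; type="text/html"; boundary="----=_Part_0"

------=_Part_0
Content-Type: text/html; charset="utf-8"
Content-Location: http://example.com/

<html><body><img src="http://example.com/logo.gif"></body></html>
------=_Part_0
Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Location: http://example.com/logo.gif

R0lGODlhAQABAAAAACw=
------=_Part_0--
"""


def list_parts(mht_text):
    """Return (content type, original URL) for every resource stored in the archive."""
    msg = message_from_string(mht_text)
    return [
        (part.get_content_type(), part["Content-Location"])
        for part in msg.walk()
        if not part.is_multipart()
    ]


for content_type, location in list_parts(SAMPLE_MHT):
    print(content_type, location)
```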
Related shareware:
  • GoUpSoft
    • EZ Save MHT ($19.99)
      “EZ Save MHT is an Internet Explorer add-on that allows you to save any web page as a standard .MHT web archive file with a single click. WebArchives are complete web pages with images and all, but saved as a single, standalone file (.mht).
      Similar functionality comes already built into IE (from the Save as... dialog) but EZ Save MHT offers support for complex pages with multiple frames, scripts, Flash, and dynamic content. [...]”
    • IEToolKit ($19.99)
      “IEToolkit is an add-on for Internet Explorer, that adds a new toolbar with several useful features. It allows you to capture a full size screenshot of an entire web page (no scrolling needed) in .BMP or JPG format; save the complete page to .MHT format (web archive) or to save any element, including Flash, by simply dragging a target over it. [...]”
  • Spidersoft
    • WinMHT ($29.95)
      “WinMHT lets you save and conveniently organize Web pages onto your local hard disk. It lets you take a ‘snapshot’ of one or more Web pages and store them and all their associated resource files in a single, easy to use Web archive. (.mht file).
      WinMHT is the ideal tool for doing Web research. You can browse the Web and whenever you find an interesting page(s) that you would like to keep for future reference, simply save it to your computer in the topic folder of your choice.”
ckv

Post by ckv »

XnTriq wrote:What about the fact that XnView displays thumbs for .mht/.mhtml/.eml but refuses to export them as a gallery?
IMHO these are two separate issues, don't you think?
Yeah. "Create web page" is limited to image files (not sure about video files). But it would be great if it could also create thumbs for html/txt/pdf or any other file type, using the file type icon as the thumbnail like in the browser.

I made a couple of test files with EZ Save MHT and tested them in Win2k/WinXP, but no matter what I do, I don't get thumbnails for these files in Explorer or XnView. I'm guessing blindly here, but I think Windows doesn't show thumbnails for MHTML files by default. And since XnView probably uses some component from windows to create thumbnails for HTML files, it won't show them either.

And thanks for the MHTML answer. I really appreciate it.
XnTriq

Post by XnTriq »

ckv wrote:And since XnView probably uses some component from windows to create thumbnails for HTML files, it won't show them either.
Bingo: “You cannot view Web content files in Thumbnails view after you install Windows XP Service Pack 1 or Windows XP Service Pack 2” (MSKB Article #327833).
As I was poking around I found a .REG file on this fellow's homepage. It contains entries which are apparently not restored by re-registering shimgvw.dll (Shell Image View Control).
Obviously this DLL is also acting as the “Windows Picture and FAX Viewer” in XP.

Finally, here's some more potentially useful information regarding this matter:
Microsoft Windows XP Registry Guide, Chapter 5: Mapping Tweak UI, Thumbnails (http://www.microsoft.com/mspress/books/sampchap/6232.asp#118) wrote:The Thumbnails category controls the quality of thumbnails in Windows Explorer. Table 5-10 describes the values for Image Quality and Size. Create values that you don't see in the registry. The default value for ThumbnailQuality is 0x5A. The default value for ThumbnailSize is 0x60. Keep in mind that higher quality and larger thumbnails require more disk space, which is not usually a problem, but they also take longer to display. Changing the quality does not affect thumbnails that already exist on the file system.

+---------------+------------------+-----------+-------------+
| Setting       | Name             | Type      | Data        |
+---------------+------------------+-----------+-------------+
| HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer    |
+---------------+------------------+-----------+-------------+
| Image Quality | ThumbnailQuality | REG_DWORD | 0x32 - 0x64 |
| Size (pixels) | ThumbnailSize    | REG_DWORD | 0x20 - 0xFF |
+---------------+------------------+-----------+-------------+
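As a sketch of how the two values from the table could be applied, the following builds the text of a .reg file and checks each value against the ranges given above. The function name is made up, and nothing here touches the registry: the output would still have to be saved and imported by hand, at your own risk.

```python
# Full spelling of the HKCU key from the table, as .reg files require it.
KEY = r"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer"

# Allowed ranges as listed in the table above.
RANGES = {
    "ThumbnailQuality": (0x32, 0x64),
    "ThumbnailSize": (0x20, 0xFF),
}


def reg_file(**values):
    """Return .reg file text that sets the given REG_DWORD values under KEY."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY}]"]
    for name, value in values.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low:#x} and {high:#x}")
        # DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


# The documented defaults: quality 0x5A, size 0x60 pixels.
print(reg_file(ThumbnailQuality=0x5A, ThumbnailSize=0x60))
```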
vbAccelerator, Thumbnail Extraction Using the Shell (http://www.vbaccelerator.com/home/NET/Code/Libraries/Shell_Projects/Thumbnail_Extraction/article.asp) wrote:The Shell provides an IExtractImage interface which allows you to obtain thumbnail images for any file which has a Shell extension that supports the interface. By default there are Shell extensions for images, most Office documents and folders; and other programs can install their own Shell extensions for other file types. This sample provides .NET code you can use to easily get hold of thumbnails using this technique.
Last edited by XnTriq on Tue Feb 26, 2008 9:00 pm, edited 1 time in total.
XnTriq

Post by XnTriq »

Argument | Description                             | Default Value
---------+-----------------------------------------+-------------------------
/url     | Website url                             | Required
/in      | Text file with urls on each line        |
/out     | Output image file                       | webshot.jpg
/width   | Image width                             | Browser width
/height  | Image height                            | Browser height
/bwidth  | Browser width                           | Automatically determined
/bheight | Browser height                          | Automatically determined
/quality | Quality of output image (0-100)         | 100
/encoder | Image encoder (ie. png, gif, jpg, bmp)  | jpg
/wait    | Time to wait in ms after page is loaded | 0
-verbose | Displays browser status information     |
Important: You may need to install GdiPlus.dll (GDI+) for versions of Windows prior to XP.
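For a few hundred sites, the arguments in the table could be assembled by a short script. The sketch below only builds the command lines. It rests on two assumptions the table does not confirm: that each argument is followed by its value as a separate word, and that the program is started by a name like `capture-tool.exe`, which is a placeholder because the post does not name the executable.

```python
def build_command(exe, url, out, width=None, height=None, wait_ms=0):
    """Build an argument list for one capture, using the switches from the table.

    Assumes "/name value" syntax; check the tool's own help before relying on it.
    """
    cmd = [exe, "/url", url, "/out", out]
    if width is not None:
        cmd += ["/width", str(width)]
    if height is not None:
        cmd += ["/height", str(height)]
    if wait_ms:
        cmd += ["/wait", str(wait_ms)]
    return cmd


def build_batch(exe, urls, **options):
    """One command per URL, writing numbered JPG files (thumb_0001.jpg, ...)."""
    return [
        build_command(exe, url, f"thumb_{number:04d}.jpg", **options)
        for number, url in enumerate(urls, start=1)
    ]


# "capture-tool.exe" is a hypothetical name; substitute the real executable.
for cmd in build_batch("capture-tool.exe", ["http://example.com/"], width=160, height=120):
    print(cmd)
```

Each list could then be passed to `subprocess.run`, and the resulting files batch-processed in XnView as discussed earlier in the thread.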