Saving HTML pages with images

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Mon Jan 09, 2006 6:02 pm

Hi guys, I don't know whether this will be allowed, but after the response re saving your desktop I am pushing the boat out, as I have a general query re saving copies of web-based articles which include images, either as JPEG or other image files.

As an avid reader of photography articles I often wish to 'keep' a copy for future reading, so I save a copy. But approx. 90% of the saved files do not show the images that are part of the article. I just get a little red cross instead of each image.

I am running WinXP SP2 and usually save as an 'HTML page' with 'Western European (Windows)' as the language, and never had this problem when running good old Win98.

I can also use 'Save Target As' but get the same result.

Could I have a setup problem, or is there an easy answer?

If this is outside the remit of this forum I apologise, but I am getting pretty frustrated.

regards

JohnL

Drahken
Posts: 884
Joined: Sun Apr 10, 2005 4:29 pm

Post by Drahken » Mon Jan 09, 2006 7:37 pm

That's the kind of thing this section is for.

When you save the page, there should be options for "html only", "web page complete" and "text only"; you want the "complete" one. The complete one saves all the images and such as well, while the html only one just saves the html.

Another thing to check is this: It might be saving the images, but just screwing up and not telling the saved page where the saved images are located. To see if this is the case, look in the folder you saved the page at. If it saved the images, there should be a file called "some-article.html" (the page you saved) and a folder called "some-article files" (the name of the folder should match the name of the file). If it's there, then check and see what (if any) images are in it. If all the images are there, then you need to edit the html page you saved in order to make them appear.
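For anyone who would rather check this with a script than by eye, here is a rough Python sketch of the same check. The function name `companion_images` and the list of image extensions are made up for illustration, and it assumes the folder sits next to the saved page and is named either "name_files" or "name files".

```python
from pathlib import Path

# Hypothetical list of extensions treated as images for this check.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".gif", ".png", ".bmp"}


def companion_images(page_path):
    """Return (folder, image_names) for the saved page's companion folder.

    Returns (None, []) when no companion folder exists next to the page.
    """
    page = Path(page_path)
    # Browsers differ on the separator, so try "name_files" and "name files".
    for suffix in ("_files", " files"):
        folder = page.with_name(page.stem + suffix)
        if folder.is_dir():
            images = sorted(
                p.name for p in folder.iterdir()
                if p.suffix.lower() in IMAGE_SUFFIXES
            )
            return folder, images
    return None, []
```

Pointing it at the saved page, e.g. `companion_images("some-article.html")`, gives back the folder it found and the image file names inside it, or `(None, [])` if there is no such folder.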
Oh the feuhrer, oh the feuhrer, oh the feuhrer's nipples bonk!

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Mon Jan 09, 2006 9:53 pm

hi Drahken

That's interesting, very interesting.

I only have 2 'save as type' options: "html" and "text file". There is no "web page complete" as I had previously in Win 98!! I emailed Microsoft some 2 years ago about that and had no reply (no surprise there then).

I will look into the other suggestion shortly :)

thanks Drahken

JohnL

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Tue Jan 10, 2006 7:52 pm

Hi again - if you are saying there should be 2 folders per saved item, that is not the case on my system. Just one folder per saved item.

I am still puzzled why I have only two "save page as" options, which is where my problems lie, I think :( :?

If anyone has a solution I would be grateful :)

JohnL

Drahken
Posts: 884
Joined: Sun Apr 10, 2005 4:29 pm

Post by Drahken » Tue Jan 10, 2006 8:47 pm

There should only need to be 1 folder.

Open one of the saved pages, right click where an image should be, and click properties. Now look at the address of the image. Is it something like "http://site.com/image.jpg" or is it more like "c:\site-folder\image.jpg"? It should point to the folder on your computer where the images and such were saved.
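The same check can be done for every image on the page at once. Below is a minimal Python sketch using only the standard library; `ImgSrcCollector` and `classify_sources` are hypothetical names, and "local" here simply means "anything that is not an http or https address".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def classify_sources(html_text):
    """Return (src, kind) pairs, where kind is 'remote' or 'local'."""
    collector = ImgSrcCollector()
    collector.feed(html_text)
    result = []
    for src in collector.sources:
        scheme = urlparse(src).scheme.lower()
        kind = "remote" if scheme in ("http", "https") else "local"
        result.append((src, kind))
    return result
```

If most of the images in a saved page come back as "remote", the page was saved as HTML only and the pictures are still being pulled from the original site.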

You might want to try changing browsers (which would be a good idea in any case). You could try Firefox or Opera, or my favorite, K-Meleon.

I can't give you much help with internet explorer since I haven't used that PoS in years, but I found a couple sites that might help:
http://askit.uq.edu.au/itanswers/quikit/1_msiesav.html
http://hsc.unm.edu/wbc/copy_save.htm
What version of IE do you have? I tried saving a page with IE6 and got 4 options: HTML only, webpage complete, text only, web archive. When saving google's homepage using webpage complete, I wound up with a file titled "google.htm" and a folder titled "google_files", which contains the images and other related files (like stylesheets and javascript files).


Edit: I just remembered the part of your first post about using XP SP2, which means that you must be using IE6. There is a well-known bug in IE that only lets you save images as BMP files instead of JPG or GIF or whatever. I wonder if this may be related? As I recall, the way to clear up that bug was to clear out your cache folder. You could try doing that and see if it helps.

robc
Posts: 164
Joined: Mon Nov 14, 2005 12:53 pm

Post by robc » Tue Jan 10, 2006 9:00 pm

Do you have a standard XP installation with IE 6, Outlook Express and all patches correctly installed? MHT support should definitely be there in this case. On the other hand, if you have one folder per saved item then "Web page, complete" is working; you should have:

- savedpage.htm
- savedpage_files folder (note the _ )

where of course "savedpage" stands for the file name you provide when saving. When saving this way, the browser automatically changes all references to linked images on the web in the savedpage.htm file into local references to the saved images in the savedpage_files folder.

Often, when the page writer has limited knowledge :evil: and uses images whose file names contain spaces, a problem arises: the link to the image in the htm file says, for example, "/savedpage_files/image with blanks.jpg", but the actual image in the folder has been correctly saved as "image%20with%20blanks.jpg". The "%20" is the (hexadecimal) code used to represent the space, since URLs cannot contain spaces by definition. This of course causes the "file not found" problem, which can be corrected by manually editing the file names and substituting each %20 (three characters) with a single space.
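As a rough illustration of that manual fix, this Python sketch renames every file in a folder whose name contains percent codes such as %20. The function name `rename_percent_encoded` is invented for the example, it decodes all percent codes rather than only %20, and it is worth trying on a copy of the folder first.

```python
from pathlib import Path
from urllib.parse import unquote


def rename_percent_encoded(folder):
    """Rename files such as 'a%20b.jpg' to 'a b.jpg'.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    # Materialise the listing first so renaming does not disturb iteration.
    for path in sorted(Path(folder).iterdir()):
        decoded = unquote(path.name)
        if not path.is_file() or decoded == path.name:
            continue  # not a file, or nothing to decode
        if "/" in decoded or "\\" in decoded:
            continue  # decoded name would not be a valid file name
        target = path.with_name(decoded)
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        renamed.append((path.name, decoded))
    return renamed
```

Run against the savedpage_files folder, it returns the list of names it changed, so it is easy to see what happened and undo it by hand if needed.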

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Tue Jan 10, 2006 9:25 pm

This is a copy of the address (URL) taken from the 'properties' of a JPG image in an article I copied from the web, where the image would not show as part of the article, i.e. just a red cross.

file:///C:/Documents%20and%20Settings/John%20London/My%20Documents/My%20Jottings/Good%20Reading/300_brebrow_image_no1.jpg
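For reference, an address like this can be turned back into an ordinary path with a few lines of Python. This is only a sketch: `file_url_to_path` is a hypothetical helper, and it assumes a local file: address with a Windows drive letter.

```python
from urllib.parse import unquote, urlparse


def file_url_to_path(url):
    """Convert a file: URL into a readable path with the %XX codes decoded."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError("not a file: URL")
    path = unquote(parsed.path)
    # A Windows drive path arrives as "/C:/..."; drop the leading slash.
    if len(path) >= 3 and path[0] == "/" and path[2] == ":":
        path = path[1:]
    return path
```

Applied to the address above, it gives C:/Documents and Settings/John London/My Documents/My Jottings/Good Reading/300_brebrow_image_no1.jpg. So the %20 codes in the address are just encoded spaces, and the image is being looked for directly in the Good Reading folder.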

I am using IE 6.0 and I definitely only have two "save page as" options.

The option to check the BMP_JPG_GIF will have to be explored in a short while

thanks for the attention guys

JohnL

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Tue Jan 10, 2006 9:27 pm

I have removed the first part of the previous file ref because it seemed to give the wrong impression of how the coding was written.

:/Documents%20and%20Settings/John%20London/My%20Documents/My%20Jottings/Good%20Reading/300_brebrow_image_no1.jpg

JohnL

Drahken
Posts: 884
Joined: Sun Apr 10, 2005 4:29 pm

Post by Drahken » Wed Jan 11, 2006 1:42 pm

Try editing that file and change all the %20 into spaces.
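If editing by hand is a pain, something along these lines could do the replacement. It is a minimal Python sketch under a few assumptions: `decode_spaces_in_src` is a made-up name, it only touches quoted src="..." values, and it relies on a simple regular expression rather than a real HTML parser.

```python
import re

# Matches src="..." or src='...' and captures the prefix, quote and value.
SRC_ATTR = re.compile(r'(\bsrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def decode_spaces_in_src(html_text):
    """Replace every %20 inside quoted src attributes with a plain space."""

    def fix(match):
        prefix, quote, value = match.groups()
        return f"{prefix}{quote}{value.replace('%20', ' ')}{quote}"

    return SRC_ATTR.sub(fix, html_text)
```

Read the saved page into a string, pass it through the function, and write the result to a new file, so the original stays untouched if the change does not help.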

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Thu Jan 12, 2006 6:19 pm

I've tried every way I know to edit this but have failed miserably - again :(

JohnL

helmut
Posts: 8153
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Post by helmut » Thu Jan 12, 2006 9:27 pm

There must be a reason why you have no "Save complete webpage" in Internet Explorer 6.0. You should try to solve this problem. Once this works, you will not have to hassle with changing HTML or taking other weird actions.

JohnL
Posts: 33
Joined: Tue Dec 20, 2005 11:10 pm
Location: The Quantocks - UK

Post by JohnL » Thu Jan 12, 2006 11:03 pm

I agree, but this facility has been unavailable since I bought the computer complete with Win XP.

Unfortunately no-one I have talked to can understand why it is missing nor how to get it back :(

JohnL

Xyzzy
Posts: 652
Joined: Tue Nov 23, 2004 10:17 pm
Location: Poland

Post by Xyzzy » Thu Jan 12, 2006 11:13 pm

Maybe this, found on the web:

"Cannot Save Web Page as a Web Archive File

When you attempt to save a Web page as a Web archive (.mht) file, "Web Archive (*.mht)" may be missing from the Save As Type box.

This behavior can occur if Microsoft Outlook Express 5 is not installed on your computer. The ability to save a Web page as a Web archive file is provided by the Inetcomm.dll file, which is installed by Outlook Express 5.
To resolve this behavior, install Outlook Express 5."

X.

XnTriq
Moderator & Librarian
Posts: 5430
Joined: Sun Sep 25, 2005 3:00 am
Location: Ref Desk

MHTML sans IE

Post by XnTriq » Fri Jan 13, 2006 3:20 am

:) Hi JohnL!
  • Mozilla Archive Format
    • maf.mozdev.org
      The MAF project is an archive extension that allows complete web pages to be saved in a single archive file. MAF stands for Mozilla Archive Format and the extension uses RDF to save page meta-data such as the original URL of the page and the date/time the page was put in the archive.
    • Firefox MAF Extension
      Allows complete web pages to be saved in a single archive file. Uses RDF to save page meta-data. It also allows pages to be saved in a separate MHTML compatible format for interoperability with IE systems.
    • Support forum for users of Mozilla Archive Format
  • aspNetMHT
    A Complete MHT/MHTML Component for ASP.NET, Winforms, and Windows & Web Services
    • Online Demo
      This demo will allow you to convert any URL to a MHT document. Once you enter your URL, and click “Convert to MHT”, aspNetMHT will fetch the URL's contents, convert them to a MHT, compress to a ZIP archive, and allow you to download.
And finally, here's a link to even more links: Please take a look at the list of programs I've posted here. (You'll have to scroll down to the “Related shareware” section.)

helmut
Posts: 8153
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Post by helmut » Fri Jan 13, 2006 5:07 pm

I don't mean "Complete web page (*.mht)". What I mean is "Complete web page (*.html; *.htm)". The latter will save the web page and all the images contained in the original webpage. There must be a reason why this entry is not available in John's Internet Explorer 6.
