Wednesday, 28 October 2009
When people say, in their delightfully sneaky blog posts, "But what's this? Here's a copy I cached earlier", what is it that they're actually doing?
I'm curious to know about the different methods used and if there's a 'best' way of doing this. I suppose it would have to be something that also doesn't permit tampering with the source code to falsify the webpage.
1. Rely on Google cache
Probably unreliable (websites can ask Google's crawler not to index or cache their pages, for instance through a robots.txt file or a noarchive meta tag), but at least search terms are nicely highlighted. I don't think this is easily falsifiable.
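To see how a site might tell a crawler to keep out, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content and the example.com addresses are made up for illustration, and this only covers crawling rules; whether a cached copy is offered can also depend on other signals, such as a noarchive meta tag, which this sketch does not look at.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented purely for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = crawler_may_fetch(
    EXAMPLE_ROBOTS_TXT, "Googlebot", "http://example.com/private/page.html"
)
allowed = crawler_may_fetch(
    EXAMPLE_ROBOTS_TXT, "Googlebot", "http://example.com/index.html"
)
print(blocked, allowed)
```

A page that is blocked like this is unlikely to turn up in a search engine's cache at all, which is one reason not to rely on it.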
2. File / Save As... / Web archive, single file (.mht)
The option presented to me by MSIE - I think I may have used this for "working offline" but not with any particular competence. Is this what people are doing? Does it save an entire website or just the page you're on? I don't think this is easily falsifiable either.
3. Save the html code and regurgitate as a page later on
View / Source gives a small Notepad file (which can be saved as .htm and then opened for editing in Notepad, or in any browser as a webpage) with all the text needed to recreate the page. Images need to be saved separately. Very, very falsifiable.
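One way to make a saved copy a bit harder to fiddle with unnoticed is to record a fingerprint (a SHA-256 hash) of the file at the moment you save it. The sketch below is only an illustration: the function names save_snapshot and is_unchanged, the file naming, and the folder are my own inventions, and the page bytes are assumed to have been fetched already (for example with urllib.request.urlopen(url).read()). A hash only shows that the file hasn't changed since the hash was written down, so it is convincing to other people only if the hash was noted somewhere you can't quietly edit afterwards.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(html: bytes, url: str, folder: Path) -> dict:
    """Write the page bytes to a timestamped .htm file and return a record
    containing the source URL, file path, capture time and SHA-256 hash."""
    digest = hashlib.sha256(html).hexdigest()
    captured = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"snapshot-{captured}.htm"
    path.write_bytes(html)
    return {
        "url": url,
        "file": str(path),
        "captured_utc": captured,
        "sha256": digest,
    }


def is_unchanged(path: str, expected_sha256: str) -> bool:
    """Re-hash the saved file and compare it with the recorded hash."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest() == expected_sha256
```

If anyone later edits the saved file in Notepad, is_unchanged will return False for the hash recorded at capture time.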
4. Wait for the Wayback Archive to do the work for you
Wayback archives a lot of pages, and they seem to appear about six months after the page was live. Changes might be hard to find, depending on how many 'impressions' the Archive takes of the page, unless you remember the date on which the information you want to record was available. Doesn't seem to be falsifiable.
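If you do remember the date, the Wayback Machine's addresses follow a pattern of web.archive.org/web/ followed by a timestamp and then the original address, and it shows the capture nearest to the timestamp you give. A small sketch for building such an address follows; the function name wayback_url is hypothetical, and of course it can't tell you whether a capture exists for that date.

```python
from datetime import date


def wayback_url(page_url: str, when: date) -> str:
    """Build a Wayback Machine address for page_url near the given date."""
    stamp = when.strftime("%Y%m%d")  # YYYYMMDD form of the date
    return f"https://web.archive.org/web/{stamp}/{page_url}"


print(wayback_url("http://example.com/", date(2009, 10, 28)))
# prints https://web.archive.org/web/20091028/http://example.com/
```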
5. Take a screenshot
Press the button marked PrtSc (or something similar) and a copy of the entire visible screen is copied to the clipboard. Paste (Ctrl+V) this into Paint or other image editing software to select the relevant bit and save as a .bmp (or .jpeg etc). Probably quite fiddly to falsify the picture of words in Paint, but it might be doable in other software.
6. Something clever on Firefox
I haven't used it for a while but I think there was a gadget which helped with caching pages.
This post is all about creating copies of web pages, but for more on 'finding old web pages' go here: http://www.searchengineshowdown.com/others/archive.shtml