Saturday, December 24, 2011

about:cache

Going to about:cache in Firefox is a good way to get at the HTML transmitted by a web site when that site is trying to block you from viewing it. (Ctrl-U normally displays the HTML source in Firefox. When this doesn't work, it's time to get mad!)

Click on "List Cache Entries" to go to about:cache?device=disk and see a list of everything cached. Hey, there is a lot of noise here, huh? You can trim it down by clearing your cache, then immediately going to the page you wish to scrape, and then immediately returning to "List Cache Entries." Clear your cache in Firefox this way:

Tools > Options > Network > Clear Now

 
 

Clicking any one line item at "List Cache Entries" takes you to a summary page for that entry, which will often expose a path to the cache like so:

C:\Users\whatever\AppData\Local\Mozilla\Firefox\Profiles\whatever\Cache
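If you would rather poke at that folder directly, here is a minimal Python sketch, assuming the cache folder is an ordinary directory you can read. It just lists the most recently modified files, which (right after clearing the cache and loading one page) should be the ones for that page. The function name recent_cache_files and the limit parameter are my own inventions for illustration, and the files may well be in Firefox's own cache format rather than plain HTML, so treat this as a way to find candidates and not as a way to read them.

```python
import sys
from pathlib import Path


def recent_cache_files(cache_dir, limit=10):
    """Return up to `limit` regular files under cache_dir, newest first.

    Hypothetical helper: it only sorts by modification time and makes
    no attempt to decode Firefox's cache format.
    """
    files = [p for p in Path(cache_dir).rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Pass the cache path shown on the entry's summary page as the
    # first command-line argument.
    if len(sys.argv) > 1:
        for path in recent_cache_files(sys.argv[1]):
            print(path.stat().st_size, path)
```

Run it with the cache path from the summary page as the only argument and it prints a size and a path for each of the newest files.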

 
 

That said, an HTML scrape of the page should just run along the right side of the details. I use headspring.com in the image below as an example, but it was yfrog.com that inspired me to find a way around being blocked when I attempt to view HTML source.

Addendum 8/20/2015: I am commenting out http://a.yfrog.com/img857/6117/bg96.gif which yfrog has seen fit to replace with some sort of iTunes advertisement. I wish I had not started up my blog hosting images with these clowns.

Addendum 3/25/2014: Tools > Options > Network > Clear Now ...above should have "Advanced" between the "Options" and "Network." Maybe Firefox was different when I wrote this blog posting years ago. Maybe, alternatively, I just made a mistake.
