How to use Google's internet caching to your advantage

November 14, 2012    google disprot internet caching search tricks tutorial

You’ve run into this problem: you search for something on the supposedly all-powerful Google search, click the link, but it’s dead. Nothing’s there. Not even a speck of information.

But Google said there was!

I ran into this exact problem for an assignment this week. We were supposed to use the disordered protein database DisProt and the PONDR-FIT algorithm to find a protein that is predicted to be disordered but is not in the DisProt database. However, the DisProt servers went down at noon yesterday. With the assignment due today, what were we to do? Well, the due date was pushed back, but I figured out how to search DisProt even though it was down, and learned a bit about Google’s internet caching (aka their plan to save all the knowledge ever publicly posted to the web) in the process.

Viewing cached pages: step-by-step

  1. Do a Google search as usual
  2. Hover your cursor over the result you want, and you should see two arrowheads pointing to the right (»). Click on them.
  3. A preview of your page should appear on the right.

Below is an example:

[Image: How to view cached pages]
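If you would rather skip the hovering, the same idea can be sketched in a few lines of Python. This is only a rough sketch: it assumes Google still honors the `cache:` search operator (it did at the time of writing, but that may change), and both the function name `cached_search_url` and the host `www.disprot.org` are my own illustrative choices.

```python
from urllib.parse import urlencode


def cached_search_url(page: str) -> str:
    """Build a Google search URL that asks for the cached copy of `page`."""
    # Assumption: the "cache:" operator is still supported by Google.
    query = {"q": f"cache:{page}"}
    # urlencode percent-escapes the colon, which Google accepts.
    return "https://www.google.com/search?" + urlencode(query)


if __name__ == "__main__":
    # Hypothetical target: the DisProt home page.
    print(cached_search_url("www.disprot.org"))
```

Paste the printed URL into your browser. If the operator is no longer available, fall back to the manual steps above.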

Turn on “Verbatim” mode to narrow your results

But what if you need to validate negative results, as we did for our assignment? Use “Verbatim” mode, which stops Google from auto-correcting your spelling (something it normally does to get you more hits).

[Image: Without “Verbatim” mode]

Without “Verbatim” mode, my “disprot oct4” query gets mangled to “disport oct4” and I get lyrics for some weird band I’ve never heard of.

To turn on “Verbatim” mode:

  1. Do a Google search as usual
  2. Click “Search Tools” in the bar below the search box
  3. Click “All results” (you want to filter so you don’t get all results)
  4. Click “Verbatim”

A visual tutorial is here:

[Image: With “Verbatim” mode]

With “Verbatim” mode on, I get totally weird results, which means that “oct4” is indeed nowhere to be found on DisProt!
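If you run a lot of these checks, clicking through the menus gets old. Here is a minimal Python sketch that builds a Verbatim search URL directly. It assumes that the `tbs=li:1` parameter (which is what shows up in the address bar when Verbatim is switched on) keeps working; it is not an official API, and the function name `verbatim_search_url` is just something I made up for illustration.

```python
from urllib.parse import urlencode


def verbatim_search_url(query: str) -> str:
    """Build a Google search URL with Verbatim mode switched on."""
    # Assumption: "tbs=li:1" is the URL parameter behind the Verbatim menu item.
    params = {"q": query, "tbs": "li:1"}
    # urlencode turns spaces into "+" and escapes the colon.
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(verbatim_search_url("disprot oct4"))
```

Open the printed URL and check that the results page says Verbatim is active before trusting a negative result.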

Final note

Someone may have removed content from the internet for a reason such as copyright infringement. Please respect the owner of the intellectual property and do not use this method to crawl for copyrighted data.


