Sunday 12 December 2010

Can Google really be used as a proxy server - to avoid detection?

I've seen it mentioned in various places that Google (and other search engines) could be used as a pseudo proxy server, either to get around content security solutions or to avoid direct contact with a target website.

Why might someone want to do this?
  1. To bypass filtering controls such as URL-based web filters, which are deployed internally by many companies to enforce acceptable usage policies.
  2. To footprint a site or organization before an attack, without contacting the site directly or leaving any trace of the attacker's IP address in the site's HTTP or firewall logs.
  3. As a way to obfuscate direct URL attacks such as RFI and LFI, directory traversal, SQL injection etc., again without leaving any trace of your IP address in logs.
In other words, for privacy or to hide malicious intent.

Do not use these techniques for malicious purposes; this article is for education only.

There are three methods that I have explored which we will look at here.
  • Google cache
  • Google translation service
  • Google wireless transcoder
I will also briefly discuss the limitations of these techniques. The examples in this article use www.bbc.co.uk, purely for illustration.


Google's cache

So, when you search for a site in Google, you can either go to the site, or view the content that Google cached from the site the last time Googlebot was there.

All very well and good, and certainly lots of the content does come from the Google cache, rather than the original server.

The cached content can be requested more directly with a URL such as:

http://webcache.googleusercontent.com/search?q=cache:www.bbc.co.uk

However, depending on how much Google has cached (which varies from site to site), not everything comes from the cache. Images, for example, are often still loaded from the original site.

So this would likely trigger logging and content security solutions. (Not exactly "low profile".)

One way to avoid direct contact with the site would be to add a dummy host entry to the hosts file on your system, to avoid ever going to www.bbc.co.uk directly (in our example):

127.0.0.1       localhost
127.0.1.1       backtrack bt
127.0.2.1       www.bbc.co.uk
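
A quick way to sanity-check an override like this is to parse the hosts entries and confirm that the name points at a loopback address. The following is only a rough sketch: the helper names (parse_hosts, is_sinkholed) are my own, and it works on a pasted copy of the entries above rather than reading the real hosts file.

```python
import ipaddress

# Copy of the example entries above; on a real system this text would be
# read from the hosts file instead (assumption: standard hosts syntax).
HOSTS_TEXT = """\
127.0.0.1       localhost
127.0.1.1       backtrack bt
127.0.2.1       www.bbc.co.uk
"""


def parse_hosts(text):
    """Map each hostname to the first address listed for it."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def is_sinkholed(hostname, mapping):
    """True if the hostname is overridden with a loopback address."""
    address = mapping.get(hostname.lower())
    return address is not None and ipaddress.ip_address(address).is_loopback


mapping = parse_hosts(HOSTS_TEXT)
print(is_sinkholed("www.bbc.co.uk", mapping))
print(is_sinkholed("news.bbc.co.uk", mapping))
```

Note that a hosts entry only covers the exact name listed. Anything the page loads from a different hostname on the same site would need its own entry.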


The cached content is static, so as far as I can see it cannot be used for URL attacks.


Google's translation service

The Google translation service is pretty handy if you want to view a foreign language site in your own language.

This feature could also be misused as a proxy. In this example we translate an already English site from Korean to English.

We would choose a language such as Korean, Japanese etc, so that there are no substitutions for our English content. (The English text is left as is.)

http://translate.google.com/translate?sl=ko&tl=en&u=www.bbc.co.uk

i.e. sl = ko (source language), tl = en (target language), u = www.bbc.co.uk (the URL to fetch)
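
For illustration, the same URL can be assembled from its three parameters. This is a minimal sketch; the function name translate_url is my own, and the endpoint and parameter names are simply the ones from the example URL above.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
TRANSLATE_ENDPOINT = "http://translate.google.com/translate"


def translate_url(target, source_lang="ko", target_lang="en"):
    """Build a translate URL: sl = source language, tl = target language, u = URL."""
    params = {"sl": source_lang, "tl": target_lang, "u": target}
    # urlencode percent-encodes any characters in the values that need it.
    return f"{TRANSLATE_ENDPOINT}?{urlencode(params)}"


print(translate_url("www.bbc.co.uk"))
```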

As with the cache example above, this method can strip some content, such as video, from the pages.

Unfortunately, images are often still referenced from, and hosted on, the original site, so viewing the page could still trigger logging and content security controls. Adding a hosts entry (as above) would prevent that direct contact.
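
To see which resources would still be fetched straight from the original site, you can scan the returned HTML for automatically loaded tags that point at a non-Google host. This is a rough sketch only: the sample HTML and the example.org hostnames are made up for illustration, and the list of tags and Google host suffixes is an assumption, not a complete inventory.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# (tag, attribute) pairs a browser typically fetches without user action.
AUTO_LOADED = {("img", "src"), ("script", "src"), ("iframe", "src"), ("link", "href")}


class ResourceCollector(HTMLParser):
    """Collect URLs of resources the browser would load automatically."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in AUTO_LOADED and value:
                self.urls.append(value)


def direct_hosts(html, proxy_suffixes=("google.com", "googleusercontent.com")):
    """Return hosts that would be contacted directly, i.e. not via Google."""
    collector = ResourceCollector()
    collector.feed(html)
    hosts = set()
    for url in collector.urls:
        host = urlparse(url).hostname  # None for relative URLs
        if host and not any(host == s or host.endswith("." + s) for s in proxy_suffixes):
            hosts.add(host)
    return hosts


# Hypothetical page fragment, not real output from any Google service.
SAMPLE = """
<img src="http://static.example.org/logo.png">
<script src="http://translate.google.com/some.js"></script>
<a href="http://www.example.org/news">News</a>
"""
print(direct_hosts(SAMPLE))
```

Each host reported this way would need its own hosts entry if you wanted to avoid touching it.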


Google's wireless transcoder

Google has a wireless transcoder, to reduce web content size in preparation for delivery to small wireless devices such as phones, iPhones and Blackberries.

This application is accessible here: http://google.com/gwt/n

This is a good one if you are just after some text from a site, but the service can also shrink images for some sites (so these shrunken images will come from Google rather than the original site).

http://google.com/gwt/x?u=www.bbc.co.uk

The content is very cut-down (and the structure significantly changed as a result) but it all comes from Google rather than the target site, which could be an advantage.


Is it possible to pass parameters?

So, is it possible to pass parameters in a URL using these methods?

Google cache - nope

Google translation service - an example search on ebay:

http://translate.google.com/translate?sl=ko&tl=en&u=http://books.shop.ebay.co.uk/Non-Fiction-/171243/i.html?_nkw=ceh&_catref=1&_fln=1

Google wireless transcoder - an example search on ebay:

http://google.com/gwt/x?u=http://books.shop.ebay.co.uk/Non-Fiction-/171243/i.html?_nkw=ceh&_catref=1&_fln=1

This suggests that URL attacks may be possible using the translation service and the wireless transcoder.
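
One caveat with the example URLs above: the target URL's own "&" separators are not percent-encoded, so a strict parser would read _catref and _fln as parameters of the Google URL rather than of the eBay one. How Google's services actually handle this is not something I have verified, so treat the following as a sketch of the safer encoding only. The wrap helper is my own name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The eBay search URL from the examples above.
TARGET = ("http://books.shop.ebay.co.uk/Non-Fiction-/171243/i.html"
          "?_nkw=ceh&_catref=1&_fln=1")


def wrap(endpoint, target, **extra):
    """Wrap target in a service URL, percent-encoding it as the u parameter."""
    params = dict(extra, u=target)
    return f"{endpoint}?{urlencode(params)}"


raw = "http://google.com/gwt/x?u=" + TARGET          # as written in the article
encoded = wrap("http://google.com/gwt/x", TARGET)    # nested URL encoded

# A standards-style parser splits the raw form on every '&' ...
print(parse_qs(urlsplit(raw).query))
# ... but sees a single 'u' parameter holding the whole target when encoded.
print(parse_qs(urlsplit(encoded).query))

# The same helper covers the translation service:
print(wrap("http://translate.google.com/translate", TARGET, sl="ko", tl="en"))
```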


Summary

These are interesting techniques, but not particularly effective in my opinion. If you are looking for privacy, it would be far more effective to use an anonymous web proxy, or Tor (or both).

It may be possible to use these techniques to bypass some content security solutions, but there are mitigations.


Mitigations

It may be possible to block the Google translation service, cache, and wireless transcoder on a company content security solution. This would go some way towards limiting this type of obfuscation, but at the cost of some legitimate functionality.
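
A less drastic option than blocking the services outright is to pull the embedded target out of each request and apply the normal URL policy to that instead. The sketch below assumes the three URL shapes shown in this article; the function name embedded_target and the blocklist are hypothetical, and a real filter would have to cope with other hostnames and URL forms.

```python
from urllib.parse import parse_qs, urlsplit


def embedded_target(url):
    """Return the site requested via cache, translate or gwt, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    query = parse_qs(parts.query)

    if host == "webcache.googleusercontent.com":
        q = query.get("q", [""])[0]
        return q[len("cache:"):] if q.startswith("cache:") else None
    if host == "translate.google.com" and parts.path == "/translate":
        return query.get("u", [None])[0]
    if host in ("google.com", "www.google.com") and parts.path.startswith("/gwt/"):
        return query.get("u", [None])[0]
    return None


# Hypothetical policy list for illustration.
BLOCKED = {"www.bbc.co.uk"}

for url in (
    "http://webcache.googleusercontent.com/search?q=cache:www.bbc.co.uk",
    "http://translate.google.com/translate?sl=ko&tl=en&u=www.bbc.co.uk",
    "http://google.com/gwt/x?u=www.bbc.co.uk",
    "http://google.com/gwt/n",
):
    target = embedded_target(url)
    print(url, "->", target, "(blocked)" if target in BLOCKED else "")
```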



How to get information from Google to pursue and track criminals

Of course, if you are thinking of using these techniques to aid illegal activities such as URL-based attacks, then don't.

Remember that law enforcement frequently request and obtain log information from companies like Google (though it is unclear how much of this information is actively logged).

If you have additional questions about obtaining legal information from Google, then you can contact them at
legal-support <-at-> google <-dot-> com

Subpoena and legal requests could be sent to:

Attention: Custodian of Records
Google, Inc.
1600 Amphitheatre Parkway
Mountain View
CA 94043
US
