Friday, August 31, 2012

Hosting user-generated content is a very dangerous thing

Michal Zalewski of Google's security team: hosting user-generated content is a very dangerous thing

Google has been doing this for many years and has accumulated a great deal of experience from which firm conclusions can be drawn. The main conclusion, as voiced by Michal: hosting user-generated content is a very dangerous business.

Historically, all browsers and browser plug-ins were designed to display several types of content while tolerating any errors on the website. In the days of static HTML and simple web applications this approach was reasonable, because all the content was controlled by the webmaster.

That all changed in the mid-2000s, when a new problem surfaced: a clever attacker could manipulate the browser by feeding it seemingly harmless images and documents that it would then treat as HTML, Java, or Flash, and thereby gain the ability to execute malicious scripts in the context of the application displaying those documents (in effect, cross-site scripting).

Over the last few years web browsers have gradually improved. For example, browser vendors have begun to restrict how they guess the type of certain images and of content with an unknown MIME type. However, web standards themselves still contain behaviors that bypass this protection, such as ignoring the MIME type of any content loaded via object, embed, or applet. These are much harder to fix, and they can lead to vulnerabilities similar to the GIFAR bug.
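To make the MIME-type restrictions concrete, here is a minimal sketch of response headers a server might attach to user-uploaded files. It is not taken from Google's post: the allow-list and the function name user_content_headers are assumptions made for illustration. Content-Type, X-Content-Type-Options, and Content-Disposition are standard HTTP headers. Since plug-in loads can ignore the declared type, headers like these reduce the risk but do not remove it.

```python
# Illustrative allow-list; a real service would choose its own (assumption).
ALLOWED_TYPES = {"image/png", "image/jpeg", "image/gif", "application/pdf"}


def user_content_headers(declared_type: str) -> dict:
    """Build conservative response headers for a user-uploaded file.

    Hypothetical helper, not an API from any particular framework.
    """
    # Fall back to a generic binary type for anything not explicitly allowed,
    # so an uploaded "text/html" is never served as HTML.
    if declared_type in ALLOWED_TYPES:
        content_type = declared_type
    else:
        content_type = "application/octet-stream"
    return {
        "Content-Type": content_type,
        # Ask the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
        # Ask the browser to download the file instead of rendering it inline.
        "Content-Disposition": "attachment",
    }


headers = user_content_headers("text/html")
print(headers["Content-Type"])
```

Here an upload declared as text/html is served as application/octet-stream, because that type is not on the allow-list.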

During these years Google's information security team was actively involved in investigating many vulnerabilities related to how content is displayed. In fact, many of the countermeasures were first implemented in Chrome. Unfortunately, the problem has still not been solved: for every hole that is closed, researchers soon find a new way to exploit it or a new vulnerability in the browser. Two recent examples are the Byte Order Mark (BOM) vulnerability and the MHTML attack, which to this day are exploited successfully on the Internet.

For a while, Google's information security team pinned its hopes on inspecting content before it is executed, but over time the team realized that it is impossible to foresee everything. For example, Alexander Dobkin demonstrated a Flash applet composed only of letters and digits, and one person at Google managed to construct an image in which text commands can be embedded.
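A much simpler case than the ones above already shows why inspection alone is fragile. In this sketch a naive signature check accepts a file as a GIF even though the same bytes contain script markup. The function name looks_like_gif and the sample payload are invented for this illustration and do not come from the original post.

```python
def looks_like_gif(data: bytes) -> bool:
    """Naive inspection: only checks the 6-byte GIF signature."""
    return data[:6] in (b"GIF87a", b"GIF89a")


# A hypothetical upload: a valid GIF signature followed by markup that a
# browser could interpret if it decided to treat the file as HTML.
polyglot = b"GIF89a<script>alert(1)</script>"

print(looks_like_gif(polyglot))   # the signature check passes
print(b"<script>" in polyglot)    # yet the file also carries script markup
```

Stricter checks catch this particular payload, but the examples in the text show that even re-encoded or purely alphanumeric content can carry an attack, which is why filtering was not considered sufficient on its own.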

In the end, the Google security team concluded that user-generated content (UGC) must be stored on a separate, isolated domain (Google usually uses a dedicated domain of its own for this). In this "sandbox" the files pose virtually no risk to the web application, and authentication cookies stay safe as well.
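One reason a separate domain protects authentication cookies is cookie domain matching: a cookie scoped to the application's domain is sent to that domain and its subdomains, but not to an unrelated domain. The sketch below approximates that rule. The function name and the example hosts (app.example, usercontent.example) are placeholders, and the check ignores details such as IP-address hosts.

```python
def cookie_would_be_sent(cookie_domain: str, request_host: str) -> bool:
    """Approximate cookie domain matching for a cookie with a Domain attribute.

    Simplified for illustration: does not handle IP addresses or public suffixes.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    request_host = request_host.lower()
    # The cookie goes to the exact domain and to any of its subdomains.
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


# User files hosted on a subdomain of the application still receive its cookies.
print(cookie_would_be_sent("app.example", "files.app.example"))
# User files hosted on a separate domain do not.
print(cookie_would_be_sent("app.example", "usercontent.example"))
```

The separate domain is also a different origin, so scripts running in an uploaded file cannot read the application's pages under the browser's same-origin policy.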

Non-public user documents are more complicated. In its web applications Google uses three strategies, depending on the level of risk:

  •     In high-risk situations (e.g., documents whose URLs are at high risk of being disclosed), Google may combine tokens in the URL with short-lived, unique cookies for each document, issued for a particular subdomain. This mechanism, known inside Google as FileComp, helps protect Google's most critical applications.

  •     In other situations, where the risk is lower (embedded images), Google may limit itself to issuing URLs bound to a specific user.

  •     In low-risk situations, Google issues globally valid, long-lived URLs.
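The idea of a URL token bound to a specific user can be sketched with an HMAC signature. This is not a description of FileComp or of any Google implementation. The key, the token layout, the expiry field, and the function names are all assumptions chosen for illustration; only the hmac, hashlib, and time modules from the Python standard library are used.

```python
import hashlib
import hmac
import time
from typing import Optional

# Demo-only key (assumption); a real key would come from a secret store.
SECRET_KEY = b"demo-only-secret"


def make_token(user_id: str, doc_id: str, expires_at: int) -> str:
    """Sign (user, document, expiry) so the token is useless to other users.

    Assumes user_id and doc_id never contain the '|' separator.
    """
    message = f"{user_id}|{doc_id}|{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{expires_at}.{signature}"


def verify_token(token: str, user_id: str, doc_id: str,
                 now: Optional[int] = None) -> bool:
    """Accept the token only for the same user and document, before expiry."""
    try:
        expires_str, _signature = token.split(".", 1)
        expires_at = int(expires_str)
    except ValueError:
        return False  # malformed token
    current = time.time() if now is None else now
    if current > expires_at:
        return False  # token has expired
    expected = make_token(user_id, doc_id, expires_at)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(token.encode(), expected.encode())


token = make_token("alice", "doc1", expires_at=1000)
print(verify_token(token, "alice", "doc1", now=999))   # same user, not expired
print(verify_token(token, "bob", "doc1", now=999))     # different user
```

Under this scheme a leaked URL is of little use to another user, because the signature no longer verifies for a different user id. A globally valid, long-lived URL, as in the low-risk case, would simply leave the user and the expiry out of the signed message.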

Research into web browser security is ongoing, and the situation is changing rapidly. Michal Zalewski said that the Google security team monitors developments and reacts quickly when necessary.
