What does the javascript tab character do?
Some context: I was assigned to a pentest and found an application that let me place my own links in a tag's href attribute. As expected, suspicious values like [removed] were correctly filtered by an XSS filter. However, I discovered that this filter could be bypassed by injecting a TAB character in the middle of the protocol specification, like so:
What I would like to know is why browsers accept this as a valid javascript URL that will happily execute code, whereas other characters, such as the SPACE character, are not allowed. Is there a historical reason for allowing this strange format?
Note: Tested on Chrome. For those who want to try it, it also works with [removed], like so:
I found an old discussion that you might find interesting for understanding the possible historical reasons behind this choice.
That's a 19-year-old bug in Mozilla. The problem was that one website was not working as expected because Mozilla didn't strip tab characters inside the URL of a link. The page worked as expected in Internet Explorer, which apparently ignored the tab character. Tabs are often used for indentation in HTML files, so you can sometimes expect a few tabs after a new line. Somebody cited an IETF standard suggesting that "whitespace should be ignored when extracting the URI". However, others were not convinced that removing all whitespace characters would be a good idea, because you might run across URIs with unencoded spaces (for example: https://www.example.com/path with spaces/), even though such URIs are invalid, at least according to current standards. Therefore they decided just to add tabs to the list of removed characters (carriage-return and line-feed characters were already being removed). Note, though, that spaces are allowed, and ignored, when they appear at the beginning or at the end of the URI.
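As far as I can tell, the same rule survives in today's URL parsing: ASCII tabs and newlines are removed before parsing, and leading and trailing spaces are trimmed, but a space in the middle is kept. Here is a minimal sketch to observe this, assuming a runtime that provides the standard `URL` class (Node.js or a modern browser). `schemeOf` is a hypothetical helper name, not part of any API.

```javascript
// Hypothetical helper: return the scheme the URL parser sees,
// or null if the input cannot be parsed as an absolute URL.
function schemeOf(input) {
  try {
    return new URL(input).protocol;
  } catch (err) {
    return null;
  }
}

const withTab = schemeOf("java\tscript:alert(1)");     // tab inside the scheme
const withNewline = schemeOf("java\nscript:alert(1)"); // newline inside the scheme
const withSpace = schemeOf("java script:alert(1)");    // space inside the scheme
const padded = schemeOf("  javascript:alert(1)  ");    // leading/trailing spaces

console.log({ withTab, withNewline, withSpace, padded });
```

The tab, newline, and padded variants should all be reported as `javascript:`, while the variant with a space inside the scheme fails to parse and yields `null`.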
So I suppose the historical reason for this choice is that they wanted to make sure the following code would work:
However, they did not check exactly where the spaces or tabs appeared in the URL; they simply decided to keep spaces and remove tabs. As a result, the first example doesn't work if you use spaces for indentation, and tab characters can be included anywhere in the URL without affecting anything (so even a tab inside the javascript scheme name is accepted).
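This is also why a filter that inspects the raw string can be bypassed. The sketch below contrasts a naive prefix check with one that asks the URL parser which scheme it actually sees. Both function names and the base URL `https://example.invalid/` are assumptions made for illustration; this is not the filter from the application in the question.

```javascript
// Assumed base URL so that relative links can be parsed too.
const BASE = "https://example.invalid/";

// Naive check: looks only at the raw string, so "java\tscript:" slips through.
function naiveIsJavascriptUrl(href) {
  return href.trim().toLowerCase().startsWith("javascript:");
}

// Parser-based check: let the URL parser strip tabs and newlines first,
// then compare the scheme it ends up with.
function parsedIsJavascriptUrl(href) {
  try {
    return new URL(href, BASE).protocol === "javascript:";
  } catch (err) {
    return false; // could not be parsed as a URL at all
  }
}

const payload = "java\tscript:alert(1)";
console.log(naiveIsJavascriptUrl(payload));  // raw-string check misses the tab
console.log(parsedIsJavascriptUrl(payload)); // parser-based check sees javascript:
```

With a space instead of a tab, the parser-based check treats the value as a relative path under the base URL, so it is not reported as a javascript URL, which matches the behaviour described above.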