Why does OWASP recommend escaping the forward slash in HTML?
The OWASP recommendations on escaping untrusted input for HTML element content list the following:
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;   &apos; is not recommended because it's not in the HTML spec (see section 24.4.1); &apos; is in the XML and XHTML specs.
/ --> &#x2F;   forward slash is included as it helps end an HTML entity
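For reference, the mapping above can be sketched in Python as a single-pass translation table. This is only an illustration of the listed rule; the names `HTML_ESCAPES` and `escape_html_content` are hypothetical and do not come from any OWASP library.

```python
# Escaping table as listed in the OWASP recommendation quoted above.
# str.translate works in a single pass, so the "&" inside the generated
# references is never escaped a second time.
HTML_ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
})


def escape_html_content(untrusted: str) -> str:
    """Escape untrusted text for placement inside HTML element content.

    Hypothetical helper for illustration only.
    """
    return untrusted.translate(HTML_ESCAPES)


if __name__ == "__main__":
    print(escape_html_content("</script>"))
```

Note that Python's built-in `html.escape` covers `&`, `<`, `>` and (by default) both quote characters, but leaves `/` untouched, which is exactly the difference this question is about.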
What is the purpose of including / in there? Indeed, / is part of a closing tag, but since we're already escaping < and >, what does escaping / add?
To understand why the forward slash matters, recall that before HTML5 came with a standard parsing algorithm, HTML 4 was defined in terms of SGML. Its SGML-based syntax included features that browsers supported unevenly and that many authors were not even aware of.
One of the more obscure features, and the one relevant to this question, is the Null End Tag (NET). Have a look at the following code for an HTML page:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
If you want, try putting it through an HTML validator. NET specifies that code like <p/text/ is parsed into the same thing you'd get from <p>text</p>. From here, you can see why it's a good idea to escape the / character when interpolating user-provided data into markup: a / could end up inside one of these NET constructions and be interpreted as the end tag instead of a literal solidus. Admittedly, no browser today understands the NET syntax, but it is part of the HTML 4 spec, so it's prudent to account for it.
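To make the risk concrete, here is a rough Python sketch of what a NET-aware parser would do with interpolated data. The regular expression is a deliberately simplified toy model of NET handling, not a real SGML parser, and the names `expand_net` and `escape_slash` are made up for this illustration.

```python
import re

# Toy model of SGML Null End Tag (NET) handling: "<tag/text/" is read as
# "<tag>text</tag>". This is an illustrative assumption, not a real parser.
NET_PATTERN = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/]*)/")


def expand_net(markup: str) -> str:
    """Rewrite each NET construct into explicit start and end tags."""
    return NET_PATTERN.sub(r"<\1>\2</\1>", markup)


def escape_slash(text: str) -> str:
    """Replace the solidus with a character reference so it cannot act as a NET."""
    return text.replace("/", "&#x2F;")


user_input = "1/2 cup"

# Unescaped: the "/" inside the data is taken as the end tag.
unescaped = expand_net("<p/" + user_input + "/")

# Escaped: only the author's trailing "/" closes the element.
escaped = expand_net("<p/" + escape_slash(user_input) + "/")

print(unescaped)  # <p>1</p>2 cup/
print(escaped)    # <p>1&#x2F;2 cup</p>
```

In the unescaped case the element is closed early by the slash in the user's data and the remainder spills out after it; with the slash escaped, the data stays inside the element as intended.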