Advisory: BOM'ing Firefox's Javascript Interpreter

Damage: Filter evasion, cross-site scripting
: Insert Unicode byte order mark (BOM) U+FEFF into javascript statements to bypass filters.
Root Cause: Character absorption/swallowing
Product version: Firefox 3.01 and earlier

Link to Mozilla advisory:

Admittedly, this one seems to have little exploitative value compared to some of the others. But surely someone with more know-how than I have could find a useful exploit for it.

The Firefox developers already knew about this, since Dave Reed reported it in February, and were working on a fix.  This behavior could lead to all sorts of nastiness, such as enabling cross-site scripting and bypassing or evading HTML filters and WAFs.  To get to the point, here's what's possible by injecting the Unicode BOM U+FEFF into the Javascript:

<a h[U+FEFF]ref="javas[U+FEFF]cript[U+FEFF](ale[U+FEFF]rt('onclick')">

This issue was found years ago in Firefox's HTML interpreter, but it was either left hidden in the Javascript interpreter or reintroduced later.  I'm not sure which, but the current issue was only in Javascript, not HTML.

The Unicode byte order mark (BOM) is the character U+FEFF. It is normally used at the start of a file to tell the parser the encoding form and byte order.

Bytes          Encoding Form
00 00 FE FF    UTF-32, big-endian
FF FE 00 00    UTF-32, little-endian
FE FF          UTF-16, big-endian
FF FE          UTF-16, little-endian
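
As a rough illustration of how a parser might use the table above, here is a minimal sketch that maps the leading bytes of a file to an encoding form. The function name detectBom and the signature list are hypothetical, made up for this example, and are not taken from Firefox. Note that the UTF-32 little-endian signature has to be tested before the UTF-16 little-endian one, because both start with FF FE.

```javascript
// Hypothetical helper: returns the encoding form named by a leading BOM,
// or null when the bytes do not start with one of the signatures above.
function detectBom(bytes) {
  // Longer signatures come first so FF FE 00 00 is not mistaken for FF FE.
  const signatures = [
    { name: "UTF-32, big-endian", bytes: [0x00, 0x00, 0xfe, 0xff] },
    { name: "UTF-32, little-endian", bytes: [0xff, 0xfe, 0x00, 0x00] },
    { name: "UTF-16, big-endian", bytes: [0xfe, 0xff] },
    { name: "UTF-16, little-endian", bytes: [0xff, 0xfe] },
  ];
  for (const sig of signatures) {
    // A too-short input yields undefined at some index and fails the match.
    if (sig.bytes.every((b, i) => bytes[i] === b)) {
      return sig.name;
    }
  }
  return null;
}
```

Calling detectBom([0xfe, 0xff, 0x00, 0x41]) would report UTF-16, big-endian, while input without a BOM yields null.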

When U+FEFF occurs in the middle of a file, we might expect it to change the meaning of the surrounding text.  In other words, we wouldn't expect the interpreter to ignore it and accept the following as valid Javascript:

va[U+FEFF]r x = "x"; document.wr[U+FEFF]ite('ouch');

Yet that is what seems to happen: the above does run as valid Javascript.
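
To make the filter-evasion angle concrete, here is a small sketch of why this matters. It assumes a naive keyword filter (naiveFilterAllows, hypothetical) and models the swallowing behavior with a helper (swallowBom, also hypothetical) that drops every U+FEFF. Neither is Firefox code or any real WAF; they only illustrate the mismatch between what a filter sees and what the interpreter ends up running.

```javascript
const BOM = "\uFEFF";

// Hypothetical naive filter: allows input unless it contains a blocked keyword.
function naiveFilterAllows(input) {
  const blocked = ["document.write", "javascript:"];
  const lower = input.toLowerCase();
  return !blocked.some((word) => lower.includes(word));
}

// Hypothetical model of the swallowing behavior: every U+FEFF is dropped.
function swallowBom(input) {
  return input.split(BOM).join("");
}

// The keyword is split by a BOM, so the filter does not recognize it...
const bomPayload = `document.wr${BOM}ite('ouch');`;
const passedFilter = naiveFilterAllows(bomPayload);

// ...but after the BOM is swallowed, the blocked call is back.
const effectiveScript = swallowBom(bomPayload);
const wouldHaveBeenBlocked = !naiveFilterAllows(effectiveScript);
```

In this sketch the payload passes the filter, yet the script that would effectively run is the same one the filter would have blocked had it seen it without the BOM.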

Maybe the BOM character is stripped before it reaches the interpreter; I'm not sure.  Either way, the expected behavior would be an error condition.  The problem may lie with the Unicode specification too.  Regarding U+FEFF found in the middle of markup files, it says:
When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error. [AF]

The part that says 'can be ignored' might be what's happening here.  It seems that the Unicode handler removes the U+FEFF before passing the content to the Javascript interpreter.
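
For contrast, the other option in that quote, 'treated as an error', could look something like the sketch below. The function rejectInteriorBom is hypothetical and only shows the idea: a single leading U+FEFF is accepted as a byte order mark, and any later occurrence raises an error instead of being silently dropped.

```javascript
// Hypothetical strict handler: accept one leading BOM, reject any other U+FEFF.
function rejectInteriorBom(text) {
  // A BOM at the very start is a legitimate byte order mark; drop it.
  const body = text.startsWith("\uFEFF") ? text.slice(1) : text;

  // Any remaining U+FEFF sits in the middle of the content: fail loudly.
  const index = body.indexOf("\uFEFF");
  if (index !== -1) {
    throw new SyntaxError("Unexpected U+FEFF at offset " + index);
  }
  return body;
}
```

A filter or parser built this way would refuse the test strings shown earlier rather than quietly turning them back into working script.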

Here's a link to the test case.