Monthly Archives: April 2009

Ultrafast UTF-8 decoder by Bjoern Hoehrmann

I believe several parties are still testing this, but it is clearly a highly optimized implementation of a UTF-8 decoder. Bjoern Hoehrmann recently released his Flexible and Economical UTF-8 Decoder. Check it out: // Copyright (c) 2008-2009 Bjoern Hoehrmann …

Posted in Unicode

Unicode security attacks and test cases – fuzzing with Unicode

When it comes to fuzzing parsers, protocols, and other software, I want the fuzzer to be capable of producing tests specific to Unicode. Here’s what it should do at a minimum: Generate half a surrogate pair in UTF-8 or UTF-16 …
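
As a rough illustration of the first item, here is a minimal Python sketch that emits the bytes for half a surrogate pair in UTF-8 and UTF-16. The function name `lone_surrogate_samples` is hypothetical and not taken from any fuzzer. It relies on Python's `surrogatepass` error handler, which lets the codecs write a surrogate code point they would otherwise refuse.

```python
def lone_surrogate_samples(code_point=0xD800):
    """Return ill-formed byte sequences that encode one unpaired surrogate.

    Hypothetical helper for illustration only; a real fuzzer would likely
    embed these bytes inside otherwise valid input.
    """
    if not 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a surrogate code point")
    ch = chr(code_point)
    return {
        # "surrogatepass" tells the codec to encode the surrogate anyway.
        "utf-8": ch.encode("utf-8", "surrogatepass"),
        "utf-16-le": ch.encode("utf-16-le", "surrogatepass"),
        "utf-16-be": ch.encode("utf-16-be", "surrogatepass"),
    }


def strictly_decodable(data, encoding):
    """Report whether a strict decoder accepts the bytes."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


samples = lone_surrogate_samples(0xD800)
for encoding, data in samples.items():
    print(encoding, data.hex(), "accepted:", strictly_decodable(data, encoding))
```

A conforming decoder should reject every one of these sequences, so a target that accepts them silently is worth a closer look.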

Posted in Unicode, security, testing

Unicode security attacks and test cases – Normalization expansion for buffer overflows

Normalization, like casing operations, can change the number of characters and bytes in a string. In testing software, I want to know how to get the most bang for my buck – in other words, what’s the minimal …
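
To make the size change concrete, here is a small Python sketch that measures how a string grows under a given normalization form, using the standard `unicodedata` module. The helper name `expansion` and the two sample characters are illustrative choices, not necessarily the ones the full post settles on. U+FDFA is a commonly cited case of large growth under the compatibility forms, and U+0390 shows growth under NFD.

```python
import unicodedata


def expansion(text, form):
    """Return (code points before, code points after,
    UTF-8 bytes before, UTF-8 bytes after) for one normalization form."""
    normalized = unicodedata.normalize(form, text)
    return (
        len(text),
        len(normalized),
        len(text.encode("utf-8")),
        len(normalized.encode("utf-8")),
    )


# Illustrative inputs (assumed for this sketch): an Arabic ligature and a
# Greek letter with two combining marks folded into one code point.
candidates = {"U+FDFA": "\uFDFA", "U+0390": "\u0390"}

for name, ch in candidates.items():
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(name, form, expansion(ch, form))
```

A buffer sized from the pre-normalization length can overflow once the normalized string is written into it, which is why the growth per input character is the number to watch.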

Posted in Unicode, software, testing