Generating confusable, lookalike strings

The Unicode Consortium released a utility to generate confusable strings quite a while ago. Since I've seen people trying to create similar tools themselves recently, I thought it might be worth mentioning.

In case you haven't received the memo: confusables, also known as homoglyphs, lookalikes, or spoofs, are characters that visually resemble, or are indistinguishable from, another character. You can read more about them here or virtually anywhere else on the Web by searching for any of these terms. For example, the following entry pairs two characters that are easily confused:

FF21 ; 0041 ; SA # ( Ａ → A ) FULLWIDTH LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A
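
If you want to work with entries like this one yourself, here is a minimal Python sketch. It assumes the layout visible in the line above: source code point(s), then target code point(s), then a type tag, separated by semicolons, with a free-form comment after the #. The helper name parse_confusable_line is my own and is not part of any Unicode tool.

```python
import unicodedata


def parse_confusable_line(line):
    """Return (source, target) strings for one entry, or None if there is no data.

    Assumed layout: "SOURCE ; TARGET ; TYPE # comment", where SOURCE and
    TARGET are space-separated hexadecimal code points.
    """
    data = line.split("#", 1)[0]                 # drop the trailing comment
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 2 or not fields[0] or not fields[1]:
        return None                              # blank or comment-only line
    source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
    target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
    return source, target


entry = "FF21 ; 0041 ; SA # FULLWIDTH LATIN CAPITAL LETTER A -> LATIN CAPITAL LETTER A"
source, target = parse_confusable_line(entry)
print(unicodedata.name(source), "->", unicodedata.name(target))
```

The two characters render almost identically, but source == target is False, which is exactly what makes them useful for spoofing.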

Sometimes during penetration testing, we want to bypass profanity filters, spoof URLs or email addresses, or perform similar tasks. Being able to generate lookalike strings is quite useful in these cases, though of course it is not the only technique you will need. If you need this capability, go check out the Unicode Consortium's utility at http://unicode.org/cldr/utility/confusables.jsp, but please don't share this link with the bad guys.
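
To give a rough idea of what generating lookalike strings involves, here is a small Python sketch. It is not how the Unicode utility works internally; it simply substitutes characters from a tiny hand-picked table (a few Cyrillic letters and the fullwidth A from the example above) and enumerates the combinations. The table LOOKALIKES and the function lookalike_variants are hypothetical names of my own; a real tool would use the Consortium's full confusables data.

```python
from itertools import islice, product

# Hand-picked, deliberately tiny table (an assumption for illustration only).
LOOKALIKES = {
    "a": ["\u0430"],  # CYRILLIC SMALL LETTER A
    "e": ["\u0435"],  # CYRILLIC SMALL LETTER IE
    "o": ["\u043e"],  # CYRILLIC SMALL LETTER O
    "p": ["\u0440"],  # CYRILLIC SMALL LETTER ER
    "c": ["\u0441"],  # CYRILLIC SMALL LETTER ES
    "A": ["\uff21"],  # FULLWIDTH LATIN CAPITAL LETTER A
}


def lookalike_variants(text, limit=20):
    """Return up to `limit` strings that look like `text` but differ from it."""
    # For each position: the original character first, then its lookalikes.
    choices = [[ch] + LOOKALIKES.get(ch, []) for ch in text]
    variants = ("".join(combo) for combo in product(*choices))
    next(variants)  # the first combination is always the unmodified text
    return list(islice(variants, limit))


for variant in lookalike_variants("paypal.com", limit=5):
    # Print code points too, since the strings themselves look the same.
    print(variant, [hex(ord(ch)) for ch in variant])
```

The number of variants grows exponentially with the number of replaceable characters, hence the limit parameter.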