Test Environment:

This page declares the UTF-8 charset in both the HTTP Content-Type header and the HTML Content-Type meta element.

Goal:

Test how URL components are normalized when they contain Unicode characters.

Description:

Test Unicode normalization using some of the character sequences from Unicode Standard Annex 15 "Unicode Normalization Forms" (TR15) and others from RFC3197. From TR15, use the Singleton from Figure 3, U+212B Å, which normalizes to U+00C5 Å under NFC and to U+0041 U+030A Å under NFD. Also use the multiple combining marks sequence from Figure 5, U+1E0B U+0323 ḍ̇, and the sequence U+1E9B U+0323 ẛ̣ from Figure 6, Compatibility Composites. The last sequence yields a different result under each of NFC, NFD, NFKC, and NFKD, so these few tests are enough to tell which of the four normalization forms, if any, was applied.
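As a reference point for reading the results, the following is a minimal sketch that prints what each of the four normalization forms does to the three test sequences. It uses Python's standard unicodedata module. The names SEQUENCES, code_points, and normalization_table are illustrative and not part of the test page.

```python
import unicodedata

# Test sequences written as escapes so that an editor or a copy/paste step
# cannot silently normalize them inside this file.
SEQUENCES = {
    "singleton": "\u212b",                       # ANGSTROM SIGN
    "multiple combining marks": "\u1e0b\u0323",  # d with dot above + dot below
    "compatibility composite": "\u1e9b\u0323",   # long s with dot above + dot below
}
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def code_points(text):
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def normalization_table(text):
    """Return the code points of `text` under each normalization form."""
    return {form: code_points(unicodedata.normalize(form, text)) for form in FORMS}


if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(f"{name}: {code_points(seq)}")
        for form, result in normalization_table(seq).items():
            print(f"  {form}: {result}")
```

Comparing the code points a user agent actually sends against this table indicates which form, if any, it applied.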

Test Cases:

Test 1: U+212B in the path, query, and fragment.

http://example.com/Å/?Å#Å

Test 2: U+1E0B U+0323 in the path, query, and fragment.

http://example.com/ḍ̇/?ḍ̇#ḍ̇

Test 3: U+1E9B U+0323 in the path, query, and fragment.

http://example.com/ẛ̣/?ẛ̣#ẛ̣
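To interpret what arrives on the wire, it may help to precompute the percent-encoded form of each test sequence with and without normalization. The sketch below assumes the user agent percent-encodes the component as UTF-8. That assumption may not hold for every component or every user agent, and finding out is part of what these tests are for. The function name expected_encodings is hypothetical.

```python
import unicodedata
from urllib.parse import quote

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def expected_encodings(text):
    """Map 'raw' and each normalization form to the percent-encoded UTF-8 of `text`.

    Assumption: the component is percent-encoded as UTF-8. 'raw' means no
    normalization was applied before encoding.
    """
    encodings = {"raw": quote(text, safe="")}
    for form in FORMS:
        encodings[form] = quote(unicodedata.normalize(form, text), safe="")
    return encodings


if __name__ == "__main__":
    tests = {
        "Test 1": "\u212b",
        "Test 2": "\u1e0b\u0323",
        "Test 3": "\u1e9b\u0323",
    }
    for label, seq in tests.items():
        print(label)
        for form, encoded in expected_encodings(seq).items():
            print(f"  {form}: {encoded}")
```

For Test 1, for example, an unnormalized path segment would appear as %E2%84%AB, while an NFC-normalized one would appear as %C3%85.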