Test Environment:

This page declares the iso-8859-7 charset in both the HTTP Content-Type header and the HTML Content-Type declaration.

Goal:

The tests check how hyperlinks are displayed in the status bar and how they are sent on the wire. Is the page encoding maintained for display, or is the reference converted to UTF-8? Are byte sequences checked for "UTF8ness" during display?

Description:

Although the page encoding is iso-8859-7, the hyperlinks contain either a raw byte sequence < EF BC A1 > or a percent-encoded sequence %EF%BC%A1 that is also valid UTF-8, a property referred to as "UTF8ness" herein. Decoded as UTF-8, this byte sequence represents U+FF21 FULLWIDTH LATIN CAPITAL LETTER A. The purpose is to test whether browsers perform some sort of "UTF8ness" check on the hyperlink, decoding the byte sequence as a UTF-8 character before presenting it for display.
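As a rough illustration of the two interpretations described above, the following Python sketch decodes the same three bytes first as UTF-8 and then as iso-8859-7. The helper name code_points is hypothetical, and Python's codecs only stand in for whatever decoding a browser actually performs.

```python
# The three bytes that appear in the test hyperlinks.
raw = bytes([0xEF, 0xBC, 0xA1])

# Interpreted as UTF-8: a single character.
as_utf8 = raw.decode("utf-8")

# Interpreted in the declared page encoding: three separate characters.
as_page = raw.decode("iso-8859-7")


def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(as_utf8))  # ['U+FF21']
print(code_points(as_page))  # ['U+03BF', 'U+038C', 'U+2018']
```

A browser that applies a "UTF8ness" check would be expected to show the single fullwidth letter, while one that keeps the page encoding would show three characters.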

The tests also include either a raw byte < FC > or the percent-encoded %FC, which in iso-8859-7 corresponds to U+03CC GREEK SMALL LETTER OMICRON WITH TONOS.
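To make the two possible on-the-wire outcomes concrete, this sketch percent-encodes a path segment containing U+03CC once with the page encoding and once with UTF-8. It uses Python's urllib.parse.quote as a stand-in and does not claim to reproduce any particular browser's behavior; the variable names are illustrative.

```python
from urllib.parse import quote

# The raw bytes of the path segment as they appear in the page source,
# decoded with the declared page encoding (0xFC becomes U+03CC).
segment = b"D\xfcrst".decode("iso-8859-7")

# Hypothesis 1: the page encoding is kept when the request is generated.
kept_page_encoding = quote(segment, encoding="iso-8859-7")

# Hypothesis 2: the reference is converted to UTF-8 before percent-encoding.
converted_to_utf8 = quote(segment, encoding="utf-8")

print(kept_page_encoding)  # D%FCrst
print(converted_to_utf8)   # D%CF%8Crst
```

Comparing the request line captured on the wire against these two strings indicates which conversion, if either, was applied.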

Test Cases:

Test 1: Percent-encoded "UTF8ness" sequence in the path, percent-encoded %FC in the query

http://www.example.com/%EF%BC%A1/?D%FCrst

Test 2: Percent-encoded %FC in the path

http://www.example.com/D%FCrst/

Test 3: Raw byte 0xFC in the path, raw "UTF8ness" bytes 0xEF 0xBC 0xA1 in the query string value

http://www.example.com/Dürst/?A

Test 4: Raw byte 0xFC with percent-encoded %FC in path

http://www.example.com/Dü%FCrst/
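The "UTF8ness" of the percent-encoded test cases can be checked mechanically. The sketch below percent-decodes the path and query of Tests 1 and 2 into bytes and reports whether each component is valid UTF-8. The function names is_valid_utf8 and utf8ness are hypothetical, and Tests 3 and 4 are left out because their raw bytes cannot be reproduced faithfully in an ASCII listing.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def utf8ness(url: str) -> dict:
    """Percent-decode path and query to bytes and test each for UTF-8 validity.

    Note that an empty component counts as valid UTF-8.
    """
    parts = urlsplit(url)
    return {
        "path": is_valid_utf8(unquote_to_bytes(parts.path)),
        "query": is_valid_utf8(unquote_to_bytes(parts.query)),
    }


urls = {
    "Test 1": "http://www.example.com/%EF%BC%A1/?D%FCrst",
    "Test 2": "http://www.example.com/D%FCrst/",
}

for name, url in urls.items():
    print(name, utf8ness(url))
```

In Test 1 only the path passes the check, since a lone 0xFC byte is never valid UTF-8; in Test 2 the path fails for the same reason.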