URL Testing

URLs are a cornerstone of the Internet and the Web, but they are often misunderstood, occasionally abused, and frequently manipulated during security testing.  I've put up some Web pages for testing URL parsing, including one that shows the parsed result in a live view. I've also compiled over 500 test cases into JSON format from a number of sources, including +WebKit, +Julian Reschke, and +Eduardo Vela, as well as my own; these were improved by +Michael Smith and +Anne van Kesteren.

URLs have been a hot topic for quite a while, and as a co-chair of the IETF's IRI Working Group, I witnessed some of the conversations around the specs.  The Internationalized Resource Identifier specifications were intended to define how RFC 3986 URIs could be extended to carry Unicode characters, primarily, and also other character encodings for legacy purposes.

The URL test page at http://www.lookout.net/test/url/ includes a few links, and the code and test cases can be found at https://github.com/cweb/url-testing.

1. A URL Live Viewer
A page that dynamically displays URL components as parsed by the browser's DOM and by the URL.js prototype implementation.  The live view idea came from +Anne van Kesteren's live URL DOM viewer page, to which I have lost the link.
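
As a rough illustration of what such a viewer displays, here is a minimal sketch that splits a URL string into its components. It uses the standard URL constructor as a stand-in for the browser's DOM and for URL.js, so it is not the code behind the page; the function name describeUrl is made up for this example.

```javascript
// Hypothetical helper: split a URL string into the components a live
// viewer would show. Uses the standard URL constructor (available in
// browsers and in Node.js), not the page's actual DOM or URL.js code.
function describeUrl(input, base) {
  let url;
  try {
    // Only pass a base when one was supplied.
    url = base === undefined ? new URL(input) : new URL(input, base);
  } catch (err) {
    // The constructor throws a TypeError when parsing fails.
    return { input: input, valid: false };
  }
  return {
    input: input,
    valid: true,
    href: url.href,
    protocol: url.protocol,
    username: url.username,
    password: url.password,
    hostname: url.hostname,
    port: url.port,
    pathname: url.pathname,
    search: url.search,
    hash: url.hash,
  };
}

// Example: print the parsed components of one URL.
console.log(describeUrl("http://user:pw@EXAMPLE.com:8080/a/../b?x=1#frag"));
```

Feeding the same input to two parsers and printing both results side by side is the essence of the live view: any component that differs stands out immediately.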

2. A URL test runner
A page that runs 500+ URL test cases through the testharness.js hosted by W3C.
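
The idea behind the runner can be sketched without testharness.js. The example below is an assumption-heavy stand-in: the test case shape (input, expected) is invented for illustration and is not the format of the JSON files in the repository, and the parser under test is simply the standard URL constructor.

```javascript
// Hypothetical test cases. The field names are illustrative only and do
// not reflect the real JSON format used by the url-testing project.
const cases = [
  { input: "http://EXAMPLE.com/a/./b",
    expected: { hostname: "example.com", pathname: "/a/b" } },
  { input: "http://example.com/a b",
    expected: { hostname: "example.com", pathname: "/a%20b" } },
  { input: "http://[::1", expected: null }, // expected to fail parsing
];

// Run one case: parse the input and list every component that differs
// from the expectation. A null expectation means "parsing should fail".
function runCase(testCase) {
  let parsed = null;
  try {
    parsed = new URL(testCase.input);
  } catch (err) {
    parsed = null;
  }
  if (testCase.expected === null || parsed === null) {
    const bothFailed = testCase.expected === null && parsed === null;
    return { input: testCase.input, pass: bothFailed, mismatches: [] };
  }
  const mismatches = Object.keys(testCase.expected)
    .filter((key) => parsed[key] !== testCase.expected[key]);
  return {
    input: testCase.input,
    pass: mismatches.length === 0,
    mismatches: mismatches,
  };
}

const results = cases.map(runCase);
const failures = results.filter((result) => !result.pass);
console.log(results.length + " cases run, " + failures.length + " failed");
```

In the real page each case would be reported through testharness.js rather than collected into an array, but the per-component comparison is the same idea.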

3. Test URL parsing in DOM versus HTTP GET requests
Run all tests from urls-local.json using testharness.js, comparing the Web browser's DOM properties with the path and hostname of the resulting HTTP request.  The value of this test scenario is that it detects when the URL components the browser exposes in the DOM differ from the ones it actually sends in the HTTP GET.
This test is more complicated because it has the following server-side requirements:
  • mod_rewrite configured to proxy all requests for a pre-determined hostname pattern:
# Redirect everything that includes urltest.lookout.net in the hostname for URL testing
RewriteCond %{HTTP_HOST} ^.*urltest\.lookout\.net$ [NC]
RewriteRule ^.*$ /cgi-bin/httpreq.pl
With the above RewriteCond, each test case must point to a URL that includes urltest.lookout.net in the hostname. The RewriteRule sends the request to /cgi-bin/httpreq.pl, a Perl CGI script that returns the HTTP GET request's Host header value and request path. These values are returned as JavaScript variables to be used in evaluating how the URL was parsed. For example, the following HTTP request:
GET /foo/bar HTTP/1.1
Host: foo.urltest.lookout.net
would return:
var pathname = "/foo/bar"
var hostname = "foo.urltest.lookout.net"
  • DNS wildcard record for the host:
For this to work, DNS must be set up with a wildcard A record so that requests to *.lookout.net all resolve to the same IP address.
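
The round trip described above can be sketched in a few lines. This is not the httpreq.pl script (which is Perl and runs on the server); it is a hedged JavaScript illustration of the same steps, and the function names parseRawRequest, renderEchoScript, and compareDomWithHttp are all made up for this example.

```javascript
// Hypothetical helper: pull the request path and Host header out of a
// raw HTTP request, as the server-side script would see them.
function parseRawRequest(raw) {
  const lines = raw.split(/\r?\n/);
  const parts = lines[0].split(" "); // e.g. ["GET", "/foo/bar", "HTTP/1.1"]
  const hostLine = lines.find((line) => /^host:/i.test(line));
  const host = hostLine
    ? hostLine.slice(hostLine.indexOf(":") + 1).trim()
    : "";
  return { method: parts[0], path: parts[1], host: host };
}

// Hypothetical helper: build the JavaScript returned to the test page.
// JSON.stringify escapes quotes and backslashes in the echoed values.
function renderEchoScript(requestPath, hostHeader) {
  return "var pathname = " + JSON.stringify(requestPath) + "\n" +
         "var hostname = " + JSON.stringify(hostHeader);
}

// Hypothetical helper: list the components where the client-side parse
// (the standard URL constructor here, standing in for the DOM) disagrees
// with what the server saw in the request.
function compareDomWithHttp(testUrl, httpSeen) {
  const dom = new URL(testUrl);
  const diffs = [];
  if (dom.pathname !== httpSeen.pathname) diffs.push("pathname");
  if (dom.hostname !== httpSeen.hostname) diffs.push("hostname");
  return diffs;
}

const seen = parseRawRequest(
  "GET /foo/bar HTTP/1.1\r\nHost: foo.urltest.lookout.net\r\n\r\n"
);
console.log(renderEchoScript(seen.path, seen.host));
console.log(compareDomWithHttp(
  "http://foo.urltest.lookout.net/foo/bar",
  { pathname: seen.path, hostname: seen.host }
));
```

An empty list from the comparison means the two views agree; any entry names a component worth investigating. Note that this sketch compares the request path directly against pathname, so a test URL with a query string would need the query split off first.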

http://www.lookout.net/test/url/test-url.html for "A URL Live Viewer" is 404. I guess you want that to be http://www.lookout.net/test/url/url-liveview.html

By the way, great stuff man :)

If only there was some tool that could watch all of the network requests without regard to the client or server technology. Some sort of "proxy" maybe? Of course, it would need to be scriptable or extensible to enable injection of the desired logic. Ideally it would be based on a technology which you/Casaba has experience with? Hrm... I bet you get where I'm going with this. :-)

Bah, that's what I get for cramming in site changes and a blog post before bedtime, thanks for the report :-)

Hmmm what on Earth could you be talking about? :-P At one point I was using Fiddler for this! And it filled the need perfectly well in my personal lab, I even found some interesting bugs. Currently I want the testing to be a server-side solution - it could be done easily with an IIS HTTP module, but I'm running Apache and thinking mod_rewrite or an Apache module would be required.