Peter Marklund's Home
Rails Recipe: HTML Validation
In this howto I'll show a simple approach to HTML validation that I use in my current Rails application. For me, HTML validation is a way to achieve wide browser compatibility, and to do a baseline check for correct rendering and UI brokenness. By putting ampersands and less-than signs in my test fixtures I can use HTML validation tests to check that I haven't forgotten any HTML quoting in my templates, i.e. that I haven't forgotten to use the following constructs:
    <%=h some_variable %>
    <%= link_to h(some_variable) ... %>
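To make the fixture trick concrete, here is a minimal sketch using only Ruby's standard ERB library, outside of Rails. The fixture value and the two templates are assumptions made up for illustration. `ERB::Util.h` is the standard library alias for `html_escape`, which is the escaping that the `h` helper performs.

```ruby
require 'erb'

# Hypothetical fixture value containing an ampersand and a less-than sign.
name = 'Tom & Jerry <b>'

# Template that forgets to escape: the raw characters end up in the markup.
unescaped = ERB.new('<p><%= name %></p>').result(binding)

# Template that escapes with h: the characters become HTML entities.
escaped = ERB.new('<p><%= ERB::Util.h(name) %></p>').result(binding)

puts unescaped # <p>Tom & Jerry <b></p>
puts escaped   # <p>Tom &amp; Jerry &lt;b&gt;</p>
```

The first result contains a bare ampersand and an unclosed b tag, which is the kind of markup a validator is expected to reject; the second is the properly quoted form.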
Two common tools for HTML validation are the W3C Validator and Tidy, and since I've found them to be complementary I've decided to use both. Tidy warns about empty tags, which the W3C validator doesn't. On the other hand, Tidy sometimes misses obvious errors such as missing paragraph end tags.
The approach that I came up with for HTML validation was to do it in an after_filter, but only when tests are run, so I added the following to my test_helper.rb:
    # HTML validate the response of all requests
    require File.join(File.dirname(__FILE__), '..', 'app', 'controllers', 'application')

    class ApplicationController
      after_filter :assert_valid_markup

      def status_code
        @response.headers['Status'][0,3].to_i
      end

      def assert_valid_markup
        return if RAILS_ENV != 'test'
        return if !(status_code == 200 &&
                    @response.headers['Content-Type'] =~ /text\/html/i &&
                    @response.body =~ /<html/i)
        assert_tidy
        # Going to the W3C validator over HTTP is a bit slow so we make this optional
        return if !ENV['HTML_VALIDATE']
        assert_w3c_validates
      end

      def assert_tidy
        tidy = RailsTidy.tidy_factory
        tidy.clean(@response.body)
        unless tidy.errors.size.zero?
          message = ("-" * 40) + $/
          i = 1
          @response.body.each do |line|
            message << sprintf("%4u %s", i, line)
            i += 1
          end
          message << ("-" * 40) + $/
          message << tidy.errors.join($/)
        end
        raise "Tidy detected html errors in response body: #{$/} #{message}" unless tidy.errors.size.zero?
        tidy.release
      end

      def assert_w3c_validates
        require 'net/http'
        print "Querying W3C XHTML validator ... "
        response = Net::HTTP.start('validator.w3.org') do |w3c|
          query = 'fragment=' + CGI.escape(@response.body) + '&output=xml'
          w3c.post2('/check', query)
        end
        raise response.body if response['x-w3c-validator-status'] != 'Valid'
        print response['x-w3c-validator-status']
      end
    end
As you can see from the code above, I only do the time-consuming HTTP request to the W3C validator if the environment variable HTML_VALIDATE is set. This way I can easily turn off W3C validation. The obvious risk is that it always stays turned off. Possible solutions include running the tests with full HTML validation nightly and installing the W3C validator locally.
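The decision logic in assert_valid_markup can be pulled out into a plain function to make the conditions easier to see. The following is only a sketch of that idea: the name `validators_for` and the `env` parameter are hypothetical and do not appear in the filter above.

```ruby
# Hypothetical helper: returns the validators that would run for a response.
# env defaults to the process environment but can be replaced by a Hash.
def validators_for(status, content_type, body, env = ENV)
  # Only successful responses that look like full HTML pages are validated.
  is_html_page = status == 200 &&
                 content_type =~ /text\/html/i &&
                 body =~ /<html/i
  return [] unless is_html_page

  validators = [:tidy]
  # The slow W3C round trip runs only when HTML_VALIDATE is set.
  validators << :w3c if env['HTML_VALIDATE']
  validators
end

page = '<html><body><p>Hello</p></body></html>'
p validators_for(200, 'text/html; charset=utf-8', page, {})
p validators_for(200, 'text/html; charset=utf-8', page, 'HTML_VALIDATE' => '1')
p validators_for(404, 'text/html', page, 'HTML_VALIDATE' => '1')
```

In practice the variable would be set when starting the test run, for example by prefixing the test command with HTML_VALIDATE=1 in the shell; the exact command depends on how the tests are launched.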
In the code above you can also see that I use the excellent assert_tidy command from the RailsTidy plugin, so installing that plugin along with the Tidy library itself is a prerequisite for the code to work.
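The failure message built in assert_tidy prefixes every line of the response body with its line number, so that Tidy's error positions can be looked up quickly. Below is a standalone sketch of that formatting step. The method name `numbered_report` is hypothetical, and the sketch uses `each_line` rather than `String#each`, since the latter was removed in Ruby 1.9.

```ruby
# Hypothetical helper: formats a response body with line numbers,
# followed by the list of validation errors.
def numbered_report(body, errors)
  rule = ('-' * 40) + "\n"
  report = rule.dup
  # Number lines starting at 1, right-aligned in a four-character column.
  body.each_line.with_index(1) do |line, number|
    report << format('%4d %s', number, line)
  end
  # Make sure the closing rule starts on its own line.
  report << "\n" unless body.empty? || body.end_with?("\n")
  report << rule
  report << errors.join("\n")
end

# Made-up body and error text, for illustration only.
puts numbered_report("<p>first\n<p>second\n", ['line 1: missing </p>'])
```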
Great tools that I use for manual HTML validation include the Web Developer Extension for Firefox, with its W3C validation capability for local HTML, and the Safari Tidy plugin, which shows any Tidy errors and warnings for every loaded page.
When selecting a DOCTYPE to validate against I was choosing between XHTML 1.0 strict and transitional and I was convinced by certain experts that strict was the way to go.
Doing HTML validation is not without its frustrations, of course. For example, I had to work around the fact that in XHTML 1.0 strict, form elements (input, select etc.) need to be inside a p, div, or fieldset tag. Also, Tidy requires tables to have the summary attribute. Those are just small annoyances though, and I haven't come across any bigger stumbling blocks yet. All in all, I'm very happy with my validation efforts, and I have a lot more confidence in the UI of my Rails application now that I'm validating its markup automatically in my controller and integration tests.