Peter Marklund's Home
Rails QA: Watch Out for Duplication in your Fixtures
I had a long debugging session caused by duplicated record keys in one of my fixture files. It turns out the YAML parser will not complain if you have duplicated record keys (the keys on the top level) or duplicated column keys. Even though the YAML specification says that map keys should be unique, the parser will happily overwrite any existing value for a key if it encounters that key again.
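To illustrate the silent overwrite described above, here is a minimal sketch using Ruby's bundled YAML library with default options. The fixture content (the `quentin` record and its `login` column) is made up for illustration and does not come from my actual fixture files.

```ruby
require 'yaml'

# Hypothetical fixture content: the record key "quentin" appears twice,
# and the second record also repeats the "login" column.
fixture = <<~FIXTURE
  quentin:
    id: 1
    login: quentin
  quentin:
    id: 2
    login: aaron
    login: bob
FIXTURE

# No exception is raised here; later keys replace earlier ones.
records = YAML.load(fixture)

puts records.keys.inspect           # only one record key is left
puts records["quentin"]["id"]       # value from the second record
puts records["quentin"]["login"]    # value from the last "login" line
```

Only one record survives the load, and it carries the values that appeared last in the file, which is exactly the kind of failure that is hard to spot from a failing test alone.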
Under the motto of crashing early and avoiding silent failures I came up with the following unit test:
    # Check for duplication in Fixture files, i.e. that:
    #
    # - keys of records are unique
    # - column names are unique
    # - id values are unique
    #
    # The idea is that this test can complement the syntax checking of the YAML
    # parser and help avoid debugging nightmares due to duplicated records or columns.
    # What the YAML parser will do when a record or column key is duplicated is it
    # will just use the last one and let that overwrite the earlier ones.
    # Before writing this script I tried using the Kwalify YAML validator but I
    # couldn't quite coerce it into doing what I wanted.

    # ERB is needed to render fixture files that contain embedded Ruby
    require 'erb'

    class FixtureTest < Test::Unit::TestCase
      def test_fixtures
        fixture_file_paths.each do |file_path|
          initialize_variables(file_path)
          # each_line rather than each: String#each was removed in Ruby 1.9
          fixture_contents(file_path).each_line do |line|
            next if skip_line?(line)
            if is_record?(line)
              assert_record_not_dupe(line)
            elsif is_column?(line)
              assert_column_not_dupe(line)
              assert_id_not_dupe(line) if is_id?(line)
            end # End if statement
          end # End line loop
        end # End fixture file loop
      end

      private

      # Reset the per-file state before scanning a new fixture file
      def initialize_variables(file_path)
        @file_path = file_path
        @record_keys = []
        @column_keys = []
        @column_indent = nil
        @ids = []
      end

      def fixture_file_paths
        Dir["#{Test::Unit::TestCase.fixture_path}/**/*.yml"]
      end

      def fixture_contents(file_path)
        ERB.new(IO.read(file_path)).result
      end

      # Skip YAML directive, comments, and whitespace lines
      def skip_line?(line)
        line =~ /^\s*$/ || line =~ /^\s*#/ || line =~ /^---/
      end

      # A record has no indentation (leading white space)
      def is_record?(line)
        line =~ /^\S/
      end

      # A column line has some indentation (should be same as the first column)
      # and a colon
      def is_column?(line)
        return false if line !~ /^\s+[a-zA-Z_]+:/

        # Looks like a column - check the indentation level as well
        indent_level = line[/^(\s+)/].length
        if @column_indent
          # If the indentation level is different from the first column this may
          # not be a column, it could be nested data of some sort
          return @column_indent == indent_level
        else
          # This is the first column so remember its indentation level
          @column_indent = indent_level
          return true
        end
      end

      def is_id?(line)
        column_key(line) == "id"
      end

      def column_key(line)
        line[/^\s+([^:]+):/, 1]
      end

      def assert_record_not_dupe(line)
        record_key = line[/^(?:- )?([^:]+):/, 1]
        assert !@record_keys.include?(record_key),
          "Record key #{record_key} in fixture file #{@file_path} " +
          "is duplicated on this line: #{line.chomp}"
        @record_keys << record_key
        # A new record starts, so forget the columns of the previous one
        @column_keys = []
        @column_indent = nil
      end

      def assert_column_not_dupe(line)
        assert !@column_keys.include?(column_key(line)),
          "Column #{column_key(line)} for record #{@record_keys.last} is duplicated " +
          "in file #{@file_path} on this line: #{line.chomp}"
        @column_keys << column_key(line)
      end

      def assert_id_not_dupe(line)
        id_value = line[/^\s+id:\s*(\S+)/, 1]
        assert !@ids.include?(id_value),
          "Value for id column duplicated in file #{@file_path} on this " +
          "line: #{line.chomp}"
        @ids << id_value
      end
    end
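The test above scans the fixture text line by line. A different approach, which is not what the test does and is only a rough sketch, is to walk the parse tree that Ruby's Psych library exposes through `Psych.parse` and compare the keys inside each mapping. The helper name `duplicate_keys` and the sample YAML are hypothetical. The sketch only looks at plain scalar keys, does not check for duplicated id values across records, and assumes any ERB in the fixture has already been rendered.

```ruby
require 'yaml'

# Collect every scalar key that occurs more than once within a single mapping.
# Walks the whole Psych node tree recursively.
def duplicate_keys(node, found = [])
  if node.is_a?(Psych::Nodes::Mapping)
    # The children of a mapping node alternate: key, value, key, value, ...
    keys = node.children.each_slice(2).map { |key, _value| key }
    names = keys.select { |k| k.is_a?(Psych::Nodes::Scalar) }.map(&:value)
    found.concat(names.select { |name| names.count(name) > 1 }.uniq)
  end
  # Leaf nodes such as scalars may have no children collection at all
  (node.children || []).each { |child| duplicate_keys(child, found) }
  found
end

# Hypothetical fixture with a duplicated record key and a duplicated column
yaml = <<~FIXTURE
  quentin:
    id: 1
  quentin:
    id: 2
    login: aaron
    login: bob
FIXTURE

dupes = duplicate_keys(Psych.parse(yaml))
puts dupes.inspect
```

The trade-off is that the tree walk handles nesting and quoting the way the parser does, while the line-based test can report the exact offending line from the file.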