25 January 2006
Getting DRYer through Metaprogramming

Ever since I was a kid, I’ve felt the need to make things. Crayons weren’t meant to color within the lines, they were meant to draw your own lines. And not drawing the same lines each time, but new lines that defined new things; perhaps similar things, but never a copy. Songs were fun to listen to, but the ones with multiple parts and voices sounded better, and improvisation was much more appealing. Banging out a new tune on a piano was more interesting to me than playing the notes on a page. And once I learned to use a hammer and saw, I could scavenge for scrap lumber and create a very cool set of treehouses way back in the woods. It was all about making new stuff.

Computers were just tons of fun. It was a natural fit - I took to programming ravenously, writing whatever I could think of writing, learning all I could along the way. When I started writing in C and found out how to make code libraries, I started building lots of reusable code. Libraries meant I never had to write the same thing twice, which had tremendous appeal to me. Of course sometimes I still copied my existing code and tweaked it a little to get slightly different functionality, but that was all part of programming in C.

Getting DRY

As I got older and wiser, I tried to more actively resist the calling of the quick tweak. Once I went commercial and I had to maintain reams of code, the special pain of trying to make sure all the duplicate-based tweaked code was updated with a change grew to be barely tolerable. So much time was wasted. Then I finally got a little DRY religion.

If you’re Ruby-savvy, you already know that DRY means Don’t Repeat Yourself:

No piece of information, be it a database or a program’s source code, should ever be duplicated. In a DRY environment, you will only have to change things once, as opposed to a non-DRY (WET) environment, where you have to do multiple changes in synchrony. - Wikipedia

This is Very Important Stuff to make part of your thinking. It can give you back months and years of productivity over your lifetime. And as I get older and look ahead at all the code I still want to write, I know DRY is right for me.

A Little Ruby Thrown In

Part of being productive when making new things is about making old things better. Programming is a wonderful environment for such things, since it is so easy to change code. You try something, start using it, think of ways it could be better, and repeat. And if you’re a good critic of your work, things really do get better.

In an earlier paper, I touched on a simple extension to Ruby that I found for doing enumerations: a nice tight piece of code that did some very simple metaprogramming. It took a list of names, pulled each name into an assignment and evaluated it, adding a constant to the Ruby class:

# enum.rb (original)

class Object
  def self.enum(*args)    
    args.flatten.each_with_index do |const,i|
      class_eval %(#{const} = #{i})
    end
  end

  def self.bitwise_enum(*args)    
    args.flatten.each_with_index do |const,i|
      class_eval %(#{const} = #{2**i})
    end
  end
end

As part of Object, I could now do an enum in any class and have a set of unique constants to use.
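Here’s a minimal sketch of what that looks like in use. The extension is repeated so the snippet stands alone; the class names TrafficLight and FilePermission are made up for illustration.

```ruby
# The original extension, repeated so this snippet runs by itself.
class Object
  def self.enum(*args)
    args.flatten.each_with_index do |const, i|
      class_eval %(#{const} = #{i})
    end
  end

  def self.bitwise_enum(*args)
    args.flatten.each_with_index do |const, i|
      class_eval %(#{const} = #{2**i})
    end
  end
end

# Hypothetical classes, used only to show the call site.
class TrafficLight
  enum %w(RED AMBER GREEN)
end

class FilePermission
  bitwise_enum %w(EXECUTE WRITE READ)
end

puts TrafficLight::GREEN                            # 2
puts FilePermission::READ                           # 4
puts FilePermission::READ | FilePermission::WRITE   # 6
```

The bitwise flavor is handy when the constants are flags that get OR-ed together, as in the last line.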

For those curious about the concept of metaprogramming: it’s code that writes code. If you’re using statically-compiled languages, you may not have come across the term much. But in a sense you do it all the time in all but the simplest programs, just not as explicitly as in a dynamic language such as Ruby. When the arguments you enter into a program affect how the program chooses between different processing paths, you’re effectively using the most rudimentary form of metaprogramming. True, you’re not extending the language, but it takes a lot more mechanism to do that. In Java, for instance, if you’re really intrepid you can create a program that generates code that is then executed by an interpreter built into the program. It’s not easy, and can be a bear to get right, but it can be done. In Ruby, however, it seems to almost be an expectation.
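To make “code that writes code” a little more concrete, here is a small sketch unrelated to enums. The Temperature class and its unit names are invented for the example. The loop assembles the source text of three methods and hands each one to class_eval.

```ruby
# Hypothetical class: the three predicate methods are never typed out by hand.
class Temperature
  def initialize(unit)
    @unit = unit
  end

  # Build the source text of one predicate method per unit, then evaluate it.
  %w(celsius fahrenheit kelvin).each do |unit|
    class_eval <<-EOF
      def #{unit}?
        @unit == "#{unit}"
      end
    EOF
  end
end

puts Temperature.new("kelvin").kelvin?    # true
puts Temperature.new("kelvin").celsius?   # false
```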

Anyway, back to enums. A bit later I found myself in situations where I needed to grow the enumerated set or leave gaps, and add new constants within those gaps. I needed to pick up from an existing enumeration and keep going. A simple extension did the job:

# enum.rb (accommodates holes and direct placements)

class Object
  def self.enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{i+offset-bias})
      end
    end
  end

  def self.bitwise_enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{2**(i+offset-bias)})
      end
    end
  end
end

Now I could embed numbers in the incoming strings and adjust the assignments by adding a hole relative to the current numbering (using +number) or absolutely (using a number or =number). The absolute adjustment could also be used later to fill in previously unfilled holes. The code got a bit bigger and the two methods were mostly similar, but this certainly was not too big a programming faux pas, pragmatically speaking. It was just slightly DAMP. But it did bother me a bit.
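To see how the numbering plays out, this sketch applies the same arithmetic as the listing above but collects the results in a Hash instead of defining constants. The helper name enum_values is hypothetical and exists only for this trace.

```ruby
# Hypothetical helper: same offset/bias arithmetic as enum, but it returns
# a Hash of name => value so the numbering can be inspected directly.
def enum_values(*args)
  offset = 0
  bias = 0
  values = {}
  args.flatten.each_with_index do |const, i|
    case const
    when /^\+(\d+)$/        # "+2": leave a hole relative to the current numbering
      offset = $1.to_i
      bias += 1
    when /^(\=)?(\d+)$/     # "=7" or "3": jump to an absolute position
      offset = $2.to_i - i
    else                    # anything else is a constant name
      values[const] = i + offset - bias
    end
  end
  values
end

result = enum_values(%w(A1 A2 +2 A3 A4 =7 A5 A6 3 A7))
result.each { |name, value| puts "#{name} = #{value}" }
# A1 = 0, A2 = 1, A3 = 4, A4 = 5, A5 = 7, A6 = 8, A7 = 3 (one per line)
```

The +2 skips values 2 and 3, the =7 restarts the numbering at 7, and the trailing 3 goes back to fill one of the skipped slots.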

After using all this for a while, I noticed my code becoming littered with a lot of cruft: the constants had to be prefixed with the name of the class that did the enum whenever I used them in other classes. Of course, the Ruby way to get rid of this is to create a Module that holds the constants and mix it into the classes that use them. Unfortunately, Modules needed instance-level definitions, so the enum code grew.
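The prefix problem and the mix-in cure can be shown without any enum machinery at all. In this sketch the constants are assigned by hand, and the names Severity, Report and Alert are invented for the example.

```ruby
# Hypothetical module holding the shared constants.
module Severity
  LOW, MEDIUM, HIGH = 0, 1, 2
end

# Without the mix-in, every use needs the Severity:: prefix.
class Report
  def urgent?(level)
    level >= Severity::HIGH
  end
end

# With the mix-in, the constants can be used bare.
class Alert
  include Severity

  def urgent?(level)
    level >= HIGH
  end
end

puts Report.new.urgent?(Severity::MEDIUM)   # false
puts Alert.new.urgent?(Alert::HIGH)         # true
```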

# enum.rb (handles Modules)

class Object
  def self.enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{i+offset-bias})
      end
    end
  end

  def self.bitwise_enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{2**(i+offset-bias)})
      end
    end
  end

end

class Module

  def enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{i+offset-bias})
      end
    end
  end

  def bitwise_enum(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\+(\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\=)?(\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(#{const} = #{2**(i+offset-bias)})
      end
    end
  end
end

Yikes! Now things were getting out of hand. Virtually the same code repeated four times. The code wasn’t just DAMP now. It was WET. I had to do something.

More Metaprogramming to the Rescue

What I had were two orthogonal decisions each with two alternatives:

level      sequential enum   bitwise enum
class      variant 1         variant 2
instance   variant 3         variant 4

(The variants are numbered in the order they appear in the last listing of the previous section.)

If I could figure out a way to declare the framework code just once and call it four times, filling in the correct arguments, this would solve the problem. The first step is factoring out the differences between each version.

There are three points that differ between the different variants:

1. One alternative is for sequential enumeration, the other for bitwise.
2. One alternative is for Object, the other for Module.
3. The Object version defines class-level methods (def self.enum), while the Module version defines instance-level methods (def enum).

The placeholders for these values in the corresponding code become:

code = <<EOF
class object

  def method(*args)
    offset = 0
    bias = 0
    args.flatten.each_with_index do |const,i|
      case const
      when /^\\+(\\d+)$/
        offset = $1.to_i
        bias += 1
      when /^(\\=)?(\\d+)$/
        offset = $2.to_i - i
      else
        class_eval %(enum_type)
      end
    end
  end

end
EOF

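where object is the class to extend (Object or Module), method is the name of the method to define (with or without the self. prefix), and enum_type is the assignment expression (sequential or bitwise).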

With the appropriate values inserted into these places, and then each being evaluated in turn, we could get the four variants! Well, almost. Ruby apparently has an embedded distinction between evaluating class definitions and method definitions. Sadly, you can’t evaluate class definitions even if you’re using them to extend the class, in the same way you would when you are loading or requiring a file. (Personally I don’t see any technical reason this couldn’t be done, as it’d be possible to write a temporary file and load it at this point, but I’m not yet aware of a method that would load a string instead of a file.)

So instead, we have to remove the wrapping class definition and evaluate the method definition in its context:

code = <<EOF
def method(*args)
  offset = 0
  bias = 0
  args.flatten.each_with_index do |const,i|
    case const
    when /^\\+(\\d+)$/
      offset = $1.to_i
      bias += 1
    when /^(\\=)?(\\d+)$/
      offset = $2.to_i - i
    else
      class_eval %(enum_type)
    end
  end
end
EOF

object.class_eval(code)
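Stripped of the enum details, the mechanism is just a method definition held in a string and evaluated in the context of a class. This is a minimal sketch; the Widget class and its two methods are invented for the example.

```ruby
# Hypothetical target class with no methods of its own.
class Widget; end

# Single-quoted heredocs keep the text literal until it is evaluated.
instance_code = <<'EOF'
def describe
  "I am a #{self.class.name}"
end
EOF

class_code = <<'EOF'
def self.create
  new
end
EOF

# Evaluated in Widget's context: a plain def becomes an instance method,
# and def self.create becomes a class method on Widget.
Widget.class_eval(instance_code)
Widget.class_eval(class_code)

puts Widget.create.describe   # I am a Widget
```

That difference between a plain def and def self. is what lets one template serve both the class-level and the instance-level variants.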

By embedding this in an Enum class method that generates and evaluates the metacode, we get:

# enum.rb (with metaprogramming)

class Enum
  def self.build(arguments = {})
    code = <<EOF
def #{arguments[:method]}(*args)
  offset = 0
  bias = 0
  args.flatten.each_with_index do |const,i|
    case const
    when /^\\+(\\d+)$/
      offset = $1.to_i
      bias += 1
    when /^(\\=)?(\\d+)$/
      offset = $2.to_i - i
    else
      class_eval %(#{arguments[:enum_type]})
    end
  end
end
EOF
    arguments[:object].class_eval(code)
  end
end

successor = '#{const} = #{i+offset-bias}'
bitwise   = '#{const} = #{2**(i+offset-bias)}'

Enum.build({ :object => Object,
             :method => "self.enum",
             :enum_type => successor })

Enum.build({ :object => Object,
             :method => "self.bitwise_enum",
             :enum_type => bitwise })

Enum.build({ :object => Module,
             :method => "enum",
             :enum_type => successor })

Enum.build({ :object => Module,
             :method => "bitwise_enum",
             :enum_type => bitwise })
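The single quotes around the successor and bitwise strings matter, because interpolation happens at two different times. The sketch below separates the two steps; the method names generated_line and expand are hypothetical helpers written only for this illustration.

```ruby
# Single quotes: the #{...} markers stay in the string as plain text.
SUCCESSOR = '#{const} = #{i+offset-bias}'

# Step 1 runs while the method source is being assembled. Only enum_type
# is interpolated here; the markers inside it are copied through untouched.
def generated_line(enum_type)
  "class_eval %(#{enum_type})"
end

# Step 2 runs later, inside the generated method, when const, i, offset
# and bias finally have values. eval stands in for that second pass.
def expand(enum_type, const, i, offset, bias)
  eval("%(#{enum_type})")
end

puts generated_line(SUCCESSOR)          # class_eval %(#{const} = #{i+offset-bias})
puts expand(SUCCESSOR, "A3", 3, 2, 1)   # A3 = 4
```

Had the strings been double-quoted, Ruby would have tried to interpolate const and i right at the assignment, before either existed.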

This facility now provides all the capabilities of the previous code in a much more DRY way. Notice that we are doing two-level metaprogramming: the first meta is building the method, and the second meta inside that defines the constants. The test code for the mechanism using both the class and instance level facilities also continued to work like a champ:

# test code for enum.rb (with metaprogramming)

require 'test/unit'
require 'enum'

module Module_test
  enum %w(MA1 MA2 +2 MA3 MA4 =7 MA5 MA6 3 MA7)
  bitwise_enum %w(MB1 MB2 +2 MB3 MB4 =7 MB5 MB6 3 MB7)
end

class TC_enum_module < Test::Unit::TestCase

  include Module_test

  def test_enum
    assert([ MA1, MA2, MA3, MA4, MA5, MA6, MA7 ] ==
             [ 0, 1, 4, 5, 7, 8, 3 ],
           "enum (instance) value assignments")
  end
end

class TC_bitwise_enum_module < Test::Unit::TestCase

  include Module_test

  def test_bitwise_enum
    assert([ MB1, MB2, MB3, MB4, MB5, MB6, MB7 ] ==
             [ 1, 2, 16, 32, 128, 256, 8 ],
           "bitwise_enum (instance) value assignment")
  end
end

class TC_enum < Test::Unit::TestCase

  enum %w(A1 A2 +2 A3 A4 =7 A5 A6 3 A7)

  def test_enum
    assert([ A1, A2, A3, A4, A5, A6, A7 ] ==
             [ 0, 1, 4, 5, 7, 8, 3 ],
           "enum (class) value assignments")
  end
end

class TC_bitwise_enum < Test::Unit::TestCase

  bitwise_enum %w(B1 B2 +2 B3 B4 =7 B5 B6 3 B7)

  def test_bitwise_enum
    assert([ B1, B2, B3, B4, B5, B6, B7 ] ==
             [ 1, 2, 16, 32, 128, 256, 8 ],
           "bitwise_enum (class) value assignment")
  end
end

class TC_enum_use < Test::Unit::TestCase
  def test_enum
    assert([ TC_enum::A1, TC_enum::A2, TC_enum::A3, TC_enum::A4, TC_enum::A5,
             TC_enum::A6, TC_enum::A7 ] == [ 0, 1, 4, 5, 7, 8, 3 ],
           "enum value assignments")
  end
end

class TC_bitwise_enum_use < Test::Unit::TestCase
  def test_bitwise_enum
    assert([ TC_bitwise_enum::B1, TC_bitwise_enum::B2, TC_bitwise_enum::B3,
             TC_bitwise_enum::B4, TC_bitwise_enum::B5, TC_bitwise_enum::B6,
             TC_bitwise_enum::B7 ] == [ 1, 2, 16, 32, 128, 256, 8 ],
           "bitwise_enum value assignment")
  end
end

Could the code be DRYer? Sure, we could pass fewer arguments to the build method, doing a little more decision making and additional assignments, but I do like the explicitness of this result. It does the job nicely, is contained in a single module, and is easy to follow. For me, that makes for a big win.
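For comparison, here is one way the four calls might be collapsed, as a sketch rather than a recommendation. The TYPES table and the build_all method are hypothetical additions; build itself is the same as above, apart from using an indented heredoc.

```ruby
class Enum
  def self.build(arguments = {})
    code = <<-EOF
      def #{arguments[:method]}(*args)
        offset = 0
        bias = 0
        args.flatten.each_with_index do |const,i|
          case const
          when /^\\+(\\d+)$/
            offset = $1.to_i
            bias += 1
          when /^(\\=)?(\\d+)$/
            offset = $2.to_i - i
          else
            class_eval %(#{arguments[:enum_type]})
          end
        end
      end
    EOF
    arguments[:object].class_eval(code)
  end

  # Hypothetical table: method name => assignment expression.
  TYPES = {
    "enum"         => '#{const} = #{i+offset-bias}',
    "bitwise_enum" => '#{const} = #{2**(i+offset-bias)}'
  }

  # Hypothetical helper: derives all four variants from the table.
  def self.build_all
    TYPES.each do |name, enum_type|
      build(:object => Object, :method => "self.#{name}", :enum_type => enum_type)
      build(:object => Module, :method => name, :enum_type => enum_type)
    end
  end
end

Enum.build_all

module DemoFlags
  enum %w(A1 A2 +2 A3 A4 =7 A5 A6 3 A7)
  bitwise_enum %w(B1 B2 +2 B3 B4 =7 B5 B6 3 B7)
end

puts DemoFlags::A3   # 4
puts DemoFlags::B3   # 16
```

It saves a dozen lines, at the cost of hiding which four methods actually get defined.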

Being able to easily add code that is created at runtime to a running system is a real dynamic language advantage. Here I was able to increase the DRY factor significantly. There is less code to maintain and less likelihood of errors, and of course a few nice test cases that we can use to make sure we don’t botch any changes later. All in all, a much better, DRYer situation through metaprogramming.