This is a post from the Get Rid of That Code Smell series.


Removing duplication from the code is a seemingly easy task. In many cases it is pretty straight-forward - you look at similar bits of code and you move them to a common method or class that is reusable in other places. Right? No, not really. It is true that code that looks similar might be an indicator that there’s a duplication but it’s not the definitive way of determining the smell.

If we look for duplication in our code we should look for duplicated concepts instead of similarly looking lines of code. Here I want to remind you the short definition of DRY principle that I’m sure most rubyists are familiar with:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

There is a fantastic presentation by David Chelimsky from RubyConf 2010 titled “Maintaining Balance While Reducing Duplication”. If you missed it you should definitely check it out. There’s a beautiful quote in this talk:

DRY does not mean “don’t type the same characters twice”

Detecting Duplication Smell

You can use both reek and flay to aid you with finding duplication in your code but remember that these tools are not smart enough to find similar concepts. Both will find potential duplication but it’s your job to decide if it makes sense to do anything about it.

Here is an example from Virtus project where reducing duplication wasn’t such a good idea:

module Virtus
  module ValueObject
    module Equalizer < Module

      # some stuff

      def define_eql_method
        module_eval <<-RUBY, __FILE__, __LINE__ + 1
          def eql?(other)
            return true if equal?(other)
            instance_of?(other.class) &&
            #{@keys.map { |key| "#{key}.eql?(other.#{@key})" }.join(' && ')}
          end
        RUBY
      end

      def define_hash_method
        module_eval <<-RUBY, __FILE__, __LINE__ + 1
          def hash
            self.class.hash ^ #{@keys.map { |key| "#{key}.hash" }.join(' ^ ')}
          end
        RUBY
      end
    end
  end
end

The Equalizer module defines a bunch of methods. Internally it has a few private define_* methods which use @keys ivar to map it to something else. Now the metric tools complained about all these calls to @keys.map. It is true, this is a duplicated code but what’s the risk that it’s going to change? Very, very small. Is this a duplicated concept in the library? Nope.

Nevertheless I went ahead and reduced this duplication by introducing a new method so the code looked like that:

module Virtus
  module ValueObject
    class Equalizer < Module

      # some stuff

      def define_eql_method
        module_eval <<-RUBY, __FILE__, __LINE__ + 1
          def eql?(other)
            return true if equal?(other)
            instance_of?(other.class) &&
            #{compile_keys(' && ') { |key| "#{key}.eql?(other.#{key})" }}
          end
        RUBY
      end

      def define_hash_method
        module_eval <<-RUBY, __FILE__, __LINE__ + 1
          def hash
            self.class.hash ^ #{compile_keys(' ^ ') { |key| "#{key}.hash" }}
          end
        RUBY
      end

      def compile_keys(separator = ' ', &block)
        keys_map = @keys.map { |key| yield(key) }
        keys_map.join(separator)
      end
    end
  end
end

Now the metric tools won’t complain. All the calls to @keys.map has been replaced with a call to the new compile_keys method.

This refactor did reduce duplication but with the price of increased complexity. I don’t think it was such a great idea - the risk that these duplicated calls would have to change was so low that I should’ve left this code alone despite the little duplication.

Removing Duplication

Here’s another example where it actually did make sense to reduce duplication to get cleaner code and better encapsulation:

module Virtus
  class Coercion
    module TimeCoercions

      def to_string(value)
        value.to_s
      end

      def to_time(value)
        if value.respond_to?(:to_time)
          value.to_time
        else
          Coercion::String.to_time(to_string(value))
        end
      end

      def to_datetime(value)
        if value.respond_to?(:to_datetime)
          value.to_datetime
        else
          Coercion::String.to_datetime(to_string(value))
        end
      end

      def to_date(value)
        if value.respond_to?(:to_date)
          value.to_date
        else
          Coercion::String.to_date(to_string(value))
        end
      end

    end
  end
end

I think it’s easy to see duplication here. The duplicated concept is that we want to call a specific method on the value only if it’s actually available. If it’s not then we use an alternative String-based coercion.

This logic was extracted into a new method called coerce_with_method and the module looked like that:

module Virtus
  class Coercion
    module TimeCoercions

      def to_string(value)
        value.to_s
      end

      def to_time(value)
        coerce_with_method(value, :to_time)
      end

      def to_datetime(value)
        coerce_with_method(value, :to_datetime)
      end

      def to_date(value)
        coerce_with_method(value, :to_date)
      end

      def coerce_with_method(value, method)
        value.respond_to?(method) ? value.public_send(method) : Coercion::String.send(method, to_string(value))
      end
    end
  end
end

In fact this turned out to be so common that later on coerce_with_method was moved to the base Virtus::Coercion::Object class with a few overridden variations in other coercion sub-classes.

Summing Up

Reducing duplication should be based on finding similar concepts, not similarly looking code. The metric tools can only point you to potential duplication but you have to determine yourself if it really makes sense to do anything about it. In some cases you can just complicate your code just for the sake of reducing duplication without any real benefits.

OK! Next episode will be about “Primitive Obsession” which is extremely common in Ruby world.