How to Find Ruby Code Smells with Reek

Development

Reading Time: 9 minutes

Your Ruby code smells. And it’s okay — mine does, too! Maybe it has some long methods with too many parameters, a few classes that try to do too much, an uncommunicative name here or there. No codebase is perfect, but it’s worthwhile to be aware of its deficiencies and refactorings that could improve the state of things — so let’s look at a few example code smells, potential ways to fix them and a tool that helps find them in our applications.

Duplicate Method

One of the easiest to notice (and simplest to fix) code smells is duplicate method: a situation when a given method call is made two or more times in the same context.

Let’s assume we have a class that represents an office which has a method assessing whether the current weather helps concentrate on programming tasks. To assess the weather, the Office class collaborates with a weather source and asks it for current weather at the office’s city:

class Office 
  def weather_assessment 
    if @weather_source.weather_at(@city).skies == :clear 
      'Too sunny' 
    elsif @weather_source.weather_at(@city).temp > 20 
      'Too hot' 
    else 
      'Perfect for coding' 
    end 
  end 
end

In the above implementation, the Office#weather_assessment method makes two separate WeatherSource#weather_at calls (probably resulting in firing synchronous requests over the network), each time passing the same argument. While there might be cases where this is warranted — the method’s results are volatile or explicitly calling it twice significantly improves readability — in the vast majority of cases making a single call and storing its result locally is a better solution:

class Office 
  def weather_assessment 
    weather = @weather_source.weather_at(@city) 
    if weather.skies == :clear 
      'Too sunny' 
    elsif weather.temp > 20 
      'Too hot' 
    else 
      'Perfect for coding' 
    end 
  end 
end

Data Clump

Let’s create a few more methods for our Office class — this time ones that use a navigation source collaborator to calculate distance, directions, and whether one needs a passport to get to the office’s geographical coordinates given a starting latitude and longitude:

class Office 
  def directions(source_lat, source_lon) 
    @nav_source.directions(from: [source_lat, source_lon], to: [@lat, @lon]) 
  end

  def distance(source_lat, source_lon) 
    @nav_source.distance(from: [source_lat, source_lon], to: [@lat, @lon]) 
  end

  def needs_passport?(source_lat, source_lon) 
    @nav_source.borders?(from: [source_lat, source_lon], to: [@lat, @lon]) 
  end 
end

Notice how all of these methods take the same pair of parameters? This smell is called data clump, and — except for times when the arrangement is accidental — it usually means there is a higher-level object missing from the implementation. In this case such an object could be a Location struct representing the given pair of latitude/longitude coordinates:

Location = Struct.new(:lat, :lon)

class Office 
  def directions(source_location) 
    @nav_source.directions(from: source_location, to: @location) 
  end

  def distance(source_location) 
    @nav_source.distance(from: source_location, to: @location) 
  end

  def needs_passport?(source_location) 
    @nav_source.borders?(from: source_location, to: @location) 
  end 
end

Repeated Conditional

Let’s assume that (in our remote-friendly organization) we don’t exactly know the location of some of our offices. In this case we could be tempted to guard the NavSource calls with a conditional check:

class Office 
  def directions(source_location) 
    if @location 
      @nav_source.directions(from: source_location, to: @location)   
    end 
  end

  def distance(source_location) 
    if @location 
      @nav_source.distance(from: source_location, to: @location) 
    end 
  end

  def needs_passport?(source_location) 
    if @location 
      @nav_source.borders?(from: source_location, to: @location) 
    end 
  end 
end

This smell is called repeated conditional – doing the same check over and over again suggests that there might be a better solution. Unless such a set of checks has exceptional readability, refactoring usually yields a better solution –but whether the refactoring should consist of introducing a separate LocatedOffice class or teaching the NavSource about handling NullLocations usually highly depends on the codebase in question.

Boolean/Control Parameter

Let’s now add a predicate that says whether the developers at the given Office program in Ruby — and let’s make it parameterizable so the caller can choose whether this should be a strict question (‘only Ruby developers’) or a loose one (‘Ruby developers among others’).

class Office 
  def ruby_developers?(strict = true) 
    languages = @employees.map(&:primary_language).uniq 
    if strict 
      languages == ['Ruby'] 
    else 
      languages.include?('Ruby') 
    end 
  end 
end

This kind of code is called the boolean parameter smell (a case of the more general control parameter smell); the caller knows exactly which code path the method will take and controls its execution from the outside. The typical refactorings are to either split the parameterized predicate into dedicated methods or to split the class into more classes, each of which would implement one of the code paths.

In this case it seems separate methods work quite well:

class Office 
  def only_ruby_developers? 
    languages == ['Ruby'] 
  end

  def ruby_developers? 
    languages.include?('Ruby') 
  end

  private

  def languages 
    @employees.map(&:primary_language).uniq 
  end 
end

Feature Envy

A code smell that’s a bit more problematic to refactor, but happens quite often, is feature envy.

Let’s add an Office#good_fit? method that assesses whether a given office is a good fit for a given employee (passed in an argument); we consider the office a good fit when the employee is sociable or likes the city:

class Office 
  def good_fit?(employee) 
    employee.sociable? || employee.likes?(@city) 
  end 
end

Notice how this method is much more interesting in interrogating its parameter; if it weren’t for the reference to the @city instance variable it would be a utility function — a method that operates solely on its arguments and can be moved anywhere in the system without any change to its body.

Refactoring feature envy smells is more involved, but the general notion is the same: There is a chance that this method would work better if it was implemented in the context of its parameter:

class Office 
  def good_fit?(employee) 
    employee.would_like_to_work?(@city) 
  end 
end

class Employee 
  def would_like_to_work?(city) 
    sociable? || likes?(city) 
  end 
end

Finding Smells with Reek

Now that we know a few common code smells, how can we find them in our own Ruby codebases? This is where Reek comes in — Reek is a code smell detector that can analyze Ruby files and pinpoint the smells, and you can think about it as a RuboCop for your architecture and code quality.

Reek’s usage is quite simple:

$ gem install reek 
$ reek office.rb 
office.rb -- 8 warnings: 
  [1]:Office has no descriptive comment (IrresponsibleModule) 
  [12, 18, 24]:Office takes parameters [source_lat, source_lon] to 3 methods (DataClump) 
  [13, 19, 25]:Office tests @lat && @lon at least 3 times (RepeatedConditional) 
  [40, 40]:Office#good_fit? refers to employee more than self (maybe move it to another class?) (FeatureEnvy) 
  [30]:Office#ruby_developers? has boolean parameter 'strict' (BooleanParameter) 
  [32]:Office#ruby_developers? is controlled by argument strict (ControlParameter) 
  [33, 35]:Office#ruby_developers? refers to languages more than self (maybe move it to another class?) (FeatureEnvy) 
  [3, 5]:Office#weather_assessment calls @weather_source.weather_at(@city) 2 times (DuplicateMethodCall)

Notice how Reek reports all of the code smells discussed above, as well as suggests that the original implementation of Office#ruby_developers? also smells of feature envy; this suggests that maybe the languages collection is the object that should implement the predicate. Reek also by default flags any classes and modules that lack code comments as irresponsible.

Reek can also be used on whole directories and comes with a configurable Rake task and a set of RSpec matchers, so it’s easy to integrate it into your tests or make your CI server flag any newly introduced code smells.

Sign up for a free Codeship Account

Configuring Reek

Reek’s opinions about what constitutes a code smell are by necessity very subjective; while it’s relatively easy to make blanket decisions and enforce a certain syntax style, opinions on whether a given piece of code is a smell and whether it should be refactored usually vary much more and often have to be contrasted with the readability and blunt obviousness of the original, ‘smelly’ versions.

Fortunately, this can be addressed via Reek’s high configurability — both on the project-wide level and when it comes to particular code pieces.

A project-wide (and, for a bit more fine-grained cases, directory-wide) configuration can be added by creating a YAML file ending with .reek. For example, the default configuration of the data clump smell is to aggressively flag even the smallest clumps as soon as they’re repeated:

DataClump: 
  max_copies: 2 
  min_clump_size: 2

Conversely, a method is considered to have a long parameter list when it has four or more of them — except for constructors, which can take up to five without being flagged:

LongParameterList: 
  max_params: 3 
  overrides: 
    initialize: 
      max_params: 5

Likewise, a method is considered to have too many statements when it has more than five of them — except for constructors, which are a bit harder to split in practice:

TooManyStatements: 
  max_statements: 5 
  exclude: 
  - initialize

The exclude setting can be tuned as needed — from just a single case of AClass#some_method to a regular expression matching a set of methods, like /meth/.

At the other end of the configuration applicability any method can be decorated with a set of magic comments, altering Reek’s flagging on a per-method basis — e.g., when we’re okay with a method call being made twice in a particular case:

# :reek:DuplicateMethodCall: { max_calls: 2 } 
def weather_assessment 
  if @weather_source.weather_at(@city).skies == :clear 
    'Too sunny' 
  elsif @weather_source.weather_at(@city).temp > 20 
    'Too hot' 
  else 
    'Perfect for coding' 
  end 
end

Such a comment will both have a higher chance to be moved along with the method when it’s refactored and keeps guarding the implementation from adding a third call (at which point the method gets smelly again).

Whither Now?

Now that you know how to find — and, potentially, refactor — smells in your code it’s super important to remember that smells do not necessarily mean that the code is bad; to quote the venerable Content Creation Wiki:

Note that a code smell is a hint that something might be wrong, not a certainty. A perfectly good idiom may be considered a code smell because it’s often misused, or because there’s a simpler alternative that works in most cases. Calling something a code smell is not an attack; it’s simply a sign that a closer look is warranted.

While it might be tempting to go all out and try to fix each and every code smell until Reek no longer points anything out, this is not the point. To quote the Wiki again:

Rules are for the guidance of the wise and the obedience of fools. There are two major approaches to programming:

  • Pragmatic: code smells should be considered on a case by case basis
  • Purist: all code smells should be avoided, no exceptions

…and therefore a code smell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist.

I highly recommend investigating your code smells and refactoring the ones that are indeed problematic — but also learning how to accept the ones that are not; both experiences will let you level up as a developer. You can read about more code smells (and potential refactorings) in Reek’s docs.

(If you enjoyed this post, check out my Ruby Smells talk from this year’s RubyConf Portugal, which shows more smells and refactorings and elaborates on Reek.)

Posts you may also find interesting:

Subscribe via Email

Over 60,000 people from companies like Netflix, Apple, Spotify and O'Reilly are reading our articles.
Subscribe to receive a weekly newsletter with articles around Continuous Integration, Docker, and software development best practices.



We promise that we won't spam you. You can unsubscribe any time.

Join the Discussion

Leave us some comments on what you think about this topic or if you like to add something.

  • Pingback: 1 – How to find ruby code smells - Exploding Ads()

  • Piotr Bliszczyk

    I found best usage of Reek and RuboCop by integrating them with editor. In my case atom, which, has great linter package that check your code on fly and gives you immediate warnings. I feel this is much better than getting warning from CI.

    • Greg Navis

      I also prefer a shorter feedback cycle. I want to keep me Vim config simple so what I do instead is a git pre-commit hook that runs rubocop, reek, etc.

  • Pingback: 6 «вредных» советов разработчику - itfm.pro()

  • Pito Salas

    Why do you like Reek over Rubocop? They seem so similar. I use Rubocop already. I can’t tell if I need to introduce Reek as well into my toolchain.

    • Hey, apologies for getting back to you so late and thanks so much for your patience!

      I think RuboCop and Reek are quite different – Reek, rather than concentrating on the syntax, tries to point out potential architectural problems in a codebase on an OOP level. Reek’s results are much less rules and much more suggestions where to look for potential issues, and can’t really be auto-corrected (but rather require – often quite extensive – refactoring).

      There are relatively few areas covered by both RuboCop and Reek; RuboCop is more of a syntax style enforcer, while Reek is an architectural tool. There are some common areas of interest –e.g., RuboCop tries to eradicate long methods because they’re a readability issue, while Reek points them out as potential code smells that mix the level of abstraction – but in general the best approach is to figure out what are acceptable thresholds for a given project (e.g., maximum number of lines in a method or maximum number of instance variables) and adjust them as necessary.

      In general, RuboCop is much more opinionated because it usually deals with much cleaner choices – ‘this kind of syntax should be indented this many spaces’ or ‘single quotes where possible vs double quotes everywhere’ are relatively easily made choices that should indeed be uniformly enforced across the whole codebase if possible. Reek seldom deals in absolutes; there are often cases where a given piece of logic *technically* should live closer to class A, but *in practice* is much more useful when put in class B.

      My general rule of thumb for RuboCop is ‘try to please it as much as possible and only reconfigure it on the most blatant disagreements’, while my rule of thumb for Reek is ‘this are fascinating suggestions, let me look at the smells one-at-a-time and I’ll probably whitelist quite a few of them – and that’s perfectly OK’.