19 February, 2017

Methods Can Be Longer Than Five Lines

??? words · ??? min read

So you’ve decided to apply Sandi Metz’s Rules For Developers, including:

Rule #2: Methods can be no longer than five lines of code.

This rule sounds simple in theory, but you will discover difficulties when applying it strictly.

Let’s look at the pros and cons of short methods, and when it’s OK to break the rule.

The Good

Short methods are usually easier to read and understand.

Small, well-named methods result in self-documenting code. Method names are a kind of documentation – they describe what a chunk of code does. Since every method is small, you will need to write more methods, which means more names.
Small methods have lower cyclomatic complexity – fewer loops, branches, early returns, etc.
Small methods have fewer local variables, and it’s easier to see where all the variables are used.

I have some – let’s call them “interesting” – stories about that particular codebase.

Before we get into the bad stuff, I want to make it clear that short methods are a good thing, in general. I have worked with functions that are many hundreds of lines long, with dozens of mutable local variables, and return statements everywhere. Simple questions like “what’s in this local variable?” and “will this line of code run?” were very difficult to answer. Let me assure you that long functions can be a special kind of nightmare.

The Bad

As always, there are trade-offs.

Short methods require more code – more methods, more classes, more lines of code in total. Code contains bugs, takes time to read/understand, takes time to execute, and takes time to write – all things we want to minimise. As a general rule, when two different implementations achieve the same result, the one with less code is superior. No code is better than no code.
Extracting code into separate methods creates indirection. Functionality is spread across multiple methods, possibly in different classes. Instead of reading just one method, you may need to read four methods, and understand how they all interact with each other.

Sometimes, even though each individual method becomes simpler, the overall complexity increases. Small methods do not destroy complexity, they push it up into the class or the system.

It’s Just A Guideline

The number five is arbitrary. Well-written methods do not magically turn into spaghetti if you add a sixth line.

The “five lines” rule is intended as a guideline or a suggestion, not a law. It’s a good default, and it’s catchy, but it’s also an oversimplification.

On Bike Shed LIVE! at RailsConf 2016, Sandi said this:

Cargo cult programming is programming using rituals that serve no real purpose. It usually happens when one programmer copies the behaviour of another programmer, without understanding the reasoning.

I was helping a company in San Fransisco, doing a little consoluting gig. They had the classic “many things too big” problem – “many things too connected” problem. They had a lot of 5,000 line ActiveRecord models. And so, a lot of controllers that knew everything, a lot of views that needed 20 or 30 instance variables. […] So I made those rules up as an antidote for a very specific, bad situation.

And then I went on the Ruby Rogues and […] I repeated them, on the air, in public, and that was like a bomb went off.

There are actually six rules, and the sixth rule is that you can break any of the first five, as long as you can get your pair to agree. Why is it that we’re such cargo culters about it? It’s like, that’s the rule that people forget.

— Sandi Metz

When To Break The Rule

Here are three questions you can ask when you’re considering breaking the rule:

Would a method name make the code more clear? A good method name can make code more readable, but a bad method can make the code worse. For example, x % 5 == 0 could be written more clearly as x.divisible_by?(5).
Would multiple smaller methods be easier to understand? The larger a method becomes, the more confusing it gets. Decomposing a large method into simpler parts often makes the whole thing easier to understand – but not always.
Would the smaller methods be simple? Each smaller method should be obviously simpler than the large method – fewer arguments, fewer local variables, fewer loops and branches. If all the small methods are complicated too, then you’re not going to see any benefit.

If you can answer “yes” to all three questions, then go ahead with the refactoring. If all the answers are “no”, keep the long method the way that it already is. If the answers are mixed, then go with your intuition.

An Example

This article was inspired by a question on reddit. Someone asked how to get the following method to be under five lines:

def bin_search(array, target)
  min = 0
  max = array.length - 1
  while min <= max
    guess = (min + max) / 2
    case array[guess] <=> target
    when -1 then min = guess + 1
    when 0  then return guess
    when 1  then max = guess - 1
    end
  end

  -1
end

Another person suggested this refactoring:

def bin_search(array, target)
  scope, guess = initial_guess(array)
  until search_complete(array, guess, target) do
    scope, guess = improve_guess(scope, guess)
  end
  search_result(array, guess, target)
end

Let me start by saying that this refactoring is a good demonstration of how to break down a large method into smaller ones. It shows how to pick chunks of code, and extract them into new methods, in a sensible way.

However, I’m going to argue here that the original method should not be broken down, because it makes the code worse. Referring to the questions above:

Do the four new method names initial_guess, search_complete, search_result, and improve_guess make the code more clear? I say no. This is an implementation of the binary search algorithm, and the algorithm does not include any of those names. I can’t even see the algorithm anymore, because these new methods are hiding the implementation. The new names make the implementation less clear than the original, in my opinion.
Are five smaller methods easier to understand than one big method? Again, I say no. Spreading the algorithm across five methods has made it more difficult to understand, as a whole. And to complicate things further, they introduce a new “scope” concept, which didn’t exist in the original implementation, and doesn’t exist in the binary search algorithm either.
Are the four new methods simple? I don’t think so. Three of the four new methods take more arguments than the original method. All the new methods are tightly coupled together, with shared arguments and return values. They are not independant or reusable.

Personally, I would actually make some minor changes to the original method, but those changes aren’t relevant to the topic of this article.

In my opinion, this refactoring is making the code worse. Turning an 11 line method into five separate methods is just creating more code with no benefit. The original method is good enough.

Got questions? Comments? Milk?

Shoot an email to [email protected] or hit me up on Twitter (@tom_dalling).

The Good

The Bad

It’s Just A Guideline

When To Break The Rule

An Example

Join The Pigeonhole