3 July, 2016

Avoid Mutation – Functional Style In Ruby

??? words · ??? min read

This article is about incorporating functional programming concepts into Ruby code, in a pragmatic way – something I call “functional style.”

And when I say pragmatic, what I mean is that the code should still mostly look and feel like idiomatic Ruby. Ruby is not Haskell, and nor should it be. The idea is to work with the grain of the language, not against it. The code should be easy for any Rubyist to understand – hopefully even easier than what you’re accustomed to.

So let’s look at avoiding mutation: the benefits, how to do it, the drawbacks, and where it might not be appropriate.

Why You Should Avoid Mutation

Mutation is a source of bugs. Therefore avoiding mutation will reduce the number of bugs that you introduce. Whenever you change a variable, there is always the possibility that you have broken some other piece of code that relied upon it. Avoiding mutation makes certain types of bugs impossible to introduce.

Mutation requires you to spend extra mental energy when reading and writing code. When you write code that changes a variable, you have to analyse all the ways that change could affect other pieces of code. Whenever you read code containing mutations, you have to analyse all the different states that the variable might have, and when those states can change. You can bypass this analysis and reduce the mental effort required by simply avoiding mutation.

There is also potential for performance benefits, which we will get to later in this article.

In summary, avoiding mutation makes your code easier to read, easier to write, and less buggy. It gives you confidence, and reduces the need for frustrating debugging.

I propose that you should avoid mutation wherever possible. It should be the default way to write code, and each deviation should require a good reason.

Pretend All Values Are Immutable

Pretend that everything is immutable. I say “pretend” because practically everything in Ruby is mutable by default, so trying to enforce immutability everywhere is painful. It’s more pragmatic to accept that Ruby is a highly mutable language, and just use discipline.

Despite all the mutability, the Ruby standard library actually makes this fairly easy. Most destructive* methods have non-destructive alternatives. Here are a few examples:

* The term “destructive” used here has a specific meaning in FP. Mutation is referred to as a destructive update because it overwrites the previous value. A non-destructive update creates a new value, leaving the old value intact.

String#upcase! vs String#upcase
Hash#[]= vs Hash#merge
Array#concat vs Array#+
Array#shift vs Enumerable#drop(1)

The Enumerable mixin is your best friend here, because all of its methods are non-destructive by design. Make sure you know how to use every method in Enumerable, and pay specific attention to the FP Triforce: Enumerable#map, Enumerable#select, and Enumerable#reduce.

Example:

#
# FUNCTIONAL STYLE
#
def symbolize_keys(hash)
  hash
    .map { |key, value| [key.to_sym, value] }
    .to_h
end

#
# NON-FUNCTIONAL STYLE
#
def symbolize_keys(hash)
  result = {}
  hash.each do |key, value|
    # mutating the `result` hash
    result[key.to_sym] = value
  end
  result
end

Don’t Reassign Variables

After you’ve created a variable by assigning its initial value, leave it alone. If you said that x = 5, don’t come along later and say that x += 2. Decide on what value x should have, and stick to it.

If you need to create a new value based on an existing one, create a new variable for it. Instead of x += 2, you could write new_x = x + 2.

Example:

#
# FUNCTIONAL STYLE
#
def travelling_expenses_total(expenses)
  expenses
    .select{ |e| e.type == :travelling }
    .map(&:amount)
    .reduce(0, :+)
end

#
# NON-FUNCTIONAL STYLE
#
def travelling_expenses_total(expenses)
  total = 0
  expenses.each do |e|
    # reassigning `total`
    total += e.amount if e.type == :travelling
  end
  total
end

Design Classes To Be Immutable

Whenever you need to write a new class, try to make it immutable.

Immutable classes all follow a simple pattern: never reassign or mutate instance variables. This usually means that you assign all the instance variables within initialize, and then do not provide any methods that could change them.

Example:

#
# FUNCTIONAL STYLE
#
class MicroBlogPost
  attr_reader :title, :body

  def initialize(title, body)
    @title = title
    @body = body
  end

  def rename(new_title)
    MicroBlogPost.new(new_title, @body)
  end
end

# example of creation:
post = MicroBlogPost.new('Hi', 'This is my first post')

# example of update:
renamed_post = post.rename('First Post')


#
# NON-FUNCTIONAL STYLE
#
class MicroBlogPost
  # this defines methods for reassigning instance variables
  attr_accessor :title, :body
end

# example of creation:
post = MicroBlogPost.new
post.title = 'Hi'
post.body = 'This is my first post'

# example of update:
post.title = 'First Post'

The functional-style MicroBlogPost class above requires more boilerplate than the other one, but there are gems that help to get rid of that. For an overview of these gems, check out the previous article: A Review Of Immutability In Ruby.

Performance Problems

Probably the most commonly cited problem of immutability is performance. Non-destructive updates often require a lot of duplication. Duplication takes time to run, consumes extra memory, and makes more objects for the garbage collector to clean up. In theory, this means your app will have worse performance than its mutable counterpart.

In practise, however, performance is rarely an issue. Typically you’re only working with small data sets, like an array of 100 immutable objects. On this scale, the performance differences are practically imperceptible. If your app has very strict performance requirements, it’s probably not going to be written in Ruby in the first place.

Performance problems can become noticeable with large data sets. Repeatedly duplicating an array with millions of elements will be slow, and create memory pressure. In these situations, you have a few options:

Use lazy enumerators to avoid some of the duplication. Usually, if you chained three calls to map on an array, it would create three new arrays. With Enumerator::Lazy, only the final result array would be created.
Use a streaming API design. Here is a recipe for terrible performance: read a huge file into memory, non-destructively update each line, and write the results out to a new file. Instead, consider a streaming API that reads, updates, and writes each line, one at a time. This will alleviate the performance problems caused by memory pressure, and the update step inbetween reading and writing can still be written in a functional style.
Use persistent data structures. Persistent data structures are immutable collections, like arrays and hash maps, that are specifically designed to have good performance for non-destructive updates. They reduce duplication by sharing state, under the hood. In Ruby, you can get these from the hamster gem.
Just use mutable data. Performance can be a perfectly valid reason to use mutable data. This is fairly rare though, so fight the urge to optimise prematurely.

Performance Benefits

Perhaps counterintuitively, immutability can actually lead to better performance.

Mutability has its own source of duplication: defensive copying. Defensive copying has all the performance problems of non-destructive updates, except it’s harder to predict when it will happen. Defensive copying is not necessary for immutable objects.

Concurrent access to mutable data usually requires some sort of coordination, such as a mutex or a semaphore. This can cause performance problems relating to locking, which immutable data does not suffer from. This isn’t much of a consideration for MRI because it lacks proper concurrency, but it may be a consideration for JRuby.

Last but not least, I believe that writing code with immutability in mind results in simpler code. Simpler code usually means less code, and less code usually means faster code. This a personal opinion of mine, and I know that a lot of people will disagree with me, but I’m also not the only person to think this way. For a real-world example of this, have a look at the consistently good performance of ROM and the dry-rb gems.

Where Immutability Is Inappropriate

Avoiding mutability is a good default, but it’s not appropriate in all situations.

We’ve already covered performance. There are some situations where immutability would cause an unacceptable loss of performance.

Sometimes an implementation can be made simpler by using a little bit of mutable state. I’ve found that writing parsers is a good example of this. It’s entirely possible to write parsers that avoid mutability, but in my experience it’s quite a lot simpler to write parsers that consume their input as they run. Consuming input usually means reading from an IO stream or popping tokens off of an array, both of which are mutations.

Another situation I’ve found is writing DSLs in Ruby. DSLs are typically a set of statements, where each statement causes some kind of mutation. In the Rails routing DSL, for example, every time you use get, post or resources, new route objects are being created and added to the set of all routes. In situations like these, where a data structure is being built one step at a time, it might be simpler to implement this with mutation. After the data structure has been built, however, you can start treating it as immutable again. Rails routes basically work like this – you build up your routes at boot time, and they remain constant after that. Think of it like a complicated constructor function.

One decent way to compare implementations is by the amount of code. If the functionality is equivalent, the implementation with less code is usually better. If you can get a substantial code reduction by using mutable data, then that is a valid reason to do so. Evaluating implementations is way more complicated than just looking at the amount of code in each, but it is a good rule of thumb.

Conclusion: It’s All About Discipline

Ruby is an extremely flexible language, and that is a double-edged sword. You can use it to write a dream codebase or a maintainability nightmare. Ruby is a sharp tool, and it is up to you, the developer, to use it responsibly.

The resulting codebase that you get largely depends upon discipline: the rules that you choose to consistently apply to your code. I like to say that Ruby’s motto is “you can, but don’t.”

Avoiding mutability by default is, in my opinion, a rule worth applying.

Ruby: you can, but don't The Ruby Logo is copyright Yukihiro Matsumoto. Licensed under CC BY-SA 2.5

Got questions? Comments? Milk?

Shoot an email to [email protected] or hit me up on Twitter (@tom_dalling).

Why You Should Avoid Mutation

Pretend All Values Are Immutable

Don’t Reassign Variables

Design Classes To Be Immutable

Performance Problems

Performance Benefits

Where Immutability Is Inappropriate

Conclusion: It’s All About Discipline

Join The Pigeonhole