18 July, 2016

Isolate Side Effects – Functional Style in Ruby

In this article, we’ll look at what side effects are, why they are problematic despite being necessary, and how to isolate them to minimise their drawbacks.

This is part of a series about incorporating functional programming concepts into Ruby code, in a pragmatic way – something I call “functional style.”

What Are Side Effects?

A side effect is any observable change caused by calling a function, other than returning a value. For a function to be free of side effects, it must do nothing other than provide a return value.

These are examples of side effects:

Writing to a file
Printing output (e.g. puts('hello'))
Raising an exception
Mutating a non-local variable
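
To make the distinction concrete, here is a minimal sketch that contrasts a side-effect-free method with one that has two side effects. The method names and the `$greeting_count` global are made up for illustration.

```ruby
# Pure: depends only on its argument and does nothing but return a value.
def greeting(name)
  "Hello, #{name}!"
end

# Hypothetical global, used only to demonstrate mutation of non-local state.
$greeting_count = 0

# Impure: mutates a non-local variable and prints output.
def greet!(name)
  $greeting_count += 1   # side effect: mutates a global
  puts greeting(name)    # side effect: writes to standard output
end

greeting('Sam')  # returns "Hello, Sam!" and changes nothing
greet!('Sam')    # prints "Hello, Sam!" and increments $greeting_count
```

Calling `greeting` any number of times leaves the program in the same state. Every call to `greet!` leaves a trace behind.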

Side Effects Cause Bugs

Have you ever been hunting down a bug, and narrowed it down to a single statement – where you can comment out just one line of code, and the bug magically disappears? That is the sting of an unexpected side effect. We’ve all been there.

If you’re not careful, side effects can proliferate as the codebase grows, making them difficult to anticipate. When we call functions without fully understanding all of their side effects, that’s when the bugs crawl out.

As an example, let’s look at some typical Rails controller code:

class UsersController < ApplicationController
  def update
    @user = User.find(params[:id])
    if @user.update(user_params)
      redirect_to :back, notice: 'Updated successfully'
    else
      render 'edit'
    end
  end

  private

  def user_params
    params.require(:user).permit(
      :username,
      :password,
      :password_confirmation,
      :profile_image,
    )
  end
end

What are the side effects of the user_params method? There are none, as far as I can tell. It takes the params hash, transforms it, and returns the result. It does nothing but provide a return value, so it’s side-effect-free.

What are the side effects of the @user.update method call? Well, it updates the database at some point, but it could be doing all kinds of other things.

The @user object itself is getting its attributes mutated, even if validation fails. What if the view requires the previous valid values? Do I need to reload the current_user object to get the new values?
Speaking of validation, that’s going to mutate the errors object on the model. That has to happen before the view gets rendered.
Surely that password isn’t being saved as plain text. Something must be going on under the hood to hash it.
It’s probably logging something. Hopefully the password isn’t being written out in plaintext. Better check that.
What if it’s sending an email in the after_save callback? If so, is it supposed to send an email in this particular situation, or do I have to suppress it?
What about that profile_image attribute? That looks like a file. Could it possibly overwrite an existing file? Is it sitting in a temporary directory that needs to be cleaned out periodically? If it’s stored on the disk, what happens if there isn’t enough disk space available? If it’s being uploaded to S3 instead, should I be expecting some kind of exception to be raised if the upload fails?

These are all implicit side effects. It’s difficult to predict what will happen by looking at the calling code, which is fine if you understand the entire codebase, but dangerous if you don’t.

The point I’m trying to make is that side effects can be unpleasant surprises. They’re just thorny by nature.

Side Effects Are Hard To Test

Side effects are often slow IO operations, like writing to a disk or a socket. If you want to test these things, don’t expect your test suite to be snappy. Take a PostgreSQL database in a typical Rails app, for example. When database access is sprinkled through basically every single part of an app, almost every test is going to hit the database at some point, so it should be no surprise when the test suite is slow. On the other hand, if the majority of the tests are isolated, but a few of them hit the database, you would expect the test suite to be much faster.

To speed up the test suite, you might start using stubs and mocks to avoid actually hitting the database. But if the side effects are still sprinkled everywhere through the codebase, you’re going to be doing a lot of stubbing and mocking – so much that it can turn your tests into spaghetti.

* In FP terminology, a “pure function” is a function that generates a return value based on its arguments, and does nothing else. Pure functions cannot have side effects, and must always return the same value given the same arguments. This means that pure functions depend upon nothing but their arguments and constants.

Testing pure* functions is easy. You put some arguments in, you get a return value out, then you run some assertions on the return value. That’s it. No mocking or stubbing, and it’s usually super fast. Plus it’s easy to write my favourite kind of tests: truth tables. It’s the ideal kind of unit test.
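
As a rough illustration of the truth table idea, here is a sketch for a made-up pure function. `can_edit?` and its rules are hypothetical, and a plain `raise` stands in for an RSpec expectation so that the snippet runs on its own.

```ruby
# Hypothetical pure function, used only to illustrate the testing style.
# A post can be edited by admins, or by its owner while it is unlocked.
def can_edit?(admin:, owner:, locked:)
  admin || (owner && !locked)
end

# Each row lists the inputs, followed by the expected return value.
truth_table = [
  # admin, owner, locked, expected
  [true,  true,  true,  true],
  [true,  true,  false, true],
  [true,  false, true,  true],
  [true,  false, false, true],
  [false, true,  true,  false],
  [false, true,  false, true],
  [false, false, true,  false],
  [false, false, false, false],
]

truth_table.each do |admin, owner, locked, expected|
  actual = can_edit?(admin: admin, owner: owner, locked: locked)
  next if actual == expected

  raise "can_edit?(#{admin}, #{owner}, #{locked}) returned #{actual}, expected #{expected}"
end
```

Every combination of inputs is covered in a few lines, and there is nothing to set up or tear down.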

Side Effects Are Necessary

If an app has no side effects, it doesn’t actually do anything. A typical web app has plenty of side effects, like updating/inserting into the database, logging, pushing background jobs onto queues, and interacting with external web APIs. These things just can’t be removed. Even sending an HTTP response back to the browser involves writing to a socket, which is a kind of side effect.

So how do you handle code that is necessary, but problematic? Isolate it. Extract the side effects into separate units, so that the remaining code can be written in a functional style.

If this sounds familiar, you may be remembering Gary Bernhardt’s excellent talk: Boundaries. In the talk he explains what he calls “functional core, imperative shell.” The functional core is the domain logic, written in a functional style. The imperative shell uses the results of the functional core to cause the necessary side effects. I highly recommend watching this talk if you haven’t seen it before, and even if you have, it’s worth rewatching.

Avoid Mutation

The previous article was about avoiding mutation, but I want to touch on how mutation causes side effects.

Think about how the following method could cause bugs.

def case_insensitive_equals(str1, str2)
  # I've heard that `upcase!` is faster than `upcase`,
  # so this method should be ultra fast.
  return str1.upcase! == str2.upcase!
end

If you need a hint, consider this usage code:

BOSS_NAME = 'Joanna Smith'

if case_insensitive_equals(current_user.name, BOSS_NAME)
  current_user.is_the_boss = true
  current_user.save!
end

Firstly, the BOSS_NAME constant gets changed, and that’s the opposite of what constants are supposed to do. Secondly, it converts the boss’ name to uppercase and saves it to the database. Now the web app is going to be shouting at the boss.

This is all caused by the nasty mutation side effect of the case_insensitive_equals method. Methods that are called for their return value usually don’t mutate their arguments. Because the side effect is unexpected, it can easily cause bugs, and those bugs are particularly hard to find.
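
For comparison, here is one way the method could be written without mutating its arguments. This is a sketch rather than the only fix. It also sidesteps a second problem with the original: `upcase!` returns nil when the string is already uppercase, so two different all-caps strings would compare as equal.

```ruby
# Compares two strings, ignoring case, without modifying either argument.
# `upcase` returns a new string, whereas `upcase!` changes the receiver.
def case_insensitive_equals(str1, str2)
  str1.upcase == str2.upcase
end

# Freezing the constant turns any accidental mutation into an immediate error.
BOSS_NAME = 'Joanna Smith'.freeze

name = 'joanna smith'
case_insensitive_equals(name, BOSS_NAME)  # returns true
name                                      # still 'joanna smith'
BOSS_NAME                                 # still 'Joanna Smith'
```

With the constant frozen, the original `upcase!` version would raise an error on the first call instead of silently corrupting the name.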

The example above involved method arguments, but mutating any kind of shared state is a side effect. This includes globals, instance variables, and even updating a database.

Isolated Mutation Is OK

Mutation isn’t that bad, per se. It’s the side effects caused by mutation that get you. But what if you could use mutation without any side effects?

Consider this example, taken from the previous article:

def symbolize_keys(hash)
  result = {}
  hash.each do |key, value|
    result[key.to_sym] = value
  end
  result
end

This method repeatedly mutates the hash inside the result variable. However, the mutations are not observable from the outside, because they all happen within local scope. The method does nothing other than provide a return value, so it is side-effect-free.

This kind of side-effect-free mutation within local scope is the least harmful kind of mutation. Any bugs that get introduced will be limited to the method itself, and are unlikely to break other parts of the codebase. I still recommend avoiding mutation by default, but you can relax that rule when the method has no side effects.

If you really want to mutate an argument, or an instance variable, or any piece of shared state, be sure to make a duplicate. The duplicate is isolated to local scope, so mutating it won’t cause side effects.
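
A minimal sketch of the duplicate-then-mutate approach follows. The method name and the tag data are hypothetical.

```ruby
# Returns a cleaned-up copy of the tags without touching the caller's array.
def normalized_tags(tags)
  copy = tags.dup        # shallow copy: a new array holding the same elements
  copy.map!(&:downcase)  # replaces the copy's elements with new strings
  copy.uniq!
  copy.sort!
  copy
end

original = ['Ruby', 'rails', 'ruby']
normalized_tags(original)  # returns ["rails", "ruby"]
original                   # still ["Ruby", "rails", "ruby"]
```

Be aware that `dup` makes a shallow copy. The new array is independent, but its elements are still shared with the original, so mutating an element in place (for example with `downcase!`) would leak out to the caller.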

Use Imperative Style For Side Effects

When it comes to the parts of your app that are specifically designed to cause side effects, you basically have to give up on writing functional-style code. It’s just the wrong tool for the job. Instead, switch to imperative style.

Imperative style code is just a sequence of statements, where each statement has some kind of side effect. Here is an abbreviated example from a real web app:

module Commands
  class StopRecording

    def call
      if ffmpeg_pids.any?
        kill_ffmpeg_processes!
        SessionRepo.update(session_id, live: false)
        AngleRepo.update({session_id: session_id}, ffmpeg_pid: nil)
        wait_for_processes_to_die!
        Result.success
      else
        Result.failure("Recording was already stopped")
      end
    end

    # ... code omitted ...

  end
end

The success branch is just four statements and a return value. You can tell that they have side effects because the return values are ignored. The bangs (!) on the method names are also a hint. These are explicit side effects. Each line has one fairly obvious, predictable effect.

Notice the return value. I could get rid of the Result class and just raise an exception to indicate failure, but I purposely chose to avoid exceptions. Exceptions are a kind of side effect that is forced upon the caller. I’m trying to isolate the side effects, not propagate them into the calling code.
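
The Result class itself is not shown in this article. The sketch below is an assumption about what a minimal version could look like, not the class from the real app. Only `Result.success` and `Result.failure` appear in the code above, and everything else (`success?`, `failure?`, `error`) is invented for illustration.

```ruby
# A hypothetical, minimal Result value object.
class Result
  attr_reader :error

  def self.success
    new(true, nil)
  end

  def self.failure(error)
    new(false, error)
  end

  def initialize(success, error)
    @success = success
    @error = error
    freeze  # results are plain values, so they never change after creation
  end

  def success?
    @success
  end

  def failure?
    !@success
  end
end

# The caller decides what to do with a failure. Nothing is forced upon it.
result = Result.failure('Recording was already stopped')
if result.success?
  puts 'Stopped recording'
else
  puts "Could not stop recording: #{result.error}"
end
```

Because the outcome is an ordinary return value, the calling code can inspect it, ignore it, or pass it along, without needing a rescue block.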

This kind of code will require tests with mocking and stubbing. But hopefully it will require fewer tests, because some of the logic has been extracted out of it.

Example Refactoring To Isolation

Let’s refactor some hypothetical code for sending monthly bills.

class MonthlyBillingJob
  SENIOR_CITIZEN_DISCOUNT = 5

  def perform
    Account.where(free: false).each do |account|
      bill = Bill.new(
        account: account,
        amount: account.plan.amount,
      )

      if account.type == :senior_citizen
        bill.amount -= SENIOR_CITIZEN_DISCOUNT
      end

      bill.save!

      BillMailer.new_bill(bill).deliver_now
    end
  end
end

Before we change anything, think about writing tests for the class above. How hard will it be to test the senior citizen discount? How hard will it be to test that bills are inserted into the database, and emails are sent? We’ll revisit this after refactoring.

There are two side effects to isolate: creating the bill in the database, and sending the email. The first thing I want to think about is making those side effects explicit, and minimal. Here is the imperative-style implementation that I would like to see:

billable_accounts.each do |account|
  bill = create_bill!(account)
  send_email!(bill)
end

The next thing I want to think about is extracting the domain logic. The senior citizen discount is definitely domain logic, and so is the query for billable accounts. I’m going to pull both of those out into a functional-style module.

While querying the database is not functionally pure, I’m going to include it in the functional-style code. Queries are generally free from observable side effects, with the exception of a few edge cases like the N+1 problem.

module Billing
  SENIOR_CITIZEN_DISCOUNT = 5

  def self.billable_accounts
    Account.where(free: false)
  end

  def self.monthly_bill(account)
    Bill.new(
      account: account,
      amount: account.plan.amount - discounts(account),
    )
  end

  def self.discounts(account)
    if account.type == :senior_citizen
      SENIOR_CITIZEN_DISCOUNT
    else
      0
    end
  end
end

Notice that the monthly_bill method creates an ActiveRecord object. Just creating the object is fine, because it has no side effects. Actually inserting the object into the database is a side effect, and I wouldn’t want that in this functional-style code.

After incorporating the above functional-style code, the job class looks like this:

class MonthlyBillingJob
  def perform
    Billing.billable_accounts.each do |account|
      bill = Billing.monthly_bill(account)
      bill.save!
      BillMailer.new_bill(bill).deliver_now
    end
  end
end

It’s not the exact implementation I was aiming for, but it’s close. I could make it exact by introducing a few extra methods, but I think it’s already clear enough. While I would slightly prefer send_email!(bill) to BillMailer.new_bill(bill).deliver_now, there really isn’t much of a difference.

Now let’s analyse the results a bit.

How hard is it to test that bills are created, and emails are sent? Since it has been decoupled from the domain logic, the MonthlyBillingJob class only needs a single test. The test needs mocks, but it’s not too bad:

job = MonthlyBillingJob.new
bill = double
mailer = double

expect(Billing).to receive(:billable_accounts).and_return([:sams_acc])
expect(Billing).to receive(:monthly_bill).with(:sams_acc).and_return(bill)
expect(bill).to receive(:save!)
expect(BillMailer).to receive(:new_bill).with(bill).and_return(mailer)
expect(mailer).to receive(:deliver_now)

job.perform

Notice how the test never actually needs to use a real Account or Bill object. This is a good thing, because it makes the MonthlyBillingJob class more resistant to change. Removing all database access should also ensure that the test is fast.

The job class is dumb – it doesn’t know or care about most of the domain logic. We can change how the accounts are queried, or how the bills are created, and the job class will still work without modification.

Moving on to the functional-style code, how hard is it to test the seniors discount? It’s dead easy:

account = Account.new(type: :senior_citizen)
discount = Billing.discounts(account)
expect(discount).to eq(5)

And it will continue to be dead easy every time you need to change the domain logic for discounts.

All the methods on the Billing module could have been kept on the MonthlyBillingJob class. They could still be written in a functional style, and be stubbed out in the same way in the test. However, even with the side effects isolated into separate methods, I don’t like the two styles of code being right next to each other in the same class. I’d rather have a separate file that I could point to and say “there’s the billing logic.” I think that’s better than having it sprinkled all over the codebase. I prefer clear boundaries between the functional and non-functional code.

Applying It To Rails: Prefer Skinny Models

Globals suck, right? That bit of wisdom has been drilled into programmers for a long time now.

They suck because they are a prime example of shared mutable state. They allow bits of your app to communicate with each other via mutation side effects, causing otherwise separate units of code to be coupled together.

Guess what else is shared mutable state: the database in a Rails app. The model classes in a Rails app are basically global variables, and they suffer from the same problems. They can be, and often are, used from everywhere in the codebase. That creates a huge surface area for introducing bugs.

With that in mind, do you really want to take the “fat model, skinny controller” approach and implement your app’s functionality in the model layer? I don’t. I would rather use ActiveRecord for interacting with the database, and implement the domain logic separately, preferably with functional-style code.

Keep in mind that MVC isn’t a set of three big buckets that all code has to fall into. Just because functionality is being removed from the models doesn’t mean it goes into the controllers. The side-effecty code can be moved into a new layer, usually named something like service objects, operations or commands. The functional-style code can live in the lib/ directory, or maybe somewhere new like app/domain/.
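
Here is a rough sketch of how that split might look, written in plain Ruby so that it runs without Rails. All of the names (`Usernames`, `Commands::ChangeUsername`, `user_repo`, `mailer`) are hypothetical, and the repository and mailer are stand-ins that the caller passes in.

```ruby
# Functional-style domain logic: no database access and no IO.
# This is the kind of module that could live in app/domain/.
module Usernames
  def self.normalize(raw)
    raw.strip.downcase
  end

  def self.valid?(username)
    username.match?(/\A[a-z0-9_]{3,20}\z/)
  end
end

# Imperative-style command: a short sequence of explicit side effects.
module Commands
  class ChangeUsername
    def initialize(user_repo:, mailer:)
      @user_repo = user_repo
      @mailer = mailer
    end

    # Returns true if the username was changed, false if it was invalid.
    def call(user_id, raw_username)
      username = Usernames.normalize(raw_username)
      return false unless Usernames.valid?(username)

      @user_repo.update(user_id, username: username)  # side effect: database
      @mailer.username_changed(user_id, username)     # side effect: email
      true
    end
  end
end
```

The rules about what counts as a valid username can be tested with plain strings. The command needs only a single test, with doubles for the repository and the mailer.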

Enlightenment comes when you use objects in a server-side web app to model actions, not things.

– Brad Urani

And if you value your sanity, stay away from ActiveRecord::Callbacks. They are specifically designed to cause implicit side effects, like automatically sending an email after you save a model object. That’s the diametric opposite of what I’m trying to argue for in this article.

Summary

The aim is to minimise the naughty code and maximise the nice code, like some kind of weird software developer Santa Claus. To achieve this, code is split into two parts: the functional part and the imperative part.

The functional part is simpler, easier to test, and just easier to work with in general. Ideally, this part contains the domain logic of your app – the real-world rules and decisions that your app handles for its users.

The imperative part is still necessary and important, but it’s more prone to bugs, and harder to test. This part ideally contains minimal domain logic. It should blindly carry out orders like “send this email” or “update this row in the database.” The less it does, the better.

The functional-style code is the brain, and the imperative code is the brawn. Keeping the two separate is worth the effort, in my opinion.

Got questions? Comments? Milk?

Hit me up on Twitter (@tom_dalling).