Object#tap And How To Use It


This is a work in progress from The Pipeline. It will likely change before it is published to the main listing of articles. It is currently at the "detailed" stage.

The #tap method was introduced in Ruby 1.9, back in 2007. To this day, it raises questions like:

  • What does it do?
  • How is that even remotely useful, and when would I ever use it?
  • Why is it named so terribly?

After reading this article, you will be able to answer these questions.

What Does It Do?

The #tap method does two things:

  1. It calls the given block, passing self as the only argument
  2. It always returns self

return_value =
  42.tap do |argument|
    puts "argument: #{argument}"
  end

puts "return value: #{return_value}"

# Outputs:
#
#   argument: 42
#   return value: 42

Conceptually, the implementation is a method with a two-line body:

module Kernel
  def tap
    yield(self)
    self
  end
end
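One consequence of rule 2 is worth spelling out: whatever the block returns is thrown away. Here is a minimal sketch of that behaviour. The variable names are made up for illustration.

```ruby
# The block's return value is discarded; tap hands back the receiver.
# Array#sort returns a NEW sorted array, which tap ignores.
unsorted = [3, 1, 2].tap { |list| list.sort }
puts unsorted.inspect

# The block receives the very same object, so in-place changes stick.
# Array#sort! mutates the receiver.
sorted = [3, 1, 2].tap { |list| list.sort! }
puts sorted.inspect

# tap returns the receiver itself, not a copy of it.
greeting = "hello"
returned = greeting.tap { |s| s.length }
puts returned.equal?(greeting)
```

So tap is only useful for side effects: printing, logging, or mutating the object that was passed in.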

How Is That Even Remotely Useful?

Let’s start by reading the official documentation.

.tap

(from ruby core)
Implementation from Object
------------------------------------------------------------------------
  obj.tap {|x| block }    -> obj
------------------------------------------------------------------------

Yields self to the block, and then returns self. The primary purpose of
this method is to "tap into" a method chain, in order to perform
operations on intermediate results within the chain.

  (1..10)                  .tap {|x| puts "original: #{x}" }
    .to_a                  .tap {|x| puts "array:    #{x}" }
    .select {|x| x.even? } .tap {|x| puts "evens:    #{x}" }
    .map {|x| x*x }        .tap {|x| puts "squares:  #{x}" }

So the original intention of this method was to perform “operations” (a euphemism for side effects) on “intermediate results” (return values from methods in the middle of the chain).
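To see that tapping really leaves the chain alone, here is a small sketch that runs a chain like the documented one twice, with and without taps, and compares the results. The `seen` hash is a hypothetical stand-in for whatever side effect you actually want. It is not part of the documentation example.

```ruby
# Hypothetical place to record intermediate results instead of printing them.
seen = {}

with_taps =
  (1..10)
    .to_a
    .tap { |x| seen[:array] = x }
    .select { |x| x.even? }
    .tap { |x| seen[:evens] = x }
    .map { |x| x * x }
    .tap { |x| seen[:squares] = x }

without_taps = (1..10).to_a.select { |x| x.even? }.map { |x| x * x }

puts "same result: #{with_taps == without_taps}"
puts "evens seen:  #{seen[:evens].inspect}"
```

The taps observe each intermediate value, but the final return value of the chain is the same either way.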

Use Case: Debugging Method Chains

As a concrete example of the intended purpose, let’s say you’re debugging a big method chain, and your first thought is “wtf is this thing even doing?”

def most_frequent_words(text)
  text
    .split(/(\s|[\[\]()])+/)
    .map(&:downcase)
    .select { _1.match?(/[a-z]/) }
    .reject { _1.match?(/[a-z0-9]{3}\.md/) }
    .map { _1.tr('’“”', "'\"\"") }
    .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
    .reject { COMMON_WORDS.include?(_1) }
    .select { _1.length >= 2 }
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

If you are a puts debuggerer, you might try to understand it by printing out some of the return values in the middle of this chain.

Without tap, it would look something like this:

def most_frequent_words(text)
  split_parts = text.split(/(\s|[\[\]()])+/)

  puts "split_parts: #{split_parts.inspect}"

  before_tally =
    split_parts
      .map(&:downcase)
      .select { _1.match?(/[a-z]/) }
      .reject { _1.match?(/[a-z0-9]{3}\.md/) }
      .map { _1.tr('’“”', "'\"\"") }
      .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
      .reject { COMMON_WORDS.include?(_1) }
      .select { _1.length >= 2 }

  puts "before_tally: #{before_tally.inspect}"

  before_tally
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

There are two new variables with meaningless names, you have to reformat a bunch of stuff, and then you have to remove it all again once you’re done debugging. That’s too much effort.

With tap, the same thing can be achieved by adding just two lines of code:

def most_frequent_words(text)
  text
    .split(/(\s|[\[\]()])+/)
    .tap { |x| puts "parts: #{x.inspect}" } # <-----------------
    .map(&:downcase)
    .select { _1.match?(/[a-z]/) }
    .reject { _1.match?(/[a-z0-9]{3}\.md/) }
    .map { _1.tr('’“”', "'\"\"") }
    .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
    .reject { COMMON_WORDS.include?(_1) }
    .select { _1.length >= 2 }
    .tap { |x| puts "before tally: #{x.inspect}" } # <-----------------
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

The tap lines are easier to write, easier to move around, and easier to delete.
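The method above relies on strip_regex and COMMON_WORDS, neither of which is shown in this article. If you want something you can paste into irb, here is a cut-down sketch of the same idea. The COMMON_WORDS list, the simplified cleanup step, and the limit of three results are all assumptions made for the example, not the real definitions.

```ruby
# Assumed stand-in for the article's constant; the real list is not shown.
COMMON_WORDS = %w[the a and of].freeze

# Simplified variant of most_frequent_words with two debugging taps.
def most_frequent_words(text)
  text
    .split(/\s+/)
    .tap { |x| puts "parts: #{x.inspect}" }
    .map(&:downcase)
    .map { |word| word.gsub(/[^a-z']/, "") }
    .reject { |word| word.empty? || COMMON_WORDS.include?(word) }
    .tap { |x| puts "before tally: #{x.inspect}" }
    .tally
    .sort_by(&:last)
    .last(3)
    .reverse
    .to_h
end

result = most_frequent_words("The cat and the hat and the cat")
puts "result: #{result.inspect}"
```

Deleting the two tap lines changes nothing about the return value. It only silences the debug output.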

Use Case: Building And Returning An Object

Bonus Use Case: Placating Robotic Police Officers

Why Is It Named So Terribly?

Think of a phone call. The audio data is transmitted between the phones through various wires, exchanges, and radio waves. Any point along that path can be wiretapped to divert the audio to another listening device, without affecting the call. Sound familiar?

The word “wiretap” originates from a time when eavesdropping was done by placing an electrical tap on a literal piece of wire. What’s an electrical tap? It’s when you have an electrical circuit and you add new wiring to divert electricity. Sound familiar?

Electrical taps come from plumbing. Say you have a water pipe running through the kitchen wall to the bathroom — and while you want the pipe to continue carrying water to the bathroom, it might be convenient to divert some of that water to the kitchen too. You could hire a plumber to tap into the pipe and install a tap.

So if you have a chain of method calls and you want to divert the intermediate return values somewhere else, without affecting the chain, you might use a method called tap. See, the name isn’t that bad, after all.

I don’t think there exists an English word that would be a really good fit for this functionality. ActiveSupport had an implementation of this a few years before Ruby did, and they called it returning. Other names that were considered include with, k, then, and_then, apply, and tee (after the CLI command, which also gets its name from plumbing). But are any of these major improvements over tap?

Got questions? Comments? Milk?

Shoot an email to [email protected] or hit me up on Twitter (@tom_dalling).
