26 January, 2021

Object#tap And How To Use It

The #tap method was introduced in Ruby 1.9, back in 2007. To this day, it raises questions like:

What does it do?
How is that even remotely useful, and when would I ever use it?
Why is it named so terribly?

After reading this article, you will be able to answer these questions.

What Does It Do?

The #tap method does two things:

It calls the given block, passing self as the only argument
It always returns self

return_value =
  42.tap do |argument|
    puts "argument: #{argument}"
  end

puts "return value: #{return_value}"

# Outputs:
#
#   argument: 42
#   return value: 42

The implementation is only two lines.

module Kernel
  def tap
    yield(self)
    self
  end
end
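
One consequence worth spelling out: whatever the block returns is thrown away. Here is a quick sketch of that, where the variable names are only for illustration:

```ruby
# The block's return value is discarded; #tap always hands back the receiver.
ignored = "hello".tap { |string| string.upcase }

# Mutating the receiver inside the block is a different story, because
# the receiver itself is what gets returned.
mutated = "hello".dup.tap { |string| string.upcase! }

puts ignored # hello
puts mutated # HELLO
```

So the only way a #tap block can matter is through its side effects.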

How Is That Even Remotely Useful?

Let’s start by reading the official documentation.

.tap

(from ruby core)
Implementation from Object
------------------------------------------------------------------------
  obj.tap {|x| block }    -> obj
------------------------------------------------------------------------

Yields self to the block, and then returns self. The primary purpose of
this method is to "tap into" a method chain, in order to perform
operations on intermediate results within the chain.

  (1..10)                  .tap {|x| puts "original: #{x}" }
    .to_a                  .tap {|x| puts "array:    #{x}" }
    .select {|x| x.even? } .tap {|x| puts "evens:    #{x}" }
    .map {|x| x*x }        .tap {|x| puts "squares:  #{x}" }

So the original intention of this method was to perform “operations” (a euphemism for side effects) on “intermediate results” (return values from methods in the middle of the chain).

Use Case: Debugging Method Chains

As a concrete example of the intended purpose, let’s say we’re debugging a big method chain, and our first thought is WTF is this thing even doing?

def most_frequent_words(text)
  text
    .split(/(\s|[\[\]()])+/)
    .map(&:downcase)
    .select { _1.match?(/[a-z]/) }
    .reject { _1.match?(/[a-z0-9]{3}\.md/) }
    .map { _1.tr('’“”', "'\"\"") }
    .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
    .reject { COMMON_WORDS.include?(_1) }
    .select { _1.length >= 2 }
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

Puts debuggerers might try to understand it by printing out some of the return values in the middle of this chain. If we were unaware that #tap existed, we might do that like so:

def most_frequent_words(text)
  split_parts = text.split(/(\s|[\[\]()])+/)

  puts "split_parts: #{split_parts.inspect}"

  before_tally =
    split_parts
      .map(&:downcase)
      .select { _1.match?(/[a-z]/) }
      .reject { _1.match?(/[a-z0-9]{3}\.md/) }
      .map { _1.tr('’“”', "'\"\"") }
      .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
      .reject { COMMON_WORDS.include?(_1) }
      .select { _1.length >= 2 }

  puts "before_tally: #{before_tally.inspect}"

  before_tally
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

There are two new variables with meaningless names, we have to reformat a bunch of stuff, and then we have to remove it all again once we’re done debugging. That’s too much effort.

Using #tap, the same thing can be achieved by adding just two lines of code:

def most_frequent_words(text)
  text
    .split(/(\s|[\[\]()])+/)
    .tap { puts "parts: #{_1.inspect}" } # <-----------------
    .map(&:downcase)
    .select { _1.match?(/[a-z]/) }
    .reject { _1.match?(/[a-z0-9]{3}\.md/) }
    .map { _1.tr('’“”', "'\"\"") }
    .map { strip_regex(_1, /[.,?:"_*~()\[\]]+/) }
    .reject { COMMON_WORDS.include?(_1) }
    .select { _1.length >= 2 }
    .tap { puts "before tally: #{_1.inspect}" } # <-----------------
    .tally
    .sort_by(&:last)
    .last(30)
    .reverse
    .to_h
end

The .tap { ... } lines are easier to write, easier to move around, and easier to delete.
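
The method above leans on strip_regex and COMMON_WORDS, which aren’t shown in this article. If you want something to paste into irb, here is a cut-down sketch of the same idea. STOP_WORDS and top_words are made-up names for this example, and the pipeline is much shorter than the real one.

```ruby
# Hypothetical stand-in for the COMMON_WORDS constant used above.
STOP_WORDS = %w[the a of and].freeze

# A simplified word counter with two temporary #tap lines for debugging.
def top_words(text)
  text
    .split(/\s+/)
    .tap { puts "parts: #{_1.inspect}" }        # debugging line
    .map(&:downcase)
    .reject { STOP_WORDS.include?(_1) }
    .tap { puts "before tally: #{_1.inspect}" } # debugging line
    .tally
    .sort_by(&:last)
    .last(2)
    .reverse
    .to_h
end

result = top_words("The cat and the hat and the cat")
# result == { "cat" => 2, "hat" => 1 }
```

Deleting the two #tap lines leaves the return value exactly the same, which is the whole point.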

Use Case: Building And Returning An Object

Before #tap existed, ActiveSupport already had a similar method which they called #returning. This may seem like a strange name, but that is because it was designed for a use case that is unrelated to method chains: modifying and then returning an object.

The #returning method was removed from ActiveSupport some time ago.

The #returning method is designed for the common situation where we fetch or create an object, which we want to eventually return, but that needs to be modified or configured first.

As a real-world example, let’s look at a method that creates an OpenSSL::Cipher::AES object for encrypting some data. A simple procedural approach might look like this:

def cipher
  aes = OpenSSL::Cipher::AES.new(128, :CBC)
  aes.encrypt # put the object into encryption mode
  aes.key = @key
  aes.iv = @iv
  aes
end

First the object is created, then it is mutated, and then returned from the method. It’s not immediately obvious that this method returns the aes object, until we read the final line.

Using ActiveSupport’s #returning, it would look like this:

def cipher
  returning OpenSSL::Cipher::AES.new(128, :CBC) do |aes|
    aes.encrypt # put the object into encryption mode
    aes.key = @key
    aes.iv = @iv
  end
end

This makes the intent of the method clearer. The first line indicates that the return value will be this new OpenSSL::Cipher::AES object, and the lines inside the block are to set up or configure the object.

The same thing can be written using #tap, although I don’t think it reads quite as nicely.

def cipher
  OpenSSL::Cipher::AES.new(128, :CBC).tap do |aes|
    aes.encrypt # put the object into encryption mode
    aes.key = @key
    aes.iv = @iv
  end
end

So, this second use case for #tap is when we are writing a build and return kind of method, and we want to communicate that a little more clearly. Seeing Something.new.tap do |something| as the first line acts as a shortcut to understanding the purpose of the method.
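
Since #returning no longer ships with ActiveSupport, here is a minimal sketch of how such a helper could be written by hand, next to the #tap equivalent. This is my own re-creation for illustration, not ActiveSupport’s source. Settings and both default_settings methods are hypothetical names.

```ruby
# Hypothetical re-creation of a #returning-style helper.
# Yields the value to the block, then returns the value itself.
def returning(value)
  yield(value)
  value
end

# Hypothetical settings object, used only for this example.
Settings = Struct.new(:host, :port)

# Build-and-return with the hand-rolled helper.
def default_settings
  returning Settings.new do |settings|
    settings.host = "localhost"
    settings.port = 8080
  end
end

# The same method written with #tap.
def default_settings_with_tap
  Settings.new.tap do |settings|
    settings.host = "localhost"
    settings.port = 8080
  end
end
```

Both methods return the configured Settings object, even though the last expression inside each block is an assignment.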

Bonus Use Case: Placating Robotic Police Officers

Early in this article I mentioned the purpose of #tap is to cause side effects. The method chaining use case shows the side effect of printing output using puts. The other use case shows mutating an object, which is also a kind of side effect. Let’s look at a combination of the two: mutating an object within a method chain.

If we were using Ruby 2.7 or later, Enumerable#filter_map would be a better choice for the example below, but let’s say we’re using 2.6 and it’s not available.

Let’s say we want to get a list of blog post publish dates, formatted as ISO8601 strings, excluding unpublished posts where published_at is nil. We might write something like this:

blog_posts
  .map(&:published_at)
  .compact
  .map(&:iso8601)

This works, but our favourite robotic police officer might complain about it.

Offenses:

code.rb:6:3: C: Performance/ChainArrayAllocation: Use unchained map and compact! (followed by return array if required) instead of chaining map...compact.
  .compact
  ^^^^^^^^
code.rb:7:3: C: Performance/ChainArrayAllocation: Use unchained compact and map! (followed by return array if required) instead of chaining compact...map.
  .map(&:iso8601)
  ^^^^^^^^^^^^^^^

Thanks for blocking me from merging this catastrophic performance
issue (sarcasm)

After thanking whoever set up our CI pipeline, we might use #map! and #compact! like we’re being told to. These methods mutate the existing array, instead of creating and returning a new one.

blog_posts
  .map(&:published_at)
  .compact!
  .map!(&:iso8601)

But now the tests fail.

undefined method `map!' for nil:NilClass (NoMethodError)

Unlike #compact, which always returns an array, the #compact! method returns nil when there were no nil elements to remove. Not always, just sometimes, to keep us on our toes.
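
To see that behaviour in isolation, here is a small sketch with two throwaway arrays (the variable names are just for this example):

```ruby
without_nils = [1, 2, 3]
with_nils = [1, nil, 3]

# Nothing to remove, so #compact! returns nil.
unchanged_result = without_nils.compact!

# Something was removed, so #compact! returns the (mutated) array.
changed_result = with_nils.compact!

p unchanged_result # nil
p changed_result   # [1, 3]
```

That is how the chain above can pass with one set of test data and blow up with another: it only breaks when every post has a published_at.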

To make it work reliably, we are forced to write procedural-style code like this:

times = blog_posts.map(&:published_at)
times.compact!
times.map!(&:iso8601)

Astute readers will notice that this code looks a lot like the original OpenSSL::Cipher::AES example from earlier. We have an object assigned to a variable, which we make some modifications to, and then return it.

So, instead of converting our functional-style method chain into something that a Java programmer from the 1990s would write, we can use #tap.

blog_posts
  .map(&:published_at)
  .tap(&:compact!) # <-- this line changed
  .map!(&:iso8601)

Here, #tap is being used to always return the array object, ignoring the return value from .compact!.

Is it more readable than the original implementation? No. Is it at least better than the procedural-style implementation? Maybe. Will the performance improvement be noticeable? Unlikely. But we can rest easy in the knowledge that it allocates slightly less memory, and the only trade-offs were developer time and code readability.
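
For comparison, here is a sketch of the #tap version running against some sample data, next to the filter_map approach mentioned earlier (Ruby 2.7 or later). Post and the sample posts are hypothetical stand-ins for whatever blog_posts really contains.

```ruby
require "time" # provides Time#iso8601 on Rubies where it is not built in

# Hypothetical stand-in for a blog post model.
Post = Struct.new(:title, :published_at)

blog_posts = [
  Post.new("First", Time.utc(2021, 1, 26, 9, 0, 0)),
  Post.new("Draft", nil),
  Post.new("Second", Time.utc(2021, 2, 3, 12, 30, 0)),
]

# The #tap version: compact! mutates the array, and #tap ignores its return value.
with_tap =
  blog_posts
    .map(&:published_at)
    .tap(&:compact!)
    .map!(&:iso8601)

# The filter_map version: nil (and false) block results are dropped.
with_filter_map = blog_posts.filter_map { |post| post.published_at&.iso8601 }

puts with_tap        # 2021-01-26T09:00:00Z, then 2021-02-03T12:30:00Z
puts with_filter_map # same two lines
```

Both produce the same list, but the filter_map version does it in one pass, which is why it would be the better choice when it is available.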

Why Is It Named So Terribly?

Think of a phone call. The audio data is transmitted through various wires, exchanges, and radio waves, between the phones. Anywhere between the phones can be wiretapped to divert the audio to another listening device, without affecting the call. Sound familiar?

The word “wiretap” originates from a time when eavesdropping was done by placing an electrical tap on a literal piece of wire. What’s an electrical tap? It’s when you have an electrical circuit and you add new wiring to divert electricity. Sound familiar?

Electrical taps come from plumbing. Say you have a water pipe running through the kitchen wall to the bathroom — and while you want the pipe to continue carrying water to the bathroom, it might be convenient to divert some of that water to the kitchen too. You could hire a plumber to tap into the pipe and install a tap.

So if we have a chain of method calls and we want to divert the intermediate return values somewhere else, without affecting the chain, we might use a method called tap. See — the name isn’t that bad, after all.

I don’t think there exists an English word that would be a really good fit for this functionality. The old ActiveSupport method #returning does read better for the build and return use case, but it would read worse for the method chaining use case. Other names that were considered include with, k, then, and_then, apply, and tee (after the CLI command, which also gets its name from plumbing). But are any of these major improvements over tap? Not in my opinion.

Got questions? Comments? Milk?

Shoot an email to [email protected] or hit me up on Twitter (@tom_dalling).