Creating Advanced Active Record DB Queries with Arel

Development

If you’ve used Rails, you’ve likely used Active Record as a way to access and record database transactions in a more Ruby-like manner. Active Record goes a long way in simplifying the use of a database. But more then that, it’s designed in such a way that we don’t have to worry about the particulars for working with specific databases. The mapping for each database is stored and handled within Active Record.

Under the hood, Active Record has its own private API to build database SQL queries called Arel. Arel is a quite powerful SQL AST manager that lets you appropriately combine selection statements for simple to very complicated queries.

However, as it is a private API, you should be cautioned that using it has a cost of potential breaking changes should you upgrade Rails; Arel seems to change its major version each Rails release. To help resolve breaking changes, you may simply write your own predicate methods. We’ll cover that toward the end.

One nice thing about Arel is you can work with very complex queries without having to write, or learn, SQL. But when you can afford the time, learning SQL will definitely be advantageous.

The Basics of Arel

Every Active Record model has arel_table on it, which allows you to build your AST for that model.

my_table = Address.arel_table

One nice thing about Arel is that it never touches the database itself, so you can construct custom queries even if the model doesn’t exist by defining your own Arel table. It’s not until you pass it in as a parameter to the where method on the model’s class that it queries the database.

my_table = Arel::Table.new(:addresses) # Not recommended usage in Rails 5.0/Arel 7.1

To start building your query, you chain methods of column names and use predications of their values. Continuing from either of the examples above:

Address.where(my_table[:id].eq(1)).to_sql
# SELECT "addresses".* FROM "addresses"  WHERE "addresses"."id" = 1
Address.where(id: 1).to_sql
# SELECT "addresses".* FROM "addresses"  WHERE "addresses"."id" = 1

Address.where(my_table[:id].in([1,2,3])).to_sql
# SELECT "addresses".* FROM "addresses"  WHERE "addresses"."id" IN (1, 2, 3)
Address.where(id: [1,2,3]).to_sql
# SELECT "addresses".* FROM "addresses"  WHERE "addresses"."id" IN (1, 2, 3)

As you can see, the same queries are getting built in the where as from the Arel table. The where method on the Active Record models is a very powerful method that can take many kinds of arguments. For this post, we’ll be focusing mainly on passing in custom-built Arel queries.

Note that if you used Arel::Table.new, Rails 5.0 with Arel 7.1 raises a depreciation message:

Arel performing automatic type casting is deprecated and will be removed in Arel 8.0. If you are seeing this, it is because you are manually passing a value to an Arel predicate, and the Arel::Table object was constructed manually. The easiest way to remove this warning is to use an Arel::Table object returned from calling arel_table on an ActiveRecord::Base subclass.

If you’re certain the value is already of the right type, change attribute.eq(value) to attribute.eq(Arel::Nodes::Quoted.new(value)). (You will be able to remove that in Arel 8.0; it is only required to silence this deprecation warning.)

You can also silence this warning globally by setting $arel_silence_type_casting_deprecation to true. (Do NOT do this if you are a library author.)

If you are passing user input to a predicate, you must either give an appropriate type caster object to the Arel::Table or manually cast the value before passing it to Arel.

So as long as you manually type cast values, you may still use Arel::Table.new

For a list of the predications you can use, you can call instance_methods on Arel::Predications.

# Rails 4.1 (Arel 5.0)
Arel::Predications.instance_methods
[:in, :matches, :eq, :lt, :not_eq, :not_eq_any, :not_eq_all, :eq_any,
:eq_all, :in_any, :in_all, :not_in, :not_in_any, :not_in_all, :matches_any
:matches_all, :does_not_match, :does_not_match_any, :does_not_match_all,
:gteq, :gteq_any, :gteq_all, :gt, :gt_any, :gt_all, :lt_any, :lt_all, :lteq,
:lteq_any, :lteq_all, :eql_any]

# Rails 4.2 (Arel 6.0) adds the following
[:between, :not_between]

# Rails 5.0 (Arel 7.1) adds the following

Most of these predication methods are self-descriptive. lt is for less-than and gteq is for greater-than or equal-to. So figuring out how to use them is pretty straightforward.

Address.where( Address.arel_table[:created_at].lt( 5.days.ago ) ).to_sql
# SELECT "addresses".* FROM "addresses"  WHERE ("addresses"."created_at" < '2016-12-13 03:31:23.911914')

To chain more than one kind of query, you may use instance methods from Arel::Nodes::Node.

Arel::Nodes::Node.instance_methods - Enumerable.instance_methods - Object.methods
# => [:each, :to_sql, :not, :or, :and, :lower, :create_true, :create_false,
# ...

The most common methods you’ll use from this are not, or, and and.

veg = Arel::Table.new(:vegetables)

query = veg[:created_at].gteq( 5.days.ago ).and(
  veg[:color].eq("green").or(
    veg[:gardener].eq("Susan")
  )
)

query.to_sql
#  "vegetables"."created_at" >= '2016-12-13 03:54:28.575342'
#    AND ("vegetables"."color" = 'green' OR "vegetables"."gardener" = 'Susan')

Vegetable.where( query )

The rest of the SQL AST methods you might use are found on Arel::SelectManager.

Arel::SelectManager.instance_methods - Object.methods
# => [:take, :join, :union, :source, :offset, :skip, :group, :lock, :from, :limit,
#     :on, :with, :order, :locked, :exists, :except, :orders, :froms, :distinct, :as,
#     :outer_join, :project, :having, :projections, :projections=, :limit=, :offset=,
# ...

Here you have everything you’ll need for joining tables in the database along with limitations and ordering. You can find examples for this, and much more, in Arel’s GitHub readme.

Rails 5 has added the or method on regular Active Record where queries, but the same or queries have been with Arel for quite some time. The behavior for or is the same for both implementations.

The Cost of Using a Private API

Using a private API comes at the cost of breaking changes when upgrading. One place I felt this personally was when I upgraded from Rails 4.1 to 4.2. Arel has gone through changes quite often and will likely keep up this trend. In this particular upgrade, the behavior for Arel::Predication#eq_any changed to no longer accept an empty collection. So when code like this worked in 4.1 (a contrived example):

current_events = Event.current.ids

events = Arel::Table.new("events")
query = events[:user_id].eq(1).or(
  events[:id].eq_any(current_events)
)

Event.where( query )
# Rails 4.1 Success!
# Rails 4.2: RuntimeError: unsupported: NilClass
# Rails 5.0: Arel::Visitors::UnsupportedVisitError: Unsupported argument type: NilClass. Construct an Arel node instead.

It would crash in 4.2 whenever there were no current events. Rails 5 was nice enough to give us a clearer error message for what was expected. So this is the price you will likely pay should you depend on a private API and still plan to upgrade.

The good news is that it’s not that hard to look into the source code and write our own method that won’t change. This will be more far more stable as only a major rewrite of Arel internals can break it.

First we need to look at the behavior of Rails 4.1’s Arel.

a = Arel::Table.new(:things)

a[:id].eq_any([]).to_sql
# (NULL)
a[:id].eq_any([1,2]).to_sql
# ("things"."id" = 1 OR "things"."id" = 2)

We can see that the only SQL string generated from an empty collection to eq_any is “(NULL)”. This is the only behavior that changed, so we just need to implement that string in the same way Arel would. And with Arel, that’s a SQL literal.

# Original eq_any method
def eq_any others
  grouping_any :eq, others
end

# Our custom equal any method
module ::Arel
  module Predications
    def eql_any others
      if others.length.zero?
        Nodes::SqlLiteral.new "(NULL)"
      else
        grouping_any :eq, others
      end
    end
  end
end

Now we change our code to use eql_any instead of eq_any, and we will likely keep the same behavior from now on.

An alternative way of working with the earlier eq_any example is to use in instead. This will produce the same results desired in this situation, and the behavior hasn’t changed thus far from Rails 4.1 through 5.0. When an empty collection is given, it uses an SQL literal of 1=0, which since it is a false statement will not return any additional data from the query. This is the desired behavior from eq_any.

query = events[:user_id].eq(1).or(
  events[:id].in(current_events)
)

Digging Further into Arel

A couple of years ago, Cameron Dutro gave an excellent Rails Conference talk called Advanced aRel: When ActiveRecord Just Isn’t Enough, which I recommend that you take the time to watch. He’s also written a gem to get a lot more Arel code accomplished with less in the arel-helpers gem.

And lastly, if you have an SQL query you’d like to convert over to Arel, he’s written a web service that will do that for you at scuttle.io. As mentioned earlier, many examples of using Arel are available in their own readme.

Summary

This article has only lightly touched on Arel. Over time the recommended ways to use it have changed and I suspect will continue to change. Regardless of these changes, I fully believe the rewards of using Arel far outweigh the costs. Whether you’re a professional with SQL or have no knowledge of it whatsoever, Arel is an excellent AST to get you quickly building even the most complicated queries.

Arel is definitely worth taking a small amount of time to get the hang of. With it, you can remove clusters of string SQL queries that make for difficult legibility, building-upon, and testing, and now you can have clarity and ease of use all around.

Subscribe via Email

Over 60,000 people from companies like Netflix, Apple, Spotify and O'Reilly are reading our articles.
Subscribe to receive a weekly newsletter with articles around Continuous Integration, Docker, and software development best practices.



We promise that we won't spam you. You can unsubscribe any time.

Join the Discussion

Leave us some comments on what you think about this topic or if you like to add something.

  • estum

    But just with pure SQL this will be much more easier, shorter and [human-]readable:


    Vegetable.where("created_at >= ? AND (color = ? OR gardener = ?)", 5.days.ago, 'green', 'Susan')

    then arel constructions:


    veg = Arel::Table.new(:vegetables)

    query = veg[:created_at].gteq( 5.days.ago ).and(
    veg[:color].eq("green").or(
    veg[:gardener].eq("Susan")
    )
    )

    Vegetable.where( query )

    So why do overwork?!

    • The main point is to understand how it works. Also using Arel is database agnostic so you don’t have to worry about differences between databases. And an even better use for it is to write code builders or generators that can dynamically create queries.

      Николай has also given you a good response here.

  • Николай

    @estum:disqus
    The following queries:

    scope :created_after, ->(date) { arel_table[:created_at].gteq date }
    scope :color_or_gardener, ->(co, ga) { arel_table[:color].eq(co).or(arel_table[:gardener].eq(ga)) }

    can be useful in case of you need to use those queries alone or together

    vegetables = Vegetable.created_after(Date.today)
    vegetables.color_or_gardener(“greeen”, “Susan”)

    It’s more the Rails way, when naming convention (scopes) is over configuration (sql code)