How Solid Queue works under the hood

Whether or not you're active in the Rails ecosystem, you might already have heard some of the buzz around Solid Queue, a new database-backed backend for ActiveJob. Solid Queue is a simple and performant option for background jobs that lets you queue large numbers of jobs without maintaining extra dependencies like Redis.

We've already talked about how to deploy, run, and monitor Solid Queue, but we haven't yet explored how Solid Queue works. In this article, we'll explore the implementation of Solid Queue. We'll follow a Solid Queue job from start to finish by digging into what happens when you enqueue a job with ActiveJob, examining the many database tables, and exploring how jobs get executed.

How Solid Queue manages state

Solid Queue leans on several different tables to manage jobs throughout their lifecycle. It can be a lot to keep track of! Knowing what each table represents and how Solid Queue uses them to keep track of state can be helpful for debugging. To start, here's a quick list of the tables:

solid_queue_jobs
solid_queue_scheduled_executions
solid_queue_recurring_executions
solid_queue_ready_executions
solid_queue_claimed_executions
solid_queue_failed_executions
solid_queue_pauses
solid_queue_blocked_executions
solid_queue_semaphores
solid_queue_processes

Rails applications don't get these tables by default; Solid Queue creates a migration file when you run the install command:

bin/rails generate solid_queue:install
bin/rails db:migrate

The first command copies the database migrations from Solid Queue into your app's db/migrate/ directory, and the second command creates the tables in your database.

Breaking down each database table

The solid_queue_jobs table is responsible for keeping track of each job. This table will grow rather large because, by default, jobs stay in the table after completion (with a finished_at timestamp). Most of the other tables reference the jobs in this table, so they don't have to duplicate the details of each job.

Solid Queue tracks scheduled and recurring jobs using the solid_queue_scheduled_executions and solid_queue_recurring_executions tables. Dispatcher processes select the jobs and move them to the solid_queue_ready_executions table, which keeps track of jobs that should run immediately.

Solid Queue tries to keep the solid_queue_ready_executions table as small as possible, and this is by design. In fact, the author of GoodJob had some nice things to say about Solid Queue and pointed out that this design choice makes the solid_queue_ready_executions table more performant.

When a worker picks up a job to execute, it inserts a record of the job into the solid_queue_claimed_executions table so other workers running in parallel don't pull the same job. If the job fails, it creates a record of the failure in the solid_queue_failed_executions table. After completing the job, Solid Queue removes it from solid_queue_claimed_executions and updates the status in the solid_queue_jobs table.

The solid_queue_pauses table tracks whether a named queue has been "paused." Solid Queue adds a record to this table when you call Queue#pause; calling Queue#resume removes the record.
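To make the pause mechanism concrete, here is a minimal plain-Ruby sketch of the idea. It is a toy model, not Solid Queue's code: the ToyPauses class and its eligible method are hypothetical names, and a Set stands in for the solid_queue_pauses table.

```ruby
require "set"

# Toy model of queue pausing. The Set stands in for solid_queue_pauses;
# the class and method names are illustrative, not Solid Queue's API.
class ToyPauses
  def initialize
    @paused = Set.new
  end

  # Comparable to inserting a row into solid_queue_pauses.
  def pause(queue_name)
    @paused.add(queue_name)
  end

  # Comparable to deleting that row.
  def resume(queue_name)
    @paused.delete(queue_name)
  end

  def paused?(queue_name)
    @paused.include?(queue_name)
  end

  # Only jobs whose queue is not paused are eligible to be picked up.
  def eligible(ready_jobs)
    ready_jobs.reject { |job| paused?(job[:queue]) }
  end
end

pauses = ToyPauses.new
ready_jobs = [
  { name: "UserWelcomeJob", queue: "default" },
  { name: "NewsletterJob", queue: "mailers" }
]

pauses.pause("mailers")
puts pauses.eligible(ready_jobs).map { |job| job[:name] }.inspect

pauses.resume("mailers")
puts pauses.eligible(ready_jobs).map { |job| job[:name] }.inspect
```

The point of the sketch is that pausing never touches the jobs themselves. A paused queue's jobs stay where they are and simply become invisible to workers until the pause record is removed.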

Finally, Solid Queue uses the solid_queue_blocked_executions and solid_queue_semaphores tables to manage concurrency, and the solid_queue_processes table keeps track of the supervisor's processes. Processes are either workers or dispatchers, as both classes inherit from Solid Queue's Process class.
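As a rough illustration of how a semaphore and a blocked list can cooperate, here is a toy in-memory model in plain Ruby. It rests on an assumption about the general mechanism: a job that cannot acquire a slot for its concurrency key waits in the blocked list until a running job with the same key finishes. ToyConcurrency and its methods are hypothetical names; in a real app you would declare limits in the job class with limits_concurrency.

```ruby
# Toy model of concurrency limits. A counter per key plays the semaphore;
# jobs that cannot acquire a slot wait in a blocked list until one frees up.
# This is a simplified sketch, not Solid Queue's implementation.
class ToyConcurrency
  attr_reader :ready, :blocked

  def initialize(limit:)
    @limit = limit
    @in_use = Hash.new(0) # concurrency key => slots taken (solid_queue_semaphores)
    @ready = []           # stands in for solid_queue_ready_executions
    @blocked = []         # stands in for solid_queue_blocked_executions
  end

  # Try to take a slot for the job's key; block the job if none is free.
  def enqueue(job_name, key:)
    job = { name: job_name, key: key }
    if @in_use[key] < @limit
      @in_use[key] += 1
      @ready << job
    else
      @blocked << job
    end
    job
  end

  # Called when a job finishes: free its slot and unblock the next waiter.
  def finish(job)
    @ready.delete_if { |candidate| candidate.equal?(job) }
    @in_use[job[:key]] -= 1

    index = @blocked.index { |waiting| waiting[:key] == job[:key] }
    return unless index

    unblocked = @blocked.delete_at(index)
    @in_use[unblocked[:key]] += 1
    @ready << unblocked
  end
end

concurrency = ToyConcurrency.new(limit: 1)
first = concurrency.enqueue("SyncJob #1", key: "user/1")
concurrency.enqueue("SyncJob #2", key: "user/1") # same key, so it is blocked
concurrency.enqueue("SyncJob #3", key: "user/2") # different key, runs freely

puts concurrency.blocked.map { |job| job[:name] }.inspect
concurrency.finish(first)
puts concurrency.ready.map { |job| job[:name] }.inspect
```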

Lifecycle of a Solid Queue job

Now that you understand how Solid Queue uses each table to manage state, we'll trace a job from enqueue through execution.

ActiveJob enqueues the job

Let's say we have a class for a background job, UserWelcomeJob, that looks like this:

class UserWelcomeJob < ApplicationJob
  queue_as :default

  def perform(user)
    puts "Welcome, #{user.name}"
  end
end

To invoke this job asynchronously, you would call:

UserWelcomeJob.perform_later(user)

But what happens next? How does a record of the job get created, and how does the job get executed?

First, ActiveJob serializes the argument(s) passed to the #perform method. In this case, user is an ActiveRecord object, which gets serialized with GlobalID. GlobalID is pretty cool; it converts any object that responds to #to_gid into a URI, which looks like gid://your-app-name/User/1. When a worker process performs the job, it uses the URI to locate the record.
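To make the URI format concrete, here is a minimal plain-Ruby sketch that builds and parses a GlobalID-style string using only the standard library. The helpers build_gid and parse_gid are hypothetical and exist only for illustration; in a Rails app, the globalid gem does this work for you through user.to_global_id and GlobalID::Locator.locate.

```ruby
require "uri"

# Hypothetical helpers for illustration only; the real work is done by the
# globalid gem (user.to_global_id and GlobalID::Locator.locate).

# Build a GlobalID-style URI string from an app name, model name, and id.
def build_gid(app, model_name, id)
  "gid://#{app}/#{model_name}/#{id}"
end

# Split a GlobalID-style URI back into its parts.
def parse_gid(gid)
  uri = URI.parse(gid)
  raise ArgumentError, "not a gid URI: #{gid}" unless uri.scheme == "gid"

  model_name, id = uri.path.delete_prefix("/").split("/", 2)
  { app: uri.host, model_name: model_name, id: id }
end

gid = build_gid("your-app-name", "User", 1)
puts gid
puts parse_gid(gid).inspect
```

Because only this short string is stored with the job, the worker loads a fresh copy of the record when the job runs rather than a stale snapshot from enqueue time.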

Next, perform_later calls an enqueue method on the job. If your ActiveJob adapter is solid_queue, the adapter calls Solid Queue's enqueue method, and this is where things get interesting.

Solid Queue saves the job to the database

Solid Queue's Job.enqueue method is pretty straightforward:

def enqueue(active_job, scheduled_at: Time.current)
  active_job.scheduled_at = scheduled_at

  create_from_active_job(active_job).tap do |enqueued_job|
    active_job.provider_job_id = enqueued_job.id
  end
end

The enqueue method sets the job's scheduled_at attribute to the scheduled_at argument, which defaults to the current time. Then, it adds the job to the solid_queue_jobs database table and stores the new row's ID on the ActiveJob instance as provider_job_id.

SolidQueue's Job model includes SolidQueue::Job::Executable, which calls #prepare_for_execution after the job is created:

def prepare_for_execution
  if due? then dispatch
  else
    schedule
  end
end

If the job is due now, Solid Queue adds it to the solid_queue_ready_executions table; otherwise, it adds it to solid_queue_scheduled_executions, where a dispatcher process later selects the jobs that are due and moves them to the solid_queue_ready_executions table.
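The decision described above can be sketched as a small in-memory model. This is a toy under stated assumptions, not Solid Queue's code: the ToyQueue class and its dispatch_due method are hypothetical names, and two arrays stand in for the ready and scheduled tables.

```ruby
# Toy in-memory model of the due?/dispatch/schedule decision. The arrays stand
# in for database tables; none of these names are Solid Queue's actual classes.
class ToyQueue
  attr_reader :ready, :scheduled

  def initialize
    @ready = []     # stands in for solid_queue_ready_executions
    @scheduled = [] # stands in for solid_queue_scheduled_executions
  end

  # Mirrors prepare_for_execution: due jobs go straight to ready.
  def enqueue(name, scheduled_at:, now: Time.now)
    job = { name: name, scheduled_at: scheduled_at }
    if scheduled_at <= now
      @ready << job
    else
      @scheduled << job
    end
    job
  end

  # What a dispatcher does on each poll: promote scheduled jobs that are due.
  # Returns the number of jobs it moved.
  def dispatch_due(now: Time.now)
    due, @scheduled = @scheduled.partition { |job| job[:scheduled_at] <= now }
    @ready.concat(due)
    due.size
  end
end

now = Time.now
queue = ToyQueue.new
queue.enqueue("UserWelcomeJob", scheduled_at: now, now: now)
queue.enqueue("ReminderJob", scheduled_at: now + 3600, now: now)
puts queue.ready.map { |job| job[:name] }.inspect

# An hour later, the dispatcher's poll promotes the scheduled job.
queue.dispatch_due(now: now + 3600)
puts queue.ready.map { |job| job[:name] }.inspect
```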

In the case of the UserWelcomeJob we enqueued earlier, the job is due immediately, so Solid Queue adds it directly to the solid_queue_ready_executions table without involving a dispatcher.

A worker executes the job

When you start Solid Queue, it starts a supervisor process that manages both the worker and dispatcher processes:

bundle exec rake solid_queue:start

Each worker process polls the solid_queue_ready_executions table, waiting to claim a job. When a worker claims the job, it's moved to the solid_queue_claimed_executions table.
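To illustrate why claiming keeps parallel workers from running the same job twice, here is a toy model in plain Ruby. A Mutex plays the role that database locking plays in Solid Queue (its README describes using FOR UPDATE SKIP LOCKED where the database supports it). ToyClaimer and run_workers are hypothetical names, and threads stand in for worker processes.

```ruby
# Toy model of claiming: a Mutex plays the role of the database's row locking.
# Names are illustrative; this is not how Solid Queue is implemented.
class ToyClaimer
  attr_reader :claimed

  def initialize(job_ids)
    @ready = job_ids.dup # stands in for solid_queue_ready_executions
    @claimed = {}        # job id => worker name (solid_queue_claimed_executions)
    @lock = Mutex.new
  end

  # Atomically move one job from ready to claimed; returns nil when empty.
  def claim(worker_name)
    @lock.synchronize do
      job_id = @ready.shift
      @claimed[job_id] = worker_name if job_id
      job_id
    end
  end
end

# Run several worker threads against the claimer.
# Returns one array of claimed job ids per worker.
def run_workers(claimer, worker_count)
  threads = worker_count.times.map do |i|
    Thread.new do
      mine = []
      while (job_id = claimer.claim("worker-#{i}"))
        mine << job_id
      end
      mine
    end
  end
  threads.map(&:value)
end

claimer = ToyClaimer.new((1..20).to_a)
per_worker = run_workers(claimer, 3)

# No job id appears twice across workers, and none is left behind.
puts per_worker.flatten.sort == (1..20).to_a
```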

If the job executes successfully, one of two things happens:

If Solid Queue's preserve_finished_jobs config option is true (the default), it updates the row's finished_at column in the solid_queue_jobs table.
If preserve_finished_jobs is false, Solid Queue destroys the row in solid_queue_jobs.
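The two outcomes can be sketched with a hash standing in for the solid_queue_jobs table. The finish_job helper is hypothetical and only models the branch on preserve_finished_jobs; it is not Solid Queue's code.

```ruby
# Toy model of the two outcomes after a successful run. The hash stands in for
# the solid_queue_jobs table; finish_job is a hypothetical helper.
def finish_job(jobs_table, job_id, preserve_finished_jobs: true, now: Time.now)
  if preserve_finished_jobs
    jobs_table[job_id][:finished_at] = now # keep the row, stamp completion time
  else
    jobs_table.delete(job_id)              # remove the row entirely
  end
  jobs_table
end

jobs = { 1 => { class_name: "UserWelcomeJob", finished_at: nil } }
finish_job(jobs, 1)
puts jobs.inspect

jobs = { 1 => { class_name: "UserWelcomeJob", finished_at: nil } }
finish_job(jobs, 1, preserve_finished_jobs: false)
puts jobs.inspect
```

Keeping finished rows gives you a history to inspect, at the cost of a table that keeps growing, which is the trade-off behind the default.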

In the next section, we'll explore what happens if a job fails.

Handling failed jobs

When a job fails, Solid Queue adds it to the solid_queue_failed_executions table. These failed execution records stay around until the job is retried or discarded.

Mission Control lets you manually retry or discard these failed jobs. Solid Queue itself has no automatic retry mechanism, and ActiveJob doesn't retry jobs by default. If you'd like to automatically retry a job when an exception is raised, use retry_on in the job class.
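In a job class, the declaration looks like retry_on SomeError, wait: 5.seconds, attempts: 3, where SomeError is a placeholder for your own exception class. The plain-Ruby sketch below is a toy model of how that interacts with failed executions: the job is retried until its attempts run out, and only then is a failure recorded. The run_with_retries helper is hypothetical, and it retries inline, whereas ActiveJob re-enqueues the job for a later run.

```ruby
# Toy model of retry_on-style behavior: the job is retried until attempts run
# out, and only then recorded as a failed execution. Names are hypothetical.
def run_with_retries(attempts:, failed_executions:, job_name:)
  executions = 0
  begin
    executions += 1
    yield executions
  rescue StandardError => error
    retry if executions < attempts # ActiveJob would re-enqueue the job here
    failed_executions << { job: job_name, error: error.message }
    nil
  end
end

failed = []

# Succeeds on the third try, so nothing lands in the failed list.
result = run_with_retries(attempts: 5, failed_executions: failed, job_name: "FlakyJob") do |try|
  raise "timeout" if try < 3
  "ok after #{try} tries"
end
puts result
puts failed.size

# Always fails, so it is recorded as failed once its 2 attempts are used up.
run_with_retries(attempts: 2, failed_executions: failed, job_name: "BrokenJob") do
  raise "boom"
end
puts failed.inspect
```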

Of course, monitoring Solid Queue's failed executions table to see which jobs failed is not ideal. You might prefer an exception tracking service like Honeybadger to get notified immediately. Honeybadger automatically reports exceptions when Solid Queue jobs fail so that you can quickly get notified and address any issues. Honeybadger's Ruby gem includes an ActiveJob plugin that handles error reporting and monitors Solid Queue performance metrics.

Understanding the Solid Queue lifecycle

Solid Queue simplifies background jobs by eliminating the need for external dependencies and provides a transparent architecture for job lifecycle management. Still, the journey of a Solid Queue job from enqueue to completion is not straightforward.

Understanding how Solid Queue works under the hood is a superpower for debugging and improving your Rails apps in production. Now that you know how Solid Queue processes jobs and handles failures, you'll know exactly where to look the next time a job goes missing.