Resque

Resque (pronounced like "rescue") is a Redis-backed library for creating background jobs, placing those jobs on multiple queues, and processing them later.

Background jobs can be any Ruby class or module that responds to perform. Your existing classes can easily be converted to background jobs or you can create new classes specifically to do work. Or, you can do both.

Resque is heavily inspired by DelayedJob (which rocks) and comprises three parts:

A Ruby library for creating, querying, and processing jobs
A Rake task for starting a worker which processes jobs
A Sinatra app for monitoring queues, jobs, and workers

Resque workers can be distributed between multiple machines, support priorities, are resilient to memory bloat / "leaks," are optimized for REE (but work on MRI and JRuby), tell you what they're doing, and expect failure.

Resque queues are persistent; support constant time, atomic push and pop (thanks to Redis); provide visibility into their contents; and store jobs as simple JSON packages.

The Resque frontend tells you what workers are doing, what workers are not doing, what queues you're using, and what's in those queues. It also provides general usage stats and helps you track failures.

The Blog Post

For the backstory, philosophy, and history of Resque's beginnings, please see the blog post.

Overview

Resque allows you to create jobs and place them on a queue, then, later, pull those jobs off the queue and process them.

Resque jobs are Ruby classes (or modules) which respond to the perform method. Here's an example:

class Archive
  @queue = :file_serve

  def self.perform(repo_id, branch = 'master')
    repo = Repository.find(repo_id)
    repo.create_archive(branch)
  end
end

The @queue class instance variable determines which queue Archive jobs will be placed in. Queues are arbitrary and created on the fly - you can name them whatever you want and have as many as you want.

To place an Archive job on the file_serve queue, we might add this to our application's pre-existing Repository class:

class Repository
  def async_create_archive(branch)
    Resque.enqueue(Archive, self.id, branch)
  end
end

Now when we call repo.async_create_archive('masterbrew') in our application, a job will be created and placed on the file_serve queue.

Later, a worker will run something like this code to process the job:

klass, args = Resque.reserve(:file_serve)
klass.perform(*args) if klass.respond_to? :perform

Which translates to:

Archive.perform(44, 'masterbrew')

Let's start a worker to run file_serve jobs:

$ cd app_root
$ QUEUE=file_serve rake resque:work

This starts one Resque worker and tells it to work off the file_serve queue. As soon as it's ready it'll try to run the Resque.reserve code snippet above and process jobs until it can't find any more, at which point it will sleep for a small period and repeatedly poll the queue for more jobs.

Workers can be given multiple queues (a "queue list") and run on multiple machines. In fact they can be run anywhere with network access to the Redis server.
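
To make the enqueue / reserve / perform cycle above concrete, here is a minimal sketch that runs without Redis. ToyQueue is a hypothetical in-memory stand-in written for this example, not part of Resque, and the simplified Archive job just returns a string instead of touching a repository. It only illustrates the flow of a job from class and arguments, to JSON, and back.

```ruby
require 'json'

# Toy in-memory stand-in for Resque's Redis-backed queues (hypothetical, illustration only).
module ToyQueue
  QUEUES = Hash.new { |hash, name| hash[name] = [] }

  # Serialize the job as JSON and push it onto the queue named by the class's @queue.
  def self.enqueue(klass, *args)
    queue = klass.instance_variable_get(:@queue)
    QUEUES[queue] << JSON.generate('class' => klass.to_s, 'args' => args)
  end

  # Pop the oldest job off the queue and return [class, args], or nil if the queue is empty.
  def self.reserve(queue)
    payload = QUEUES[queue].shift
    return nil unless payload

    job = JSON.parse(payload)
    [Object.const_get(job['class']), job['args']]
  end
end

# Simplified version of the Archive job: it reports what it would do.
class Archive
  @queue = :file_serve

  def self.perform(repo_id, branch = 'master')
    "archived repo #{repo_id} at #{branch}"
  end
end

ToyQueue.enqueue(Archive, 44, 'masterbrew')

# What a worker does, in miniature: reserve a job, then call perform with its args.
klass, args = ToyQueue.reserve(:file_serve)
result = klass.perform(*args) if klass.respond_to?(:perform)
```

The real library stores the payload in Redis rather than a Ruby Hash, which is what lets workers on other machines pick the job up.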

Jobs

What should you run in the background? Anything that takes any time at all. Slow INSERT statements, disk manipulation, data processing, etc.

At GitHub we use Resque to process the following types of jobs:

Warming caches
Counting disk usage
Building tarballs
Building Rubygems
Firing off web hooks
Creating events in the db and pre-caching them
Building graphs
Deleting users
Updating our search index

As of writing we have about 35 different types of background jobs.

Keep in mind that you don't need a web app to use Resque - we just mention "foreground" and "background" because they make conceptual sense. You could easily be spidering sites and sticking data which needs to be crunched later into a queue.

Persistence

Jobs are persisted to queues as JSON objects. Let's take our Archive example from above. We'll run the following code to create a job:

repo = Repository.find(44)
repo.async_create_archive('masterbrew')

The following JSON will be stored in the file_serve queue:

{
    "class": "Archive",
    "args": [ 44, "masterbrew" ]
}

Because of this your jobs must only accept arguments that can be JSON encoded.

So instead of doing this:

Resque.enqueue(Archive, self, branch)

do this:

Resque.enqueue(Archive, self.id, branch)

This is why our above example (and all the examples in examples/) uses object IDs instead of passing around the objects.

While this is less convenient than just sticking a marshaled object in the database, it gives you a slight advantage: your jobs will be run against the most recent version of an object because they need to pull from the DB or cache.

If your jobs were run against marshaled objects, they could potentially be operating on a stale record with out-of-date information.
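
A quick way to see what the JSON restriction means in practice is to round-trip a payload with Ruby's standard json library. The sketch below is only an illustration; round_trip is a hypothetical helper, not a Resque method. It shows that integers and strings survive unchanged, while symbols and symbol hash keys come back as strings.

```ruby
require 'json'

# Hypothetical helper: encode a job payload as JSON and decode it again,
# the way a payload would look after a trip through the queue.
def round_trip(class_name, args)
  JSON.parse(JSON.generate('class' => class_name, 'args' => args))
end

decoded = round_trip('Archive', [44, 'masterbrew', :high, { force: true }])

# Integers and strings are preserved; the symbol :high and the key :force
# are now plain strings, so perform must not rely on receiving symbols.
decoded_args = decoded['args']
```

If a job needs symbols or richer objects, it has to rebuild them inside perform from the plain values it receives.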

send_later / async

Want something like DelayedJob's send_later or the ability to use instance methods instead of just class methods for jobs? See the examples/ directory for goodies.

We plan to provide first class async support in a future release.

Failure

If a job raises an exception, it is logged and handed off to the Resque::Failure module. Failures are logged either locally in Redis or using some different backend.

For example, Resque ships with Airbrake support. To configure it, put the following into an initialisation file or into your rake job:

# send errors which occur in background jobs to redis and airbrake
require 'resque/failure/multiple'
require 'resque/failure/redis'
require 'resque/failure/airbrake'

Resque::Failure::Multiple.classes = [Resque::Failure::Redis, Resque::Failure::Airbrake]
Resque::Failure.backend = Resque::Failure::Multiple

Keep this in mind when writing your jobs: you may want to throw exceptions you would not normally throw in order to assist debugging.

Workers

Resque workers are rake tasks that run forever. They basically do this:

start
loop do
  if job = reserve
    job.process
  else
    sleep 5 # Polling frequency = 5
  end
end
shutdown

Starting a worker is simple. Here's our example from earlier:

$ QUEUE=file_serve rake resque:work

By default Resque won't know about your application's environment. That is, it won't be able to find and run your jobs - it needs to load your application into memory.

If we've installed Resque as a Rails plugin, we might run this command from our RAILS_ROOT:

$ QUEUE=file_serve rake environment resque:work

This will load the environment before starting a worker. Alternatively, we can define a resque:setup task with a dependency on the environment rake task:

task "resque:setup" => :environment

GitHub's setup task looks like this:

task "resque:setup" => :environment do
  Grit::Git.git_timeout = 10.minutes
end

We don't want the git_timeout as high as 10 minutes in our web app, but in the Resque workers it's fine.

Logging

Workers support basic logging to STDOUT. If you start them with the VERBOSE env variable set, they will print basic debugging information. You can also set the VVERBOSE (very verbose) env variable.

$ VVERBOSE=1 QUEUE=file_serve rake environment resque:work

If you want Resque to log to a file, in Rails do:

# config/initializers/resque.rb
Resque.logger = Logger.new(Rails.root.join('log', "#{Rails.env}_resque.log"))

Process IDs (PIDs)

There are scenarios where it's helpful to record the PID of a Resque worker process. Use the PIDFILE option for easy access to the PID:

$ PIDFILE=./resque.pid QUEUE=file_serve rake environment resque:work

Running in the background

Only supported with Ruby >= 1.9. There are scenarios where it's helpful for the Resque worker to run itself in the background (usually in combination with PIDFILE). Use the BACKGROUND option so that rake will return as soon as the worker is started.

$ PIDFILE=./resque.pid BACKGROUND=yes QUEUE=file_serve \
    rake environment resque:work

Polling frequency

You can pass an INTERVAL option, a float giving the polling interval in seconds. The default is 5 seconds, but for a semi-active app you may want to use a smaller value.

$ INTERVAL=0.1 QUEUE=file_serve rake environment resque:work

Priorities and Queue Lists

Resque doesn't support numeric priorities but instead uses the order of queues you give it. We call this list of queues the "queue list."

Let's say we add a warm_cache queue in addition to our file_serve queue. We'd now start a worker like so:

$ QUEUES=file_serve,warm_cache rake resque:work

When the worker looks for new jobs, it will first check file_serve. If it finds a job, it'll process it then check file_serve again. It will keep checking file_serve until no more jobs are available. At that point, it will check warm_cache. If it finds a job it'll process it then check file_serve (repeating the whole process).
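
The checking order described above can be sketched in a few lines of plain Ruby. This is a simplified model, not Resque's code: reserve_from and the in-memory queues Hash are hypothetical stand-ins for the Redis-backed queues, and the job names are made up.

```ruby
# Return [queue_name, job] for the first non-empty queue in the list, or nil.
# `queues` is a hypothetical in-memory Hash standing in for Redis lists.
def reserve_from(queue_list, queues)
  queue_list.each do |name|
    job = queues.fetch(name, []).shift
    return [name, job] if job
  end
  nil
end

queues = {
  'file_serve' => %w[archive_a archive_b],
  'warm_cache' => %w[warm_a warm_b]
}
queue_list = %w[file_serve warm_cache]

order = []
while (reserved = reserve_from(queue_list, queues))
  order << reserved.last
  # A new file_serve job arriving mid-run jumps ahead of the remaining warm_cache jobs.
  queues['file_serve'] << 'archive_c' if reserved.last == 'warm_a'
end
# order is now: archive_a, archive_b, warm_a, archive_c, warm_b
```

Because every reservation starts again from the front of the list, a queue later in the list is only served while all earlier queues are empty.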

In this way you can prioritize certain queues. At GitHub we start our workers with something like this:

$ QUEUES=critical,archive,high,low rake resque:work

Notice the archive queue - it is specialized and in our future architecture will only be run from a single machine.

At that point we'll start workers on our generalized background machines with this command:

$ QUEUES=critical,high,low rake resque:work

And workers on our specialized archive machine with this command:

$ QUEUE=archive rake resque:work

Running All Queues

If you want your workers to work off of every queue, including new queues created on the fly, you can use a splat:

$ QUEUE=* rake resque:work

Queues will be processed in alphabetical order.

Running Multiple Workers

At GitHub we use god to start and stop multiple workers. A sample god configuration file is included under examples/god. We recommend this method.

If you'd like to run multiple workers in development mode, you can do so using the resque:workers rake task:

$ COUNT=5 QUEUE=* rake resque:workers

This will spawn five Resque workers, each in its own process. Hitting ctrl-c should be sufficient to stop them all.

Forking

On certain platforms, when a Resque worker reserves a job it immediately forks a child process. The child processes the job then exits. When the child has exited successfully, the worker reserves another job and repeats the process.

Why?

Because Resque assumes chaos.

Resque assumes your background workers will lock up, run too long, or have unwanted memory growth.

If Resque workers processed jobs themselves, it'd be hard to whip them into shape. Let's say one is using too much memory: you send it a signal that says "shutdown after you finish processing the current job," and it does so. It then starts up again - loading your entire application environment. This adds useless CPU cycles and causes a delay in queue processing.

Plus, what if it's using too much memory and has stopped responding to signals?

Thanks to Resque's parent / child architecture, jobs that use too much memory release that memory upon completion. No unwanted growth.

And what if a job is running too long? You'd need to kill -9 it then start the worker again. With Resque's parent / child architecture you can tell the parent to forcefully kill the child then immediately start processing more jobs. No startup delay or wasted cycles.

The parent / child architecture helps us keep tabs on what workers are doing, too. By eliminating the need to kill -9 workers we can have parents remove themselves from the global listing of workers. If we just ruthlessly killed workers, we'd need a separate watchdog process to add and remove them to the global listing - which becomes complicated.

Workers instead handle their own state.
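
As a rough illustration of the parent / child idea, the sketch below runs a block of work in a forked child and has the parent wait for it. It is a simplified model rather than Resque's implementation: run_in_child is a hypothetical helper, and it assumes a platform where Kernel#fork is available.

```ruby
# Run one unit of work in a forked child and return the child's Process::Status.
# Hypothetical helper for illustration; requires a platform that supports fork.
def run_in_child
  pid = fork do
    begin
      yield
      exit!(0) # leave immediately, skipping at_exit handlers inherited from the parent
    rescue StandardError
      exit!(1) # report failure to the parent through the exit status
    end
  end
  _, status = Process.wait2(pid)
  status
end

# Memory allocated by the job lives in the child and is released when the child exits.
status = run_in_child { Array.new(100_000) { |i| i.to_s } }
job_succeeded = status.success?
```

The parent never runs the job itself, so it stays small and responsive no matter what the job does to its own process.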

Parents and Children

Here's a parent / child pair doing some work:

$ ps -e -o pid,command | grep [r]esque
92099 resque: Forked 92102 at 1253142769
92102 resque: Processing file_serve since 1253142769

You can clearly see that process 92099 forked 92102, which has been working since 1253142769.

(By advertising the time they began processing you can easily use monit or god to kill stale workers.)
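
For example, a monitoring script could pick out stale children from that ps output. The sketch below is one possible approach, assuming process titles in exactly the format shown above; stale_children and its max_age threshold are hypothetical names chosen for this example.

```ruby
# Matches lines such as "92102 resque: Processing file_serve since 1253142769".
PROCESSING_LINE = /\A\s*(\d+)\s+resque: Processing \S+ since (\d+)/

# Hypothetical helper: given `ps -e -o pid,command` output, return the PIDs of
# children that have been processing for more than max_age seconds.
def stale_children(ps_output, now:, max_age:)
  ps_output.each_line.map do |line|
    match = PROCESSING_LINE.match(line)
    next unless match

    pid = match[1].to_i
    started_at = match[2].to_i
    pid if now - started_at > max_age
  end.compact
end

ps_output = <<~PS
  92099 resque: Forked 92102 at 1253142769
  92102 resque: Processing file_serve since 1253142769
PS

# Eleven minutes after the job started, with a ten-minute limit, 92102 is stale.
stale = stale_children(ps_output, now: 1_253_142_769 + 660, max_age: 600)
```

A real monitor would pass Time.now.to_i as now and then decide what to do with each PID it gets back.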

When a parent process is idle, it lets you know what queues it is waiting for work on:

$ ps -e -o pid,command | grep [r]esque
92099 resque: Waiting for file_serve,warm_cache

Signals

Resque workers respond to a few different signals:

QUIT - Wait for child to finish processing then exit
TERM / INT - Immediately kill child then exit
USR1 - Immediately kill child but don't exit
USR2 - Don't start to process any new jobs
CONT - Start to process new jobs again after a USR2

If you want to gracefully shut down a Resque worker, use QUIT.

If you want to kill a stale or stuck child, use USR1. Processing will continue as normal unless the child was not found. In that case Resque assumes the parent process is in a bad state and shuts down.

If you want to kill a stale or stuck child and shut down, use TERM.

If you want to stop processing jobs, but want to leave the worker running (for example, to temporarily alleviate load), use USR2 to stop processing, then CONT to start it again.
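
To show how a long-running Ruby process can react to signals like these, here is a minimal sketch using Signal.trap. ToyWorker is a hypothetical class written for this example and only flips flags; it does not reproduce Resque's actual handlers or its child-killing behaviour.

```ruby
# Hypothetical worker state, for illustration only.
class ToyWorker
  attr_reader :paused, :shutdown

  def initialize
    @paused = false
    @shutdown = false
  end

  # Install handlers that only set flags; the work loop would check them between jobs.
  def register_signal_handlers
    Signal.trap('USR2') { @paused = true }   # stop picking up new jobs
    Signal.trap('CONT') { @paused = false }  # resume after a USR2
    Signal.trap('QUIT') { @shutdown = true } # finish the current job, then exit
  end
end

worker = ToyWorker.new
worker.register_signal_handlers
```

From a shell you would then send, say, kill -USR2 followed by the worker's PID to pause it and kill -CONT to resume. Keeping handlers down to setting a flag is a common design choice, since little else is safe to do inside a trap block.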

Mysql::Error: MySQL server has gone away

If your workers remain idle for too long they may lose their MySQL connection. Depending on your version of Rails, we recommend the following:

Rails 3.x

In your perform method, add the following line:

class MyTask
  def self.perform
    ActiveRecord::Base.verify_active_connections!
    # rest of your code
  end
end

The Rails doc says the following about verify_active_connections!:

Verify active connections and remove and disconnect connections associated with stale threads.

Rails 4.x

In your perform method, instead of verify_active_connections!, use:

class MyTask
  def self.perform
    ActiveRecord::Base.clear_active_connections!
    # rest of your code
  end
end

From the Rails docs on clear_active_connections!:

Returns any connections in use by the current thread back to the pool, and also returns connections to the pool cached by threads that are no longer alive.

The Front End

Resque comes with a Sinatra-based front end for seeing what's up with your queue.

Standalone

If you've installed Resque as a gem, running the front end standalone is easy:

$ resque-web

It's a thin layer around rackup so it's configurable as well:

$ resque-web -p 8282

If you have a Resque config file you want evaluated just pass it to the script as the final argument:

$ resque-web -p 8282 rails_root/config/initializers/resque.rb
