How to Build Powerful Search for Your Ruby on Rails App

Search is an essential feature for many web applications. A study by the Nielsen Norman Group found that more than half of web users rely on search as their primary means of navigation. For e-commerce sites, up to 30% of traffic lands directly on the search results page.

As a full-stack Rails developer, you need to know how to implement high-quality search that helps your users find what they need quickly and easily. In this guide, we'll walk through the process of adding search to a Rails application, step by step.

We'll cover basic search techniques using SQL queries and ActiveRecord, as well as more advanced solutions using the open-source Elasticsearch search engine. By the end of this post, you'll have all the knowledge you need to build fast, relevant search for your Rails app.

Basic Search Using SQL and ActiveRecord

If your app has a relatively small amount of data and simple search requirements, you can get by with basic SQL queries using ActiveRecord.

Let's say you have a Product model and you want to allow users to search for products by name. Here's how you could implement that in your controller:

class ProductsController < ApplicationController
  def index
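    if params[:query].present?
      # Escape % and _ in the user's input so they are matched literally.
      term = Product.sanitize_sql_like(params[:query])
      @products = Product.where("name ILIKE ?", "%#{term}%")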
    else
      @products = Product.all
    end
  end
end

And in your view:

<%= form_with(url: products_path, method: :get, local: true) do |form| %>
  <%= form.label :query, "Search for:" %>
  <%= form.text_field :query %>
  <%= form.submit "Search" %>
<% end %>

<ul>
  <% @products.each do |product| %>
    <li><%= product.name %></li>
  <% end %>
</ul>

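This uses ActiveRecord's where method to find products whose name contains the search query. ILIKE is PostgreSQL's case-insensitive version of LIKE, and the % wildcards on either side of the term allow partial matches. Other databases do not support ILIKE; there you would use LIKE, whose case sensitivity depends on the database and its collation.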
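
One detail worth spelling out: any % or _ the user types is itself a LIKE wildcard, so a search for "100%" can match more than intended. Rails ships sanitize_sql_like for this. The plain-Ruby sketch below mimics the idea so it can run outside Rails; escape_like and like_pattern are hypothetical helper names, not Rails API, and the backslash escape character is an assumption that matches PostgreSQL's default.

```ruby
# Hypothetical helpers (not part of Rails): escape LIKE wildcards in user
# input, then wrap the result in %...% for a partial match.
def escape_like(term, escape_char = "\\")
  # Escape the escape character itself plus the two LIKE wildcards.
  term.gsub(/[#{Regexp.escape(escape_char)}%_]/) { |ch| escape_char + ch }
end

def like_pattern(term)
  "%#{escape_like(term)}%"
end

puts like_pattern("100%")       # prints %100\%%
puts like_pattern("snake_case") # prints %snake\_case%
```

In a Rails app the same pattern string would be passed as the bind value, for example Product.where("name ILIKE ?", like_pattern(params[:query])).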

You can expand the search to additional fields by combining where clauses with or (chaining where calls on their own would AND the conditions together):

term = "%#{Product.sanitize_sql_like(params[:query])}%"
@products = Product.where("name ILIKE ?", term)
                   .or(Product.where("description ILIKE ?", term))

For more advanced querying, you can use other ActiveRecord methods like order, limit, offset, group, having, and more. Consult the Active Record Query Interface guide for full details on what's possible.

However, this basic approach has some limitations:

  1. Lack of relevance ranking. With simple string matching, every result is scored equally. There's no way to indicate that some matches are more relevant than others.

  2. Slow performance for large data sets. A pattern with a leading % wildcard generally can't use an ordinary B-tree index, so the database has to scan every row. As the number of records grows into the hundreds of thousands or millions, that becomes too slow for real-time usage.

  3. Inflexible querying. SQL LIKE queries only allow for simple partial string matching. You can't easily handle more complex needs like phrase matching or fuzzy matching.

  4. No support for language-specific features. Things like word stemming, synonyms, and stop word removal require database-specific full-text features or additional libraries.

While basic SQL search may suffice for some applications, many will outgrow it and need a more robust solution using a dedicated search engine like Elasticsearch.

Full-Text Search with Elasticsearch

Elasticsearch is an open-source search and analytics engine built on the Apache Lucene library. It enables fast full-text search across large data sets in near real-time.

Some key features that make Elasticsearch a great fit for search in a Rails app:

  • Scalable, distributed architecture that can handle billions of documents
  • Powerful query DSL for fine-grained searches
  • Relevance scoring and boosting
  • Flexible mapping and analysis to handle different data types and languages
  • Highlighting, suggestions, and aggregations
  • Easy integration with Rails via the elasticsearch-rails gem

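Here's how to set up Elasticsearch in a Rails app:

First, install and run Elasticsearch on your machine or server. On a Mac with Homebrew, it can be as simple as the following (the exact formula name depends on your Homebrew setup and the Elasticsearch version you want):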

brew install elasticsearch
brew services start elasticsearch

Next, add the elasticsearch-rails and elasticsearch-model gems to your Gemfile:

gem 'elasticsearch-model'
gem 'elasticsearch-rails'

Run bundle install to install the new gems.

Include the Elasticsearch::Model functionality in the models you want to search:

class Product < ApplicationRecord
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end

This will extend your model with Elasticsearch-related methods and automatically sync any changes to the model with the Elasticsearch index.

To customize your model's Elasticsearch mapping, create a new file app/models/concerns/searchable.rb:

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    settings index: { number_of_shards: 1 } do
      mappings dynamic: false do
        indexes :name, type: :text, analyzer: :english
        indexes :description, type: :text, analyzer: :english
      end
    end
  end
end

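This defines custom mappings for the name and description fields with an English-language analyzer. Because the concern already includes Elasticsearch::Model and its callbacks, it replaces the two include lines from the previous step. Include this concern in your model: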

class Product < ApplicationRecord
  include Searchable
end
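
It helps to picture what an analyzer such as english does to text before it reaches the index: it splits the text into tokens, lowercases them, drops stop words, and reduces words to a stem. The following is a toy version in plain Ruby, not Elasticsearch's implementation. The stop word list and the suffix-stripping rule are simplifying assumptions, and analyze and naive_stem are hypothetical names.

```ruby
# Toy analyzer, for illustration only. Elasticsearch's real "english"
# analyzer uses a proper stemmer and a longer stop word list.
STOP_WORDS = %w[a an and the of for to in].freeze

def naive_stem(token)
  # Crude suffix stripping that keeps at least three leading characters;
  # an assumption standing in for a real stemming algorithm.
  token.sub(/(?<=\w{3})(ing|ed|s)\z/, "")
end

def analyze(text)
  text.downcase
      .scan(/[a-z0-9]+/)                      # tokenize on non-alphanumerics
      .reject { |t| STOP_WORDS.include?(t) }  # drop stop words
      .map { |t| naive_stem(t) }              # reduce to a rough stem
end

p analyze("Running Shoes for the Trail") # prints ["runn", "shoe", "trail"]
p analyze("trail running shoe")          # prints ["trail", "runn", "shoe"]
```

Both inputs reduce to the same set of terms, which is why a search for "running shoe" can match a product named "Running Shoes" once the field is analyzed.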

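Once your mappings are set up, use the import method to index your existing data. The index has to exist before you import into it; passing force: true creates it from your mappings, deleting any existing index of the same name first:

Product.import(force: true)

This will send all your product records to the Elasticsearch server to be indexed. Records are read from the database in batches (1,000 at a time by default), so the whole table is never loaded into memory at once. For large data sets, you can tune the batch size:

Product.import(batch_size: 500)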
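
Conceptually, a batched import slices the records into groups and turns each group into one bulk request. The sketch below shows that shape in plain Ruby. ProductRow, bulk_actions, and each_bulk_batch are hypothetical names used for illustration; in a real app the gem does this work for you.

```ruby
# Hypothetical record type standing in for an Active Record model.
ProductRow = Struct.new(:id, :name, :description)

# Build the action list for one bulk request, in the hash shape the
# elasticsearch-ruby client accepts for bulk(body: ...).
def bulk_actions(rows)
  rows.map do |row|
    { index: { _id: row.id, data: { name: row.name, description: row.description } } }
  end
end

# Yield one action list per batch instead of building one giant request.
def each_bulk_batch(rows, batch_size: 1000)
  rows.each_slice(batch_size) { |slice| yield bulk_actions(slice) }
end

rows = (1..5).map { |i| ProductRow.new(i, "Product #{i}", "Description #{i}") }
each_bulk_batch(rows, batch_size: 2) do |actions|
  puts "#{actions.size} actions, first id #{actions.first[:index][:_id]}"
end
```

With five rows and a batch size of two, the block runs three times, and only one batch of actions is held in memory at a time.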

With your data indexed, you can now search it using the Elasticsearch query DSL. For example, to perform a simple full-text search across all fields:

Product.search(query: { multi_match: { query: params[:query] } }).records

This will return an Elasticsearch::Model::Response::Records object that you can iterate over like an Active Record relation.

The search method accepts any query written in Elasticsearch's query DSL, passed as a Ruby hash, so you can construct more advanced queries. Some common query types:

  • match: searches analyzed text fields for matching terms
  • match_phrase: searches analyzed text fields for matching phrases
  • multi_match: searches multiple fields for matching terms
  • bool: composes smaller queries into boolean logic (must, must_not, should, filter)

Consult the Elasticsearch Query DSL documentation for full details on the available options.
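
As a rough illustration of how these query types combine, the sketch below builds a bool query that requires a multi_match on the text and optionally filters on an exact value. product_search_body is a hypothetical helper, and the category field is an assumption; it is not part of the mapping defined earlier, so you would need to add it as a keyword field before filtering on it.

```ruby
require "json"

# Hypothetical helper: build a search body that requires a text match and
# optionally filters by category. "name^3" boosts matches in the name
# field over matches in the description.
def product_search_body(text, category: nil)
  bool = {
    must: [
      { multi_match: { query: text, fields: ["name^3", "description"] } }
    ]
  }
  # "category" is an assumed keyword field, used here only for illustration.
  bool[:filter] = [{ term: { category: category } }] if category
  { query: { bool: bool } }
end

puts JSON.pretty_generate(product_search_body("trail shoes", category: "footwear"))
```

The resulting hash can be passed straight to the model, as in Product.search(product_search_body(params[:query])).records.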

To keep your search lightning fast as you scale, Elasticsearch has several tricks up its sleeve:

  • Distributed inverted index for fast lookups
  • Optimization of disk and memory usage
  • Query and filter caching
  • Horizontal scaling across multiple nodes
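
The inverted index is the piece that makes lookups fast: instead of scanning every document for a term, the engine keeps a map from each term to the documents that contain it. Here is a deliberately tiny in-memory version in plain Ruby. TinyIndex is a hypothetical class for illustration and leaves out everything that makes the real thing distributed, compressed, and ranked.

```ruby
# Minimal in-memory inverted index: term => list of document ids.
class TinyIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  def add(doc_id, text)
    tokenize(text).uniq.each { |term| @postings[term] << doc_id }
  end

  # Return ids of documents that contain every term in the query.
  def search(query)
    lists = tokenize(query).map { |term| @postings.fetch(term, []) }
    return [] if lists.empty?

    lists.reduce(:&) # intersect the posting lists
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

index = TinyIndex.new
index.add(1, "Red trail running shoes")
index.add(2, "Blue running shorts")
index.add(3, "Red rain jacket")

p index.search("running")     # prints [1, 2]
p index.search("red running") # prints [1]
p index.search("sandals")     # prints []
```

Each search touches only the posting lists for the query terms, which is why lookup cost depends on how many documents match rather than on how many exist.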

As a developer, you don't need to worry about most of the low-level details. The elasticsearch-model gem handles the details of communicating with your Elasticsearch cluster.

Keeping Your Index in Sync

One challenge with using a separate search index like Elasticsearch is keeping it in sync with your primary database. The elasticsearch-model gem helps by automatically updating the index whenever a record is created, updated, or deleted via model callbacks.

However, if you ever modify your data in bulk or outside of Active Record (for example with update_all, delete_all, or raw SQL, all of which skip model callbacks), you'll need to manually re-sync the index. You can do this by calling the import method again:

Product.import

For large databases, a full re-index can take a while and put strain on your Elasticsearch server. In those cases, it's better to send documents in smaller batches so that each bulk request stays small:

Product.import(batch_size: 200)

This reads 200 records at a time from the database and sends each batch to Elasticsearch as a single bulk request.

Another approach is to update the index incrementally after a bulk update. If you have a timestamp indicating when each record was last updated, you can use that to only import records that have changed since the last sync. Keep in mind that update_all does not touch updated_at unless you set it yourself:

Product.import(query: -> { where("updated_at > ?", 1.day.ago) })

Keeping Your Search Feeling Fresh with Index Aliases

Even with incremental syncing, there may still be a delay between when a record is updated in your primary database and when it's reflected in the search results. During a large re-index, users can also end up searching a half-updated index.

Elasticsearch's index aliasing feature helps with the second problem. An alias is a name that points at one or more indices, and you can seamlessly repoint it from a "live" index to a "staging" index.

The idea is to perform all writes (creates, updates, deletes) to the staging index while searches keep reading from the live one through the alias. Then, in a single atomic operation, update the alias to point to the staging index instead of the live index. This makes all the staged changes visible to search queries at the same moment, with no downtime.

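Here's an example of how you could implement index aliasing in a Rails app:

# Gemfile
gem 'chewy'

# app/chewy/products_index.rb
# Chewy 7+ syntax; older versions used define_type instead of index_scope.
class ProductsIndex < Chewy::Index
  settings number_of_shards: 1

  index_scope Product
  field :name
  field :description
  field :updated_at, type: "date" # needed for the sort in the controller

  def self.live_alias
    "products-live"
  end

  def self.staging_alias
    "products-staging"
  end
end

# app/models/product.rb
class Product < ApplicationRecord
  # Re-index the product whenever it is saved or destroyed.
  update_index("products") { self }
end

# app/controllers/products_controller.rb
class ProductsController < ApplicationController
  def index
    # The _all field no longer exists in current Elasticsearch versions,
    # so name the fields to search explicitly.
    @products = ProductsIndex.query(multi_match: { query: params[:query], fields: %w[name description] })
                             .order(updated_at: :desc)
                             .page(params[:page]) # requires Kaminari
                             .objects # load the Active Record objects last
  end
end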

In this example, we're using the Chewy gem to define our Elasticsearch index and mappings. The ProductsIndex class also records the two index names used in the swap, "products-live" and "products-staging", as plain helper methods; Chewy does not act on them by itself.

In the Product model, we use the update_index method to re-index a product whenever it changes.

Finally, in the ProductsController, we query ProductsIndex to fetch the search results.

To finish the setup, you would point writes at the staging index and reads at a "products" alias, so that all writes go to the staging index while reads go to whichever index the alias currently targets. To make staged changes live, we can use Elasticsearch's _aliases API to atomically update the alias:

curl -XPOST 'localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
    "actions" : [
        { "remove" : { "index" : "products-live", "alias" : "products" } },
        { "add" : { "index" : "products-staging", "alias" : "products" } }
    ]
}'

This swaps the "products" alias from the old live index to the new staging index, making the staged changes live.
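
If you would rather trigger the swap from Ruby than from curl, the request body is just a hash. The sketch below builds it; alias_swap_body is a hypothetical helper name, and the index and alias names are the ones used in the curl example.

```ruby
require "json"

# Hypothetical helper: build the body for a POST to /_aliases that moves
# an alias from one index to another in a single atomic request.
def alias_swap_body(alias_name:, from_index:, to_index:)
  {
    actions: [
      { remove: { index: from_index, alias: alias_name } },
      { add:    { index: to_index,   alias: alias_name } }
    ]
  }
end

body = alias_swap_body(alias_name: "products",
                       from_index: "products-live",
                       to_index: "products-staging")
puts JSON.generate(body)
```

With the elasticsearch-ruby client, a hash like this can be sent with client.indices.update_aliases(body: body). Because both actions travel in one request, searches never see a moment where the alias points at nothing.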

There are a few different ways to automate this aliasing process, such as:

  • A background job that runs on a regular interval
  • A webhook that triggers the update after a certain number of changes are staged
  • A manual step that kicks off the update as part of your deploy process

Which approach you choose depends on your app's specific needs and tolerances around data freshness and consistency.

Measuring and Tuning Relevance

Once you have the basic search functionality working in your app, a big part of the job is tuning the results to be as relevant as possible to your users. After all, what good is a search feature if it doesn't return the best matches first?

There are a few key factors that influence search relevance:

  • The fields you're searching across. Are you searching the fields that contain terms users are most likely to search for?
  • The boosts you give to different fields. For example, exact title matches might be more important than matches in the description.
  • The analyzer used for each field. This determines how the field's data is processed and tokenized before being searched.
  • The relevance scoring algorithm. By default, Elasticsearch scores matches with Okapi BM25, a refinement of the classic TF-IDF approach.
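
To get a feel for how BM25 behaves, here is the textbook formula for one query term in one document, written in plain Ruby. This is a sketch of the published formula with the commonly cited defaults (k1 = 1.2, b = 0.75), not Elasticsearch's exact arithmetic, which may differ by constant factors. bm25_term_score and its parameter names are hypothetical.

```ruby
# Textbook BM25 score for a single query term in a single document.
# k1 controls term-frequency saturation; b controls length normalization.
def bm25_term_score(term_freq:, doc_len:, avg_doc_len:, doc_count:,
                    docs_with_term:, k1: 1.2, b: 0.75)
  # Rare terms get a higher inverse document frequency.
  idf = Math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
  # Documents longer than average are penalized.
  length_norm = 1 - b + b * (doc_len.to_f / avg_doc_len)
  idf * (term_freq * (k1 + 1)) / (term_freq + k1 * length_norm)
end

common = { doc_len: 100, avg_doc_len: 100, doc_count: 1000 }
puts bm25_term_score(term_freq: 1, docs_with_term: 10, **common).round(3)
puts bm25_term_score(term_freq: 3, docs_with_term: 10, **common).round(3)
puts bm25_term_score(term_freq: 1, docs_with_term: 500, **common).round(3)
```

The second score is higher than the first (more occurrences of the term) but less than three times as high, because term frequency saturates. The third is lower than the first because a term that appears in half of all documents says little about relevance.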

To get a sense of how relevant your search results are, it helps to have a representative set of test searches that you can run through your system. Look at the top 10-20 results for each query and ask:

  • Are the most relevant documents near the top?
  • Are there any irrelevant results that shouldn‘t be included?
  • Are key results missing entirely?

Keep a log of these test cases along with your relevance judgments. That way when you make changes to your search configuration, you can ensure you're moving the relevance needle in the right direction and not regressing.

There are also some quantitative metrics you can calculate to get a more objective view of your search relevance:

  • Precision: The fraction of documents in the result set that are relevant. Precision = (# of relevant docs in result set) / (# of docs in result set)
  • Recall: The fraction of all relevant documents that are returned in the result set. Recall = (# of relevant docs in result set) / (total # of relevant docs)
  • F-measure: The harmonic mean of precision and recall. F-measure = 2 × (precision × recall) / (precision + recall)
  • Mean Average Precision (MAP): For each query, average the precision measured at the rank of each relevant document; then take the mean of those averages over all queries.

To calculate these metrics, you'll need a labeled test set of queries and relevance judgments. You can then use a tool like Quepid to track your metrics over time as you tune your search.
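
If you want to compute these numbers yourself before adopting a tool, the definitions translate into a few lines of plain Ruby. This is a minimal sketch that assumes binary relevance judgments and result lists without duplicate ids. The method names are hypothetical, and average_precision divides by the total number of relevant documents, which is one common convention.

```ruby
require "set"

# results: ranked array of returned ids; relevant: Set of ids judged relevant.
def precision(results, relevant)
  return 0.0 if results.empty?
  results.count { |id| relevant.include?(id) }.to_f / results.size
end

def recall(results, relevant)
  return 0.0 if relevant.empty?
  results.count { |id| relevant.include?(id) }.to_f / relevant.size
end

def f_measure(results, relevant)
  prec = precision(results, relevant)
  rec = recall(results, relevant)
  (prec + rec).zero? ? 0.0 : 2 * prec * rec / (prec + rec)
end

# Sum precision at each rank where a relevant doc appears, then divide by
# the total number of relevant docs.
def average_precision(results, relevant)
  return 0.0 if relevant.empty?
  hits = 0
  sum = 0.0
  results.each_with_index do |id, i|
    next unless relevant.include?(id)
    hits += 1
    sum += hits.to_f / (i + 1)
  end
  sum / relevant.size
end

# judged_queries: array of [results, relevant] pairs, one per test query.
def mean_average_precision(judged_queries)
  return 0.0 if judged_queries.empty?
  judged_queries.sum { |results, relevant| average_precision(results, relevant) } / judged_queries.size
end

results  = [3, 7, 1, 9]
relevant = Set[3, 1, 5]
puts precision(results, relevant)                  # prints 0.5
puts recall(results, relevant).round(3)            # prints 0.667
puts f_measure(results, relevant).round(3)         # prints 0.571
puts average_precision(results, relevant).round(3) # prints 0.556
```

Running this over your logged test queries after each configuration change gives you a quick regression check on relevance.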

Conclusion

Adding search to a Rails app is a great way to make your data more discoverable and engaging for users. For simple needs, you can get by with basic SQL searches using Active Record. But for more advanced use cases, you'll want to reach for a dedicated search index like Elasticsearch.

With Elasticsearch, you get fast, scalable, full-text search out of the box. And thanks to the elasticsearch-rails gem, integrating it with your Rails models is relatively straightforward.

For the best search experience, you'll want to pay attention to factors like:

  • Keeping your search index in sync with your primary database
  • Using index aliases to publish index changes atomically and without downtime
  • Tuning the relevance of your search results
  • Measuring and tracking relevance metrics over time

By following the techniques outlined in this guide, you'll be well on your way to building a powerful, relevant search experience that your users will love. So what are you waiting for? Get out there and build some search!
