Stop Calling PostgREST "Magic" – A Deep Dive Into How It Really Works

If you‘ve been following trends in API design, you‘ve likely heard some buzz around PostgREST. This tool promises to automatically generate a full-featured REST API from nothing more than a PostgreSQL database schema. Some advocates even claim that it eliminates the need to write any backend code at all! Surely this sounds too good to be true, right? Is PostgREST actually just magical pixie dust we can sprinkle over a database?

As a core contributor to PostgREST, I‘ve lost count of how many times I‘ve heard it described as "magic". While I certainly appreciate the sentiment, I worry that this characterization is ultimately detrimental. When something is "magic", it feels risky to rely on. You might not know what incantations to perform when things go wrong. And no professional wants to build their castle on a foundation of spells and witchcraft.

So let me state this plainly: PostgREST is not magic. It‘s an incredibly powerful and elegant tool, but everything it does is fully explicable. My goal in this post is to demystify PostgREST and show you the solid computer science it‘s built upon. We‘ll dive deep into how it leverages SQL and Postgres features to deliver an instant API. By the end, I hope you‘ll agree that there‘s no magic here, just really smart engineering.

How PostgREST Translates REST to SQL

At the highest level, PostgREST acts as a translation layer between a REST API and a PostgreSQL database. It takes incoming HTTP requests, converts them to SQL queries, executes those queries, and returns the results as JSON. Let‘s break this down step-by-step.

Suppose we have a products table in our database:

CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  price INT NOT NULL
);

With PostgREST in front of this database, we can issue HTTP requests to interact with the data:

GET /products HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

[
  {
    "id": 1,
    "name": "Widget",
    "price": 1000
  },
  {
    "id": 2,
    "name": "Gadget",
    "price": 2000
  }
]

Under the hood, PostgREST is converting that HTTP GET request into a SQL query:

SELECT json_agg(products)
FROM products;

The json_agg function aggregates the rows into a JSON array, which becomes the response payload. All the JSON rendering happens right in the database.

We get more interesting when we start adding parameters to the request:

GET /products?id=eq.1 HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

[
  {
    "id": 1,
    "name": "Widget",
    "price": 1000
  }
]

Here we‘re asking for only the product with id equal to 1. PostgREST translates that into:

SELECT json_agg(products)
FROM products
WHERE id = 1;

The eq. syntax in the URL parameter gets rewritten as an equality condition in the WHERE clause. We can use other operators like gt., lt., gte., lte., like., etc for more complex filtering.

PostgREST also supports more advanced querying, like table JOINs, subqueries, ordering, limits, etc. For example:

GET /products?select=name,categories(name)&order=price.desc&limit=1 HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

[
  {
    "name": "Gadget",
    "categories": [
      {
        "name": "Electronics"  
      },
      {
        "name": "Toys"
      }
    ]
  }
]

This query is saying: select the name field from products and the name field of all related categories, order by the product‘s price descending, and limit to 1 result. The corresponding SQL is:

SELECT 
  json_build_object(
    ‘name‘, products.name,
    ‘categories‘, (
      SELECT json_agg(categories.name)
      FROM categories 
      WHERE category_id = products.category_id
    )
  ) 
FROM products
ORDER BY price DESC
LIMIT 1;

You can see how PostgREST is mapping the REST conventions like query parameters, JOINs, and embedded resources into the equivalent SQL constructs. This gives us the flexibility of hand-tuned queries without having to write the SQL by hand.

It‘s worth noting that PostgREST isn‘t actually doing any clever SQL generation. It acts more as a proxy that forwards the incoming request details to Postgres. Postgres itself handles parsing the request parameters and fitting them into the appropriate query template. This keeps PostgREST lightweight and reliable.

Authentication and Row-Level Security

Of course, we wouldn‘t want to expose our entire database to the world. PostgREST has robust mechanisms for authentication and authorization built-in. When a request comes in, PostgREST looks for a JWT (JSON Web Token) in the request headers. This token indentifies who is making the request.

PostgREST then uses that identity to determine what permissions the user has. It does this by leveraging PostgreSQL‘s built-in role system. For each request, PostgREST will "set role" to a database user matching the JWT. This means all queries execute with only the permissions of that specific user.

We can use row-level security (RLS) policies to further constrain what data a user can access. For example, suppose we have a purchases table that logs all product purchases. We could define a policy like this:

CREATE POLICY user_purchases ON purchases
USING (user_id = current_user_id());

This says that for any SELECT query, the user can only see rows where the user_id column matches their own user ID. We can define similar policies for INSERT, UPDATE, and DELETE operations. These policies get checked on every query, ensuring a strong security barrier.

The great thing about doing authorization this way is that it‘s fully database-driven. We‘re not littering our application code with permission checks. The rules live right next to the data, making them easier to reason about and maintain.

Resource Embedding

One of the headline features of PostgREST is the ability to embed related resources in API responses. So in an e-commerce app, when fetching a product, we could choose to embed the related category, vendor, and purchase history in a single shot:

GET /products?id=eq.1
  &select=name,price,
    category(name),
    vendor(name),
    purchases!inner(
      id,
      quantity,
      user(name)
    )
HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

[
  {
    "name": "Widget",
    "price": 1000,
    "category": {
      "name": "Gadgets"
    },
    "vendor": {
      "name": "Acme"
    },
    "purchases": [
      {
        "id": 12,
        "quantity": 5,
        "user": {
          "name": "Alice" 
        }
      },
      {
        "id": 47,
        "quantity": 2,
        "user": {
          "name": "Bob"
        }
      }
    ]
  }
]

In a single request, we fetched the core product details, plus related data from four other tables. PostgREST automatically Did the JOINs and nesting to package everything up neatly. We can even control the
shape of the response by specifying which columns to include at each level.

This is an incredibly powerful feature for building rich, client-friendly APIs. It‘s quite similar to GraphQL in that regard. The key difference is that PostgREST generates the schema and resolvers for you based on the database schema. There‘s no need to write a custom GraphQL server by hand.

Under the hood, PostgREST is using CTEs (common table expressions) to efficiently collect and join the embedded resources. For example, the query for the request above would look something like:

WITH product AS (
  SELECT products.*
  FROM products
  WHERE id = 1  
),

category AS (
  SELECT categories.name
  FROM categories
  WHERE category_id = (SELECT category_id FROM product)
),

vendor AS (
  SELECT vendors.name  
  FROM vendors
  WHERE vendor_id = (SELECT vendor_id FROM product)
),

purchases AS (
  SELECT 
    purchases.id,
    purchases.quantity,
    json_build_object(‘name‘, users.name) AS user
  FROM purchases
  LEFT JOIN users ON user_id = users.id
  WHERE product_id = (SELECT id FROM product)
)

SELECT
  json_build_object(
    ‘name‘, product.name,
    ‘price‘, product.price,
    ‘category‘, (SELECT * FROM category),      
    ‘vendor‘, (SELECT * FROM vendor),
    ‘purchases‘, (SELECT array_agg(purchases) FROM purchases)
  )  
FROM product;

By collecting the embedded resources into separate CTEs, we can assemble the final response very efficiently. This leverages Postgres‘ powerful query planning and optimization capabilities.

PostgREST In The Wild

PostgREST is not just an academic exercise, it‘s being used to power real-world applications at significant scale. One notable user is Schibsted, a global media and marketplaces company based in Norway. They use PostgREST to provide the API for their SPiD identity platform, which handles authentication and user profiles for dozens of high-traffic websites.

According to Lars Helge Øverland, a senior engineer at Schibsted:

We‘ve been running PostgREST in production for over a year, and it‘s been incredibly stable and performant. It‘s allowed us to move fast and deliver new features with confidence. The fact that it‘s so closely tied to the database makes it a very natural fit for our identity platform.

Performance testing shows that a single PostgREST node can handle thousands of concurrent requests with response times in the low milliseconds. This is on par with hand-tuned, custom-built API servers. The difference is that PostgREST achieves this with a tiny codebase and minimal configuration.

PostgREST and Microservices

Another interesting use-case for PostgREST is as a component in a microservices architecture. Each microservice can own its own database and expose that data via PostgREST. Other services can then consume that API to fetch data as needed.

This plays nicely with event-driven architectures. When a service performs a write to its database, it can emit an event broadcast to interested consumers. Those consumers can then use PostgREST APIs to fetch the relevant data on-demand. This avoids the need for direct service-to-service coupling.

PostgREST is a good fit for this architecture because it‘s so lightweight and self-contained. It can be deployed as a sidecar to each database, without needing a bulky application server. And because it‘s so thin, there‘s minimal surface area for bugs or security vulnerabilities.

Conclusion: It‘s Not Magic, It‘s Just Really Good Engineering

I hope this deep dive has shown you that there‘s no magic to PostgREST, just a lot of Thoughtful engineering. It leverages the power of PostgreSQL and the elegance of the HTTP spec to deliver a simple but incredibly flexible API layer.

Could you build all of this yourself using a traditional REST framework? Absolutely. But that takes a lot of time, and invites a lot of accidental complexity. PostgREST gives you an instant, production-ready API with minimum fuss. That frees you up to focus on your core application logic.

If you‘re intrigued by what you‘ve seen here, I encourage you to give PostgREST a try. Spin up a Postgres database, define a few tables, and see how easy it is to get a rich API going. There are abundant tutorials and examples to get you started.

Fair warning: Once you‘ve experienced the power and simplicity of PostgREST development, you may have a hard time going back to a traditional REST framework!

But don‘t take my word for it. Try it out and see for yourself. You may Just find that a little bit of PostgreSQL and a dash of HTTP can feel positively magical.

Similar Posts