NGINX Rate Limiting in a Nutshell

As a full-stack developer, you know the importance of protecting your web applications and APIs from being overwhelmed by traffic spikes or malicious request floods. One powerful tool in your arsenal is rate limiting – the ability to control the rate of incoming requests to prevent your systems from being inundated. And when it comes to implementing rate limiting, NGINX is the gold standard.

In this expert-level guide, we’ll take a deep dive into NGINX’s rate limiting capabilities. We’ll explore the key NGINX directives and configuration options you need to know, break down how NGINX decides to accept or reject requests based on your rate limit rules, and walk through detailed examples to solidify your understanding. By the end, you’ll have the knowledge to implement effective rate limiting for your own applications using NGINX.

Rate Limiting 101

At its core, rate limiting is about controlling the flow of incoming requests to protect your backend systems and ensure fair usage of resources. By enforcing a maximum allowed rate of requests, typically over a sliding time window, you can:

  • Prevent excessive requests from overwhelming your servers
  • Ensure high availability and protect against DDoS attacks
  • Enforce usage quotas and prevent API abuse
  • Encourage efficient usage patterns and avoid unnecessary requests

Rate limiting is especially crucial for publicly accessible web applications and APIs, where you may have less control over incoming traffic patterns. It acts as a vital gatekeeper, ensuring your systems can handle the load and maintain responsiveness for all users.

NGINX: A Rate Limiting Powerhouse

When it comes to implementing rate limiting, NGINX is the tool of choice for many developers and DevOps teams. NGINX is a high-performance web server, reverse proxy, and load balancer known for its robustness, efficiency, and configurability.

One of NGINX’s key features is its built-in support for rate limiting via the ngx_http_limit_req_module. This module provides a set of powerful directives that allow you to define and enforce rate limits flexibly and efficiently.

Some key benefits of using NGINX for rate limiting:

  • High performance and low overhead
  • Ability to rate limit by various criteria (e.g. request URL, client IP)
  • Fine-grained control over rate limits and bursting behavior
  • Efficient utilization of shared memory for storing rate limit counters
  • Seamless integration with other NGINX features like caching and load balancing

With NGINX in your toolkit, you have a battle-tested solution for protecting your applications with robust rate limiting.

NGINX Rate Limiting Directives

At the heart of NGINX rate limiting are two key directives:

  1. limit_req_zone – Defines a shared memory zone for storing rate limit counters
  2. limit_req – Applies rate limits to specific URL locations

Let’s take a closer look at each one.

limit_req_zone

The limit_req_zone directive is used to define a shared memory zone for storing rate limit counters. This is where you specify the key to rate limit on (e.g. client IP, request URL) and the maximum allowed rate.

Here’s an example:

limit_req_zone $binary_remote_addr zone=myzone:10m rate=1r/s;

This creates a shared memory zone named "myzone" that uses the client IP address ($binary_remote_addr) as the key. The binary form of the address is used because it is more compact than the textual $remote_addr, so more client states fit in the zone. The zone size is 10 megabytes (10m) – the NGINX docs estimate roughly 16,000 states per megabyte, so about 160,000 unique IPs here – and the maximum allowed rate is 1 request per second (1r/s).

limit_req

Once you have a shared memory zone defined, you can apply rate limits to specific URL locations using the limit_req directive.

For example:

location /api {
    limit_req zone=myzone;
    # ...
}

This tells NGINX to apply the rate limit defined by the "myzone" shared memory zone to the /api location. Requests exceeding the limit are rejected with a 503 (Service Unavailable) error by default.
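
If you’d rather signal throttling explicitly, the module also provides the limit_req_status and limit_req_log_level directives to tune the rejection status code and the log severity. Here’s a sketch (returning 429 is a common convention, not a requirement):

location /api {
    limit_req zone=myzone;

    # Respond with 429 Too Many Requests instead of the default 503
    limit_req_status 429;

    # Log rejections at "warn" instead of the default "error"
    limit_req_log_level warn;

    # ...
}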

Burst and Nodelay

The limit_req directive also supports two optional parameters that give you more control over bursting behavior:

  1. burst – Sets the maximum number of excess requests that can be queued once the rate limit is reached. Queued requests are delayed until they can be processed without exceeding the limit; requests beyond the burst size are rejected.

  2. nodelay – When used with burst, NGINX will process excess requests immediately (as long as there are burst slots available) instead of spacing them out over time.

Here’s an example using both:

location /api {
    limit_req zone=myzone burst=10 nodelay;
    # ...
}

With this configuration, NGINX will allow up to 10 excess requests to burst above the base rate limit, and (thanks to nodelay) it will process them immediately as they come in. Additional requests will be rejected with a 503 error until burst slots free up again, which happens at the configured rate – one slot per second at 1r/s.

Rate Limiting in Action

Now that we’ve covered the key directives, let’s look at some examples of NGINX rate limiting in action.

Basic Rate Limiting

First, a basic rate limiting example:

http {
    limit_req_zone $binary_remote_addr zone=myzone:10m rate=1r/s;

    server {
        location /api {
            limit_req zone=myzone;
            # ...
        }
    }
}

Here’s what happens when traffic hits the /api endpoint:

  • NGINX checks the request rate for each unique client IP
  • If a client exceeds 1 request per second, additional requests are rejected with a 503 error
  • Rejected requests do not count against the client’s rate limit counter

This is a straightforward way to enforce a per-client rate limit on an API: each client IP can make at most 1 request per second, period. (NGINX tracks rates at millisecond granularity, so with 1r/s a request arriving less than a second after the previous accepted one is rejected.)

Allowing Bursts

In many cases, you may want to allow clients to occasionally burst above the rate limit, as long as they don’t sustain that higher rate indefinitely. That’s where the burst parameter comes in:

http {
    limit_req_zone $binary_remote_addr zone=myzone:10m rate=1r/s;

    server {
        location /api {
            limit_req zone=myzone burst=5;
            # ...
        }
    }
}

Now each client can briefly exceed the rate limit: NGINX will queue up to 5 excess requests and process them at the configured 1r/s rate, rejecting anything beyond that. In effect, a client can fire off 6 requests at once – 1 served immediately plus 5 queued – as long as its overall request rate averages out to 1r/s over time.

Here’s an example request pattern with this configuration, where a client sends 7 requests at nearly the same instant:

  • Request 1 – Processed immediately (it fits within the 1r/s rate)
  • Requests 2–6 – Accepted into the burst queue and served one per second over the next 5 seconds
  • Request 7 – The queue is full, so it is rejected with a 503 error
  • As the queued requests drain, burst slots free up at the configured rate, so after a lull the client regains its full burst allowance

As you can see, bursting lets clients exceed the rate limit momentarily without being rejected, while NGINX smooths the excess out so your backend still sees at most 1 request per second.

Burst with No Delay

By default, when NGINX allows requests to burst using the burst parameter, it spaces out those requests to maintain the defined rate limit. So if you have a rate of 1r/s and a burst of 5, NGINX will accept up to 5 excess requests in a burst but delay processing them to ensure the 1r/s rate is maintained.

If you want NGINX to allow bursts without adding delay, you can add the nodelay parameter:

http {
    limit_req_zone $binary_remote_addr zone=myzone:10m rate=1r/s;

    server {
        location /api {
            limit_req zone=myzone burst=5 nodelay;
            # ...
        }
    }
}

With this configuration, when a client exceeds the 1r/s limit but has burst allowance available, NGINX will process those extra requests immediately instead of spacing them out. This sacrifices some smoothness in the output rate in order to improve responsiveness for clients.

Here’s that same request pattern (7 near-simultaneous requests) with nodelay:

  • Requests 1–6 – All processed immediately (1 within the rate plus 5 burst slots)
  • Request 7 – Rejected with a 503 error
  • Burst slots still free up at the configured rate (one per second at 1r/s), so after 5 quiet seconds the client has its full burst allowance back

The key difference is that requests occupying burst slots are processed immediately instead of being spaced out. This can be useful for keeping response times low for bursty traffic, as long as your backend services can handle the momentary spikes.
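
There’s also a middle ground between fully delayed and fully undelayed bursts. On NGINX 1.15.7 and later, the delay parameter lets the first part of a burst through immediately and throttles the rest. A sketch reusing the zone from above:

location /api {
    # The first 3 excess requests are proxied with no delay,
    # excess requests 4-5 are delayed to match the 1r/s rate,
    # and anything beyond burst=5 is rejected
    limit_req zone=myzone burst=5 delay=3;
    # ...
}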

Choosing a Rate Limiting Strategy

With an understanding of NGINX’s rate limiting directives in hand, let’s discuss some common use cases and strategies.

Rate Limiting by URL

Rate limiting by URL is useful when you want to protect a specific resource or endpoint. For example, you might have an API endpoint for creating new user accounts that you want to rate limit more aggressively than your other API routes to prevent abuse.

To rate limit by URL, use the $request_uri variable as the key in your limit_req_zone directive:

limit_req_zone $request_uri zone=createuser:10m rate=1r/m;

server {
    location /api/createuser {
        limit_req zone=createuser;
        # ...
    }

    location /api {
        # No rate limit
        # ...
    }
}

This configuration applies a rate limit of 1 request per minute to the /api/createuser endpoint while leaving other API routes unlimited. Note that with $request_uri as the key, the limit is shared across all clients hitting that URI (and the query string counts as part of the key), rather than applied per client.
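
If you want an aggressive per-endpoint limit that still applies per client rather than being shared by everyone, you can concatenate variables in the zone key. A sketch, with a hypothetical zone name:

# One counter per client IP per URI
limit_req_zone $binary_remote_addr$request_uri zone=peruserurl:20m rate=1r/m;

server {
    location /api/createuser {
        limit_req zone=peruserurl;
        # ...
    }
}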

Rate Limiting by IP

Rate limiting by client IP is a good way to prevent individual clients from overwhelming your services. It ensures a more equitable distribution of resources across all clients.

To rate limit by IP, use the $binary_remote_addr variable as the key in your limit_req_zone directive:

limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

server {
    location /api {
        limit_req zone=perip;
        # ...
    }
}

This configuration limits each unique client IP to 10 requests per second across all /api routes.
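
A common refinement is to exempt trusted clients from the per-IP limit. Requests whose key evaluates to an empty string are not counted at all, which you can exploit with the geo and map modules. A sketch, where the network ranges and zone name are placeholders:

# Trusted networks (placeholder ranges) map to 0, everyone else to 1
geo $limit {
    default          1;
    10.0.0.0/8       0;
    192.168.0.0/24   0;
}

# Trusted clients get an empty key, so they are never rate limited
map $limit $limit_key {
    0   "";
    1   $binary_remote_addr;
}

limit_req_zone $limit_key zone=perip_allow:10m rate=10r/s;

server {
    location /api {
        limit_req zone=perip_allow;
        # ...
    }
}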

Burst vs No Burst

Choosing whether to allow bursts depends on your traffic patterns and backend services.

Allowing bursts can improve perceived performance for clients with "bursty" traffic patterns – think a client making several quick requests in succession followed by a lull. Bursting prevents those clients from being unnecessarily throttled as long as their overall request rate stays within the limit.

On the flip side, a sustained burst of traffic that’s allowed through can put more strain on your backend services. If your services can’t handle the extra requests, it may be better to enforce a smooth rate with no bursting.

As a general rule of thumb, allowing small bursts (e.g. 5-10 extra requests) is a good way to balance performance and protection. But the right values depend on your specific services and traffic patterns.
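
Note that you don’t have to pick just one approach: multiple limit_req directives can apply to the same location, and a request must pass all of them. Here’s a sketch in the spirit of the example from the NGINX docs (the zone names are ours):

limit_req_zone $binary_remote_addr zone=perip_api:10m rate=1r/s;
limit_req_zone $server_name zone=perserver:10m rate=100r/s;

server {
    location /api {
        # Small per-client burst for responsiveness...
        limit_req zone=perip_api burst=5 nodelay;

        # ...plus a smooth server-wide ceiling to protect the backend
        limit_req zone=perserver burst=20;
        # ...
    }
}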

Testing Rate Limits

It’s always a good idea to test your rate limiting configurations to verify they’re working as expected. One easy way is to use a load testing tool like Apache Bench or Siege to send bursts of test requests and check the response codes.
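
If you’re running NGINX 1.17.1 or later, you can also stage a new limit safely in production with the limit_req_dry_run directive: excess requests are counted and logged, but nothing is actually rejected. A minimal sketch:

location /api {
    limit_req zone=myzone burst=5;

    # Count and log would-be rejections without enforcing the limit
    limit_req_dry_run on;

    # ...
}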

For a more hands-on approach, I’ve created a Docker image with an NGINX server preconfigured with various rate limiting examples. You can use this to experiment with different configurations and see the results firsthand.

Check out the GitHub repo for the Dockerfile and configuration files. You can also pull the ready-to-use Docker image:

docker pull yourusername/nginx-ratelimit-examples

Spin up the Docker container and send some test requests to see rate limiting in action!

In Conclusion

NGINX is a powerful tool for implementing rate limiting and protecting your web applications and APIs. Its rich set of rate limiting directives gives you fine-grained control over how and where you enforce rate limits.

In this guide, we covered:

  • The key NGINX rate limiting directives: limit_req_zone and limit_req
  • How burst and nodelay parameters affect rate limiting behavior
  • Examples of rate limiting in action, from basic limits to advanced bursting
  • Strategies for choosing a rate limiting approach based on your use case
  • How to test your rate limiting configurations hands-on with Docker

With these tools and techniques in your back pocket, you’re well-equipped to implement robust rate limiting for your own applications. Now go forth and build with confidence!

As always, if you have any questions or feedback, feel free to reach out. Happy rate limiting!
