The NGINX Handbook: Learn NGINX Step-by-Step

NGINX has risen to become one of the most popular web servers and reverse proxy servers in the world. Known for its high performance, stability, rich feature set, and simple configuration, NGINX now powers more than 480 million websites globally.

In this in-depth guide, we'll introduce you to NGINX and walk through how to use it effectively as a web server and reverse proxy. You'll learn how to install NGINX, configure virtual hosts, use NGINX for load balancing and caching, optimize performance, and much more. We'll share configuration snippets, best practices, and useful resources along the way.

By the end of this handbook, you'll have a strong understanding of NGINX and will be well-equipped to use it for your own projects. Let's get started!

NGINX Overview

NGINX (pronounced "engine-x") is open source software for serving web content, proxying requests, load balancing, and more. Some of the key features and use cases for NGINX include:

  • Serving static and dynamic web content
  • Acting as a reverse proxy for web servers and application servers
  • Load balancing traffic across multiple backend servers
  • Caching content to reduce load on backends
  • Enabling SSL/HTTPS and HTTP/2

History and Growth

NGINX was originally created by Igor Sysoev, a Russian engineer who began developing it in 2002 to solve the problem of handling high levels of concurrency—known as the C10K problem.

He aimed to create an architecture that could handle 10,000 simultaneous connections efficiently, something that popular servers like Apache HTTPd struggled with at the time due to their process-per-connection model. Igor's solution was an asynchronous, event-driven approach that would form the foundation of NGINX.

After two years of development, NGINX was publicly released in 2004 under a 2-clause BSD license. It quickly gained adoption and by 2008 was already serving 7.5% of the world's top 1000 websites. In 2011, the company NGINX, Inc. was founded to provide commercial support and an enterprise version called NGINX Plus.

Over the following years, NGINX continued to grow in popularity. It became the default web server for many Linux distributions and was widely adopted by high-traffic websites. By 2019, NGINX was serving more than 60% of the 10,000 most popular websites. In March 2019, the company was acquired by F5 Networks for $670 million.

Today, NGINX is used by companies like Netflix, Airbnb, Dropbox, and many more to power their web infrastructure. It has been instrumental in the rise of microservice architectures and cloud computing.

Popularity and Usage

Just how popular is NGINX today? According to the April 2021 Web Server Survey by Netcraft, NGINX is the 2nd most popular web server behind Apache, powering 23.3% of all websites. However, it powers a larger share of the world's highest-traffic sites, and its share grows the higher up the rankings you look.

W3Techs reports that NGINX has a 35.6% market share among the top 1 million websites and 48.7% among the top 1000. For the top 10,000 sites, it's an even split between NGINX and Apache.

Here are some other statistics that illustrate NGINX's growth and popularity:

  • In 2010, only about 15,000 websites used NGINX. By 2017 it was over 450,000.
  • NGINX serves over 29% of the top 10,000 websites in the US, ahead of Apache at 26%.
  • The number of websites using NGINX has grown over 10x in the last 5 years.

One of the reasons for NGINX's widespread adoption is its versatility. Some common use cases include:

  • Web serving for sites built with WordPress, Drupal, Magento, and other CMSes
  • Frontend for Apache, while serving static content directly
  • Load balancing in front of application servers like Node.js, Java, and PHP
  • Serving as an API gateway or application delivery controller
  • TLS/SSL termination for improved performance and security
  • Caching both static and dynamic content
  • Streaming media through HTTP, HLS, HDS, and other protocols
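To make several of these use cases concrete, here is a minimal, hypothetical server block that terminates TLS, serves static assets directly from disk, and load-balances everything else across two application servers. All hostnames, IP addresses, ports, and file paths below are placeholders, not a recommended production configuration:

```nginx
# Hypothetical upstream group: two app instances (e.g. Node.js) behind NGINX.
# This fragment belongs inside the http {} context of nginx.conf.
upstream app_servers {
    server 10.0.0.11:3000;
    server 10.0.0.12:3000;
}

server {
    listen 443 ssl;
    server_name example.com;

    # TLS termination: clients speak HTTPS to NGINX; backends see plain HTTP.
    ssl_certificate     /etc/nginx/certs/example.com.pem;
    ssl_certificate_key /etc/nginx/certs/example.com.key;

    # Static assets served straight from disk, bypassing the backends.
    location /static/ {
        root /var/www/example;
    }

    # Everything else is proxied and load-balanced across the upstream group.
    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

By default NGINX distributes proxied requests round-robin across the servers in an upstream group; other balancing methods can be selected with additional directives.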

NGINX Architecture

To understand how NGINX is able to handle high concurrency so effectively, let's look at its underlying architecture.

Event-Driven Design

The fundamental principle behind NGINX's design is an event-driven, asynchronous (non-blocking) approach. This means that instead of creating new processes or threads for each request, NGINX uses a single master process that efficiently manages a small number of worker processes, which in turn handle requests.

When an NGINX worker accepts a new connection, it adds it to an event loop where it is processed asynchronously. Within the loop, the worker multiplexes and processes the connections as they become ready for reading, writing, or processing. This allows the worker to continuously handle many concurrent requests without blocking on any single connection.

In contrast, traditional servers like Apache create a new process or thread for each connection (or use a fixed-size thread pool). While this simplifies the processing model, it comes at the cost of increased memory usage and CPU overhead from the constant creation and destruction of processes.

NGINX's event-driven model, powered by modern Linux kernel mechanisms like epoll, allows it to scale to hundreds of thousands of concurrent connections on a single server. Each new connection only requires a small additional memory footprint (~2KB) within the worker process.
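The pattern can be sketched in a few lines of Python using the standard `selectors` module, which wraps epoll (or kqueue) much like NGINX's event module does. This is a toy single-worker echo loop for illustration only, not NGINX's actual C implementation:

```python
import selectors
import socket

# Toy single-process event loop in the style of an NGINX worker: one
# selector multiplexes every socket, and no handler ever blocks waiting
# on a single connection.
sel = selectors.DefaultSelector()

def accept(server_sock):
    """Handler for the listen socket: register each new connection."""
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    """Handler for a client connection: echo data back, or clean up on EOF.
    A real worker would parse an HTTP request here instead."""
    data = conn.recv(4096)
    if data:
        conn.sendall(data)
    else:
        sel.unregister(conn)
        conn.close()

def serve_once(timeout=1.0):
    """One iteration of the event loop: wait for ready sockets and
    dispatch each to its registered handler."""
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
srv.listen()
srv.setblocking(False)
sel.register(srv, selectors.EVENT_READ, accept)
```

Because handlers return immediately after each readiness event, one process can interleave thousands of connections; the per-connection cost is just a registered file descriptor and a small amount of state.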

Master and Worker Processes

When you start NGINX, it launches a single master process and a configured number of worker processes (usually one per CPU core). The master process is responsible for reading configuration, binding ports, and spawning the workers. It also handles graceful reloads and restarts of the worker processes.

The worker processes do the actual work of handling connections and serving content. They listen for and accept new connections, add them to the event loop, and process the requests before sending responses back to the clients. The worker processes run independently of each other and do not share state, which allows NGINX to scale horizontally by adding more worker processes or servers behind a load balancer.
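This process model is controlled at the top level of nginx.conf. A common baseline looks like the following (the connection limit here is illustrative, not a recommendation):

```nginx
# Spawn one worker per CPU core; the master process detects the core count.
worker_processes auto;

events {
    # Upper bound on simultaneous connections handled by each worker's
    # event loop (clients plus proxied backend connections).
    worker_connections 10240;
}
```

After editing the configuration, `nginx -s reload` signals the master process to perform a graceful reload: it starts new workers with the new configuration and retires the old ones once their connections drain.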

Connection Processing

Let's walk through how an NGINX worker handles an incoming connection:

  1. The worker's event loop uses an efficient polling mechanism (epoll on Linux) to monitor for new events on the listen sockets and client connections
  2. When a new connection is received, the worker adds it to the event loop and waits for it to be ready for reading
  3. Once the connection is ready, the worker reads the client request and parses the headers
  4. Based on the request headers and URI, the worker determines how to map the request to a virtual server and location block within the NGINX configuration
  5. If the request maps to a static file, the worker reads it from disk and sends the response back to the client
  6. If the request requires processing by a proxied server, FastCGI application, or other handler, the worker opens a new connection and forwards the request
  7. When the proxied server or application returns a response, the worker reads it and forwards it to the client
  8. If the response can be cached, the worker may save it in a cache for faster serving of subsequent requests
  9. Once the full response has been sent, the worker closes the client connection and removes it from the event loop (unless keepalive is enabled)
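
Steps 4, 5, 6, and 8 above map directly onto server, location, and proxy directives in the configuration. Here is a sketch with placeholder names, paths, and backends:

```nginx
# Cache storage for proxied responses (step 8); lives in the http {} context.
proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m;

server {
    listen 80;
    server_name www.example.com;           # step 4: request mapped to this virtual server

    location /images/ {
        root /var/www/example;             # step 5: static file read from disk
    }

    location /api/ {
        proxy_pass http://127.0.0.1:8080;  # step 6: request forwarded to a backend
        proxy_cache app_cache;             # step 8: cacheable responses are saved
    }
}
```

NGINX selects the server block by matching the Host header against server_name, then picks the most specific matching location block for the request URI.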

Throughout this flow, the worker process never blocks on network I/O, disk access, or processing. Slow operations are delegated asynchronously, allowing the worker to continuously handle other connections concurrently. This is the key to NGINX's high performance and scalability.

Concurrency Comparison

To put this in perspective, let's look at how NGINX performs under load compared to Apache. A benchmark by DigitalOcean found that for a static web page, NGINX was able to serve nearly 10,000 requests per second with an average CPU utilization of ~10%. Apache's event MPM (which uses a hybrid process-thread model) maxed out at ~6,000 requests per second and 85% CPU utilization.

For dynamic requests proxied to PHP-FPM, NGINX still achieved ~5,000 requests per second at ~65% CPU usage. Apache handled a respectable ~3,600 requests per second but was pushing 95% CPU.

These results highlight how NGINX‘s event-driven architecture allows it to handle more concurrent requests with less strain on system resources. The gap only widens as concurrency increases.

Of course, performance isn't everything—Apache still has advantages in terms of compatibility and dynamic module loading. But for raw speed and scalability, NGINX is hard to beat.

How to Learn More

We've covered a lot of ground in this handbook, but there's still much more to learn about NGINX. Here are some recommended resources to continue your journey:

Official NGINX Documentation

The NGINX documentation is the authoritative resource for learning about NGINX's configuration directives, modules, and core functionality. It provides a comprehensive reference, an admin's guide, and a wiki.

Some key documents to read:

  • The Beginner's Guide, which covers starting and stopping NGINX and the basic structure of configuration files
  • The Admin Guide, which walks through web serving, load balancing, caching, and security
  • The alphabetical index of directives and variables, the definitive reference when writing configuration

The documentation can be dense at times, but it's the best way to gain a deep understanding of NGINX's capabilities.

NGINX Repo on GitHub

The NGINX source code is available on GitHub for anyone to view and contribute to. Reading through the code can give you insights into how NGINX works under the hood. You can also follow issues and discussions to stay up-to-date on the latest developments.

NGINX Mailing List and Forum

The NGINX mailing list is a good place to ask questions, get help with configuration issues, and connect with other NGINX users. There is also an official NGINX forum for extended discussions.

Community and Expert Blogs

There are many community-written tutorials, guides, and opinion pieces on NGINX around the web, from the official NGINX blog to the tutorial libraries of hosting providers such as DigitalOcean.

Many individual developers and sysadmins also write about their experiences with NGINX on personal blogs. While the quality can vary, you can often find useful nuggets of information.

Following these resources can help you continue learning and stay current with the ever-evolving landscape of NGINX.

Conclusion

In this handbook, we've taken a comprehensive look at NGINX and how to use it effectively as a web server and reverse proxy. We covered NGINX's history, its event-driven architecture, basic installation and configuration, using it for load balancing and caching, performance optimization, and much more.

While there's still more to learn, you should now have a strong foundation for working with NGINX. You understand what makes it so performant, how to configure virtual hosts, how to optimize for concurrency, and where to find additional help and resources.

NGINX has become an essential tool for anyone working on modern web infrastructure. Whether you're a web developer, system administrator, or DevOps engineer, a deep knowledge of NGINX will serve you well.

So continue your learning journey—dive into the documentation, read discussions on the mailing list, follow expert blogs, and experiment with your own configurations. Most importantly, share your knowledge and contribute back to the amazing open-source community that has made NGINX the success it is today.

Here's to serving the web, one event at a time!
