How to Use Window Functions in SQL – with Example Queries

Window functions are a powerful feature in SQL that allow you to perform calculations across sets of rows that are related to the current row. They provide a way to compute moving averages, running totals, rankings, and other analytics without the need for complex subqueries or self-joins. In this guide, we‘ll dive deep into how window functions work and walk through several examples of how to use them to solve real-world problems.

Understanding Window Functions

A window function performs an operation on a "window" or set of rows that are somehow related to the current row. But unlike regular aggregate functions like SUM() and AVG() which return a single value per group, window functions return a value for each row in the result set. This allows you to calculate running totals, compare values to an average, rank results, and more.

Window functions are used with the OVER clause, which defines the window of rows to include in the calculation. You can think of the OVER clause as a window into the result set that shifts and resizes as the current row changes. The OVER clause can take three optional parameters:

PARTITION BY – Divides the result set into partitions, each of which is operated on separately
ORDER BY – Sorts the rows within each partition
Frame clause – Explicitly defines which rows to include in the frame for each calculation (we‘ll cover this more later)

Types of Window Functions

There are three main types of window functions:

Aggregate Functions – Perform an aggregation across a set of rows and return a single value for each row in the result set. Examples include SUM(), AVG(), COUNT(), MAX(), MIN().
Ranking Functions – Assign an integer rank to each row within a partition based on its position. Examples include ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE().
Value Functions – Provide access to values from other rows in the window without the need for a self-join. Examples include LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE().

Let‘s look at some examples of each type.

Aggregate Window Function Examples

Suppose we have a sales table with columns for product, sales rep, amount, and date:

product	rep	amount	date
Widget	John	100	1/1/23
Widget	John	200	1/2/23
Gadget	Jane	150	1/3/23
Widget	John	300	1/4/23
Gadget	Jane	250	1/5/23

To calculate the running total of sales for each rep, we can use the SUM() function with OVER and PARTITION BY:

SELECT 
  product,
  rep, 
  amount,
  date,
  SUM(amount) OVER (
    PARTITION BY rep
    ORDER BY date
  ) AS running_total  
FROM sales;

This will produce:

product	rep	amount	date	running_total
Widget	John	100	1/1/23	100
Widget	John	200	1/2/23	300
Widget	John	300	1/4/23	600
Gadget	Jane	150	1/3/23	150
Gadget	Jane	250	1/5/23	400

The running total restarts at each new rep because we partitioned by the rep column. The totals are also calculated in date order within each partition due to the ORDER BY clause.

We could also compare each sale to the rep‘s average:

SELECT
  product, 
  rep,
  amount,
  date, 
  amount - AVG(amount) OVER (PARTITION BY rep) AS vs_avg
FROM sales;

Result:

product	rep	amount	date	vs_avg
Widget	John	100	1/1/23	-100
Widget	John	200	1/2/23	0
Widget	John	300	1/4/23	100
Gadget	Jane	150	1/3/23	-50
Gadget	Jane	250	1/5/23	50

Ranking Function Examples

Ranking functions assign integer values to each row based on its position within the partition. To rank sales reps by their total sales amounts:

SELECT
  rep,
  SUM(amount) AS total_sales,
  RANK() OVER (ORDER BY SUM(amount) DESC) AS sales_rank
FROM sales
GROUP BY rep;

This produces:

rep	total_sales	sales_rank
John	600	1
Jane	400	2

The RANK() function handles ties by giving them the same rank and skipping the next value. So if John and Jane had the same total sales, they would both get rank 1 and the next rank would be 3.

In contrast, DENSE_RANK() does not skip values after ties. ROW_NUMBER() is similar but always assigns unique numbers, even for ties.

Value Function Examples

Value functions let you easily reference values from other rows in the window. For example, to compare each sales amount to the previous and next amounts for that rep:

SELECT
  product,
  rep, 
  amount,
  date,
  LAG(amount, 1) OVER (
    PARTITION BY rep 
    ORDER BY date
  ) AS prev_amount,
  LEAD(amount, 1) OVER (
    PARTITION BY rep
    ORDER BY date  
  ) AS next_amount
FROM sales;

Result:

product	rep	amount	date	prev_amount	next_amount
Widget	John	100	1/1/23	NULL	200
Widget	John	200	1/2/23	100	300
Widget	John	300	1/4/23	200	NULL
Gadget	Jane	150	1/3/23	NULL	250
Gadget	Jane	250	1/5/23	150	NULL

LAG looks back to previous rows and LEAD looks ahead to later rows. Since there is no previous value for the first row and no next value for the last row in each partition, those return NULL.

Other value functions like FIRST_VALUE() and LAST_VALUE() return the first or last value in the window frame. These are handy for things like finding each employee‘s hire date or when they received their latest raise.

Defining Window Frames

By default, the window frame includes all rows in the partition from the first to the current row. But we can explicitly control which rows are included using the frame clause.

The frame clause comes after PARTITION BY and ORDER BY and specifies a start and end boundary:

OVER (
  PARTITION BY ...
  ORDER BY ...
  {ROWS | RANGE} BETWEEN <start> AND <end>  
)

\<start> and \<end> can be one of:

UNBOUNDED PRECEDING – From the first row of the partition
UNBOUNDED FOLLOWING – To the last row of the partition
CURRENT ROW – The current row
\<integer> PRECEDING – Number of rows before the current row
\<integer> FOLLOWING – Number of rows after the current row

For example, to calculate a 3-day moving average of sales:

SELECT
  rep,
  date,  
  amount,
  AVG(amount) OVER (
    PARTITION BY rep
    ORDER BY date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW  
  ) AS moving_avg
FROM sales;

Result:

rep	date	amount	moving_avg
John	1/1/23	100	100.0000
John	1/2/23	200	150.0000
John	1/4/23	300	200.0000
Jane	1/3/23	150	150.0000
Jane	1/5/23	250	200.0000

The moving average is calculated over the current row and 2 previous rows for each rep. Note that the first row for John has no previous rows, so only the current row amount is used. The second row uses the current and 1 preceding, while the third uses the full 3-row frame. This pattern then repeats for Jane in a new partition.

The RANGE keyword is similar but defines the frame in terms of a value range of the ORDER BY column instead of a fixed number of rows. This is useful for things like bucketing values into discrete numeric ranges.

Benefits of Window Functions

Window functions provide several key benefits over using standard aggregations or self joins to calculate analytics:

Simplified queries – Window functions allow you to calculate complex analytics in a single query without the need for joining the table to itself or nesting subqueries. This makes the code easier to write, read, and maintain.
Better performance – Because the window calculations are applied after the main query‘s joins, filters, and aggregations, the optimizer often processes them more efficiently than a self-join approach. The database also does not need to create large intermediate result sets.
Enhanced functionality – Window functions can do things that would be very difficult with standard SQL, like ranking rows, accessing values from other rows, and calculating moving averages. They provide a concise syntax for these use cases.

Conclusion

Window functions are a powerful tool for calculating analytics and generating insights from your data. By understanding how the OVER clause, PARTITION BY, ORDER BY, and frame boundaries work, you can begin leveraging them to write simpler, faster, and more capable SQL queries.

The key is to think about your data in terms of windows – which rows do you want to perform a calculation over for each row? Partitioning, ordering, and framing provide flexible ways to define those windows based on your analysis goals. I encourage you to practice using window functions in your own projects and explore the other functions I didn‘t cover here, like PERCENTILE_CONT() and CUME_DIST(). The more you use them, the more opportunities you will find to replace complex, multi-step logic with elegant window queries!

How to Use Window Functions in SQL – with Example Queries

Understanding Window Functions

Types of Window Functions

Aggregate Window Function Examples

Ranking Function Examples

Value Function Examples

Defining Window Frames

Benefits of Window Functions

Conclusion

Related

The SQL Handbook: A Comprehensive Guide for Web Developers

SQL Insert Into and Insert Statements: A Comprehensive Guide with MySQL Examples

ROW_NUMBER in SQL: Mastering the Top Examples in SQL Server

SQL CREATE TABLE Explained with Syntax Examples for MySQL and Postgres

The SQL LIKE Operator: A Comprehensive Guide with Example Syntax

SQL AVG – The SQL Average Function Explained with Syntax Examples

Understanding Window Functions

Types of Window Functions

Aggregate Window Function Examples

Ranking Function Examples

Value Function Examples

Defining Window Frames

Benefits of Window Functions

Conclusion

Related

Similar Posts