Window Functions

Window functions perform calculations across a set of table rows that are related to the current row. Unlike regular aggregate functions, window functions do not cause rows to become grouped into a single output row - the rows retain their separate identities.

Architecture Overview

graph TB
    subgraph "Entry Points"
        Query["execute_select<br/>[query.rs:155]"]
        HasWindow["has_window_functions<br/>[window.rs:2522]"]
    end
    
    subgraph "Execution Coordinators"
        ExecWindow["execute_select_with_window_functions<br/>[window.rs:115]"]
        ExecPresorted["execute_select_with_window_functions_presorted<br/>[window.rs:135]"]
        ExecPregrouped["execute_select_with_window_functions_pregrouped<br/>[window.rs:156]"]
        ExecStreaming["execute_select_with_window_functions_streaming<br/>[window.rs:422]"]
        ExecLazy["execute_select_with_window_functions_lazy_partition<br/>[window.rs:591]"]
    end
    
    subgraph "Core Computation"
        ParseWF["parse_window_functions<br/>[window.rs:974]"]
        ComputeWF["compute_window_function<br/>[window.rs:1216]"]
        ComputePartition["compute_window_for_partition<br/>[window.rs:1423]"]
        ComputeAggWF["compute_aggregate_window_function<br/>[window.rs:2111]"]
    end
    
    subgraph "Specialized Computations"
        ComputeLeadLag["compute_lead_lag<br/>[window.rs:1541]"]
        ComputeNtile["compute_ntile<br/>[window.rs:1585]"]
        ComputeRank["compute_rank<br/>[window.rs:1639]"]
        ComputePercentRank["compute_percent_rank<br/>[window.rs:1821]"]
        ComputeCumeDist["compute_cume_dist<br/>[window.rs:1859]"]
    end
    
    subgraph "Support Functions"
        PrecomputeOrderBy["precompute_order_by_values<br/>[window.rs:1967]"]
        SortByOrderValues["sort_by_order_values<br/>[window.rs:2092]"]
        ParseSelectList["parse_select_list_for_window<br/>[window.rs:748]"]
    end
    
    Query -->|"has window functions?"| HasWindow
    HasWindow -->|"yes"| ExecWindow
    ExecWindow --> ParseWF
    
    ExecWindow --> ComputeWF
    ExecPresorted --> ComputeWF
    ExecPregrouped --> ComputeWF
    ExecStreaming --> ComputePartition
    ExecLazy --> ComputePartition
    
    ComputeWF -->|"partition rows"| ComputePartition
    ComputeWF -->|"aggregate functions"| ComputeAggWF
    
    ComputePartition --> PrecomputeOrderBy
    ComputePartition --> SortByOrderValues
    ComputePartition --> ComputeLeadLag
    ComputePartition --> ComputeNtile
    ComputePartition --> ComputeRank
    
    ComputeAggWF --> PrecomputeOrderBy
    ComputeAggWF --> SortByOrderValues
    
    ExecWindow --> ParseSelectList
    ExecStreaming --> ParseSelectList
    ExecLazy --> ParseSelectList

Execution Pipeline

graph TD
    Start["Query with Window Function"] --> Detect["Detect Window Functions<br/>[window.rs:2522-2536]"]
    Detect --> Parse["Parse Window Functions<br/>[window.rs:974-1031]<br/>Extract WindowFunctionInfo"]
    
    Parse --> OptCheck{"Optimization<br/>Eligible?"}
    
    OptCheck -->|"PARTITION BY + LIMIT<br/>+ indexed partition col"| LazyPartition["Lazy Partition Fetch<br/>[window.rs:591-745]<br/>Fetch partitions one-by-one<br/>Stop at LIMIT"]
    
    OptCheck -->|"No PARTITION BY<br/>+ indexed ORDER BY col"| PreSorted["Pre-sorted Optimization<br/>[window.rs:135-151]<br/>Skip sorting"]
    
    OptCheck -->|"PARTITION BY<br/>+ indexed partition col"| PreGrouped["Pre-grouped Optimization<br/>[window.rs:156-172]<br/>Skip hash grouping"]
    
    OptCheck -->|"PARTITION BY + LIMIT"| Streaming["Streaming with LIMIT<br/>[window.rs:422-586]<br/>Process partitions until LIMIT"]
    
    OptCheck -->|"Standard case"| Standard["Standard Execution<br/>[window.rs:115-131]"]
    
    LazyPartition --> BuildResult["Build Result Rows<br/>[window.rs:352-417]"]
    PreSorted --> Compute["Compute Window Values<br/>[window.rs:1216-1416]"]
    PreGrouped --> Compute
    Streaming --> BuildResult
    Standard --> Compute
    
    Compute --> Partition{"Has<br/>PARTITION BY?"}
    
    Partition -->|"No"| SinglePartition["Single Partition<br/>[window.rs:1250-1293]<br/>All rows together"]
    Partition -->|"Yes"| MultiPartition["Hash-based Partitioning<br/>[window.rs:1295-1354]<br/>Group by partition key"]
    
    SinglePartition --> ComputePartition["compute_window_for_partition<br/>[window.rs:1423-1538]"]
    MultiPartition -->|"≥10 partitions<br/>+ ≥1000 rows"| Parallel["Parallel Partition Processing<br/>[window.rs:1359-1390]<br/>Rayon parallel_iter"]
    MultiPartition -->|"Small dataset"| Sequential["Sequential Processing<br/>[window.rs:1391-1415]"]
    
    Parallel --> ComputePartition
    Sequential --> ComputePartition
    
    ComputePartition --> FuncType{"Function<br/>Type?"}
    
    FuncType -->|"Aggregate<br/>(SUM, COUNT, etc.)"| AggCompute["compute_aggregate_window_function<br/>[window.rs:2111-2520]<br/>With frame bounds"]
    FuncType -->|"LEAD/LAG"| LeadLag["compute_lead_lag<br/>[window.rs:1541-1582]"]
    FuncType -->|"NTILE"| Ntile["compute_ntile<br/>[window.rs:1585-1636]"]
    FuncType -->|"RANK/DENSE_RANK"| Rank["compute_rank<br/>[window.rs:1639-1693]"]
    FuncType -->|"Other ranking"| Other["ROW_NUMBER, etc.<br/>[window.rs:1530-1532]"]
    
    AggCompute --> BuildResult
    LeadLag --> BuildResult
    Ntile --> BuildResult
    Rank --> BuildResult
    Other --> BuildResult
    
    BuildResult --> End["Return ExecutorMemoryResult<br/>or streaming result"]

Syntax

function_name([expression]) OVER (
    [PARTITION BY partition_expression [, ...]]
    [ORDER BY sort_expression [ASC | DESC] [, ...]]
)

PARTITION BY: Divides rows into groups (partitions) that share the same values
ORDER BY: Defines the order of rows within each partition

Available Window Functions

Ranking Functions

ROW_NUMBER()

Assigns a unique sequential number to each row within a partition.

SELECT name, dept, salary,
       ROW_NUMBER() OVER (ORDER BY salary DESC) as row_num
FROM employees;

Result:

name   | dept        | salary | row_num
-------+-------------+--------+--------
Diana  | Engineering | 80000  | 1
Frank  | Engineering | 75000  | 2
Charlie| Engineering | 70000  | 3
Bob    | Sales       | 60000  | 4
Eve    | Sales       | 55000  | 5
Alice  | Sales       | 50000  | 6

RANK()

Assigns a rank to each row within a partition. Rows with equal values receive the same rank, with gaps in the sequence.

SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) as rank
FROM employees;

If two employees have the same salary and are ranked 2, the next rank would be 4 (not 3).

DENSE_RANK()

Similar to RANK(), but without gaps in the ranking sequence.

SELECT name, salary,
       DENSE_RANK() OVER (ORDER BY salary DESC) as dense_rank
FROM employees;

If two employees have the same salary and are ranked 2, the next rank would be 3.

NTILE(n)

Distributes rows into a specified number of buckets.

SELECT name, salary,
       NTILE(3) OVER (ORDER BY salary DESC) as tertile
FROM employees;

Divides employees into 3 groups based on salary.

PERCENT_RANK()

Returns the relative rank of the current row as a percentage (0 to 1).

SELECT name, salary,
       PERCENT_RANK() OVER (ORDER BY salary) as pct_rank
FROM employees;

CUME_DIST()

Returns the cumulative distribution of a value (fraction of rows with values <= current row).

SELECT name, salary,
       CUME_DIST() OVER (ORDER BY salary) as cume_dist
FROM employees;

Value Functions

LAG(expression [, offset [, default]])

Accesses a value from a previous row within the partition.

-- Get previous row's salary
SELECT name, salary,
       LAG(salary) OVER (ORDER BY salary) as prev_salary
FROM employees;

-- Get salary from 2 rows back, with default of 0
SELECT name, salary,
       LAG(salary, 2, 0) OVER (ORDER BY salary) as prev2_salary
FROM employees;

LEAD(expression [, offset [, default]])

Accesses a value from a following row within the partition.

-- Get next row's salary
SELECT name, salary,
       LEAD(salary) OVER (ORDER BY salary) as next_salary
FROM employees;

-- Get salary from 2 rows ahead, with default of 0
SELECT name, salary,
       LEAD(salary, 2, 0) OVER (ORDER BY salary) as next2_salary
FROM employees;

FIRST_VALUE(expression)

Returns the first value within the partition.

SELECT name, dept, salary,
       FIRST_VALUE(salary) OVER (PARTITION BY dept ORDER BY salary DESC) as max_in_dept
FROM employees;

LAST_VALUE(expression)

Returns the last value within the current window frame.

SELECT name, dept, salary,
       LAST_VALUE(salary) OVER (PARTITION BY dept ORDER BY salary DESC) as current_last
FROM employees;

NTH_VALUE(expression, n)

Returns the nth value within the partition.

SELECT name, salary,
       NTH_VALUE(salary, 2) OVER (ORDER BY salary DESC) as second_highest
FROM employees;

Using PARTITION BY

PARTITION BY divides the result set into partitions, and window functions are applied to each partition separately.

-- Row numbers within each department
SELECT name, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) as dept_rank
FROM employees
ORDER BY dept, dept_rank;

Result:

name   | dept        | salary | dept_rank
-------+-------------+--------+----------
Diana  | Engineering | 80000  | 1
Frank  | Engineering | 75000  | 2
Charlie| Engineering | 70000  | 3
Bob    | Sales       | 60000  | 1
Eve    | Sales       | 55000  | 2
Alice  | Sales       | 50000  | 3

Aggregate Functions as Window Functions

Standard aggregate functions can also be used as window functions:

-- Running total
SELECT name, salary,
       SUM(salary) OVER (ORDER BY salary) as running_total
FROM employees
ORDER BY salary;

Result:

name   | salary | running_total
-------+--------+--------------
Alice  | 50000  | 50000
Eve    | 55000  | 105000
Bob    | 60000  | 165000
Charlie| 70000  | 235000
Frank  | 75000  | 310000
Diana  | 80000  | 390000

Common Use Cases

Top N per Group

Find the highest paid employee in each department:

SELECT * FROM (
    SELECT name, dept, salary,
           ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) as rn
    FROM employees
) ranked
WHERE rn = 1;

Running Totals

Calculate cumulative sales:

SELECT order_date, amount,
       SUM(amount) OVER (ORDER BY order_date) as cumulative_sales
FROM orders;

Compare to Previous/Next

Calculate month-over-month change:

SELECT month, revenue,
       revenue - LAG(revenue) OVER (ORDER BY month) as change
FROM monthly_sales;

Percentile Calculation

Find employees in the top 10% of salaries:

SELECT * FROM (
    SELECT name, salary,
           PERCENT_RANK() OVER (ORDER BY salary DESC) as pct
    FROM employees
) ranked
WHERE pct <= 0.10;

Notes

When ORDER BY is omitted, the entire partition is treated as a single group
NULL values are handled according to SQL standards
Window functions can only appear in SELECT and ORDER BY clauses
Multiple window functions can be used in the same query