The ROW_NUMBER()
function in MySQL is a window function that assigns a unique, sequential integer to rows within a result set, starting from 1 for the first row and incrementing by 1 for each subsequent row. The rows are numbered according to the order specified in the query using the ORDER BY
clause.
Syntax
ROW_NUMBER() OVER ([PARTITION BY partition_expression] ORDER BY order_expression)
Parameters:
PARTITION BY partition_expression
(Optional):
Divides the result set into partitions to which theROW_NUMBER()
function is applied. The row number is calculated for each partition separately.ORDER BY order_expression
:
Specifies the order in which the rows are numbered within each partition (or the entire result set ifPARTITION BY
is omitted).
Important Points:
- No
PARTITION BY
: If thePARTITION BY
clause is omitted, the row numbers will be assigned sequentially across the entire result set. - Sequential Numbering: The numbering starts from
1
for the first row in each partition and increments by 1 for each subsequent row.
Example 1: Basic ROW_NUMBER()
without PARTITION BY
SELECT
id,
name,
ROW_NUMBER() OVER (ORDER BY id) AS row_num
FROM employees;
In this example:
- The query assigns a sequential number (
row_num
) to each employee in theemployees
table. - The numbering is based on the
id
column in ascending order.
Example 2: ROW_NUMBER()
with PARTITION BY
SELECT
department,
id,
name,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY name) AS row_num
FROM employees;
In this example:
- The query assigns a row number (
row_num
) to each employee, but resets the numbering for each department (due toPARTITION BY department
). - The row numbers within each department are assigned in alphabetical order based on the
name
column.
Example 3: Using ROW_NUMBER()
to Get Top N Records Per Group
Suppose you want to get the top 2 employees based on salary from each department. You can use ROW_NUMBER()
to rank employees within each department:
SELECT
department,
id,
name,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees
WHERE row_num <= 2;
Explanation:
PARTITION BY department
ensures that the ranking starts over for each department.ORDER BY salary DESC
ranks employees in each department from the highest to lowest salary.- The
WHERE row_num <= 2
clause filters out only the top 2 employees from each department.
Example 4: ROW_NUMBER()
with LIMIT
(Paginated Results)
You can use ROW_NUMBER()
for pagination by selecting rows within a specific range.
WITH ranked_employees AS (
SELECT
id,
name,
ROW_NUMBER() OVER (ORDER BY id) AS row_num
FROM employees
)
SELECT id, name
FROM ranked_employees
WHERE row_num BETWEEN 11 AND 20;
In this example:
- The
ROW_NUMBER()
assigns a sequential number to each employee, ordered byid
. - The query retrieves employees whose row numbers are between 11 and 20, essentially providing the 2nd page of results (assuming 10 items per page).
Key Use Cases of ROW_NUMBER()
- Ranking and Sorting:
You can assign ranks to rows and then use those ranks to filter or sort results. - Pagination:
Useful for fetching specific subsets of results, such as in paginated web applications. - Deduplication:
You can useROW_NUMBER()
in conjunction with aCTE
(Common Table Expression) or subquery to remove duplicates by keeping only the first row in each group.Example of removing duplicates:
WITH ranked AS ( SELECT id, name, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS row_num FROM employees ) SELECT id, name FROM ranked WHERE row_num = 1;
Performance Considerations
- Window functions like
ROW_NUMBER()
can be computationally expensive, especially on large result sets. You should ensure the underlying tables are indexed appropriately to improve performance. - The
PARTITION BY
clause can be particularly costly in terms of performance since the database needs to group the rows based on the partition expression before applying the function.
Conclusion
The ROW_NUMBER()
function is a powerful tool in MySQL (and other databases that support window functions) for generating row numbers based on specific ordering, which can be useful for ranking, pagination, and handling grouped data.