Compute a Difference (Delta) Between Two Columns on Different Rows
MySQL is an open-source, Relational Database Management System that stores data in a structured format using rows and columns. It’s software that enables users to create, manage, and manipulate databases. Similar to SQL we use queries to store and access data in the MySQL databases.
In this article, we’ll learn how to calculate the difference (delta) between two columns on different rows. Before moving into this article, you need to install MySQL. To do so you can refer to the below articles,
Computing Difference (Delta) Between Two Columns on Different Rows
In MySQL, computing the difference between two columns on different rows can be done in multiple ways,
- Using Window Functions
- Using SELF JOIN
- Using Subqueries
- Using Common Table Expressions (CTEs)
First, we’ll create a table named Beginner_table with two columns geek_id, and geek_value.
CREATE TABLE Beginner_table(
geek_id SERIAL PRIMARY KEY,
geek_value INT
);
Now we’ll insert values 100,200,300,400,500 into the table
INSERT INTO Beginner_table(geek_value) VALUES (100),(200),(300),(400),(500);
After execution of the above commands, Beginner_table is ready with some data which is helpful for further computations. Now our table looks like,
geek_id |
geek_value |
---|---|
1 |
100 |
2 |
200 |
3 |
300 |
4 |
400 |
5 |
500 |
Now here are some examples of computing the difference or delta between ‘geek_value‘ columns of different rows using ‘Beginner_table‘ which we have created above.
1. Using Window Functions
Window functions in MySQL allow for performing calculations across a set of rows related to the current row. Using LAG(), and LEAD() window functions, we can compute the delta between values in two columns on different rows.
Use the below MySQL command to compute the delta between two columns on different rows using window functions.
1) Using LAG() window function:
SELECT
geek_value,
geek_value - LAG(geek_value) OVER ( ORDER BY geek_id ) AS Delta
FROM
Beginner_table;
Output:
geek_value |
Deta |
---|---|
100 |
NULL |
200 |
100 |
300 |
100 |
400 |
100 |
500 |
100 |
Explanation: LAG(geek_value) OVER (ORDER BY geek_id): This part of the query calculates the value of the ‘geek_value‘ column from the previous row, ordered by the ‘geek_id‘ column. The LAG() function retrieves the value from the previous row based on the specified ordering. If there’s no previous row, it returns NULL.
2) Using LEAD() window function:
SELECT
geek_value,
geek_value - LEAD(geek_value) OVER ( ORDER BY geek_id ) AS Delta
FROM
Beginner_table;
Output:
geek_value |
Delta |
---|---|
100 |
-100 |
200 |
-100 |
300 |
-100 |
400 |
-100 |
500 |
NULL |
Explanation: LEAD(geek_value) OVER (ORDER BY geek_id): This part of the query calculates the value of the ‘geek_value‘ column from the next or succeeding row, ordered by the geek_id column. The LEAD() function retrieves the value from the next or succeeding row based on the specified ordering. If there’s no succeeding row, it returns NULL.
2. Using SELF JOIN
Another approach for computing the delta between two columns on different rows is, by using self-join to retrieve the value of the previous row and then calculate the value of the difference or delta.
Use the below command to compute the delta between two columns on different rows using self-join.
SELECT
A.geek_value,
A.geek_value - B.geek_value AS Delta
FROM
Beginner_table A
JOIN
Beginner_table B ON A.geek_id = B.geek_id + 1;
Output:
geek_value |
delta |
---|---|
200 |
100 |
300 |
100 |
400 |
100 |
500 |
100 |
Explanation:
- SELECT A.geek_value,A.geek_value – B.geek_value AS Delta: This part of the query selects the table’s first occurrence of Beginner_table (A) geek_value column and then formulates the difference between the current row value (A.geek_value) and previous row value (B.geek_value).
- JOIN Beginner_table B ON A.geek_id = B.geek_id + 1: This part of the query does the inner join or self join of A and B in both the table. It combines rows where the geek_id of A is the same as the geek_id of B incremented by 1.
3. Using Subqueries
Another approach for computing the delta between two columns on different rows is, by using subqueries to retrieve the value of the previous row and then calculate the value of the difference or delta.
Use the below command to compute the delta between two columns on different rows using subqueries.
SELECT
A.geek_value,
A.geek_value - (SELECT B.geek_value FROM Beginner_table B WHERE B.geek_id + 1 = A.geek_id ) AS Delta
FROM
Beginner_table A;
Output:
geek_value |
Delta |
---|---|
100 |
NULL |
200 |
100 |
300 |
100 |
400 |
100 |
500 |
100 |
Explanation:
- (SELECT B.geek_value FROM Beginner_table B WHERE B.geek_id + 1 = A.geek_id ): This kind of subqueries is called correlated subqueries. For every row value (A.geek_value) of Beginner_table(A), this subquery returns the previous row value(B.geek_value) of Beginner_table(B) by using the subsequent geek_id of the Beginner_table.
- SELECT A.geek_value, A.geek_value – (SELECT B.geek_value FROM Beginner_table B WHERE B.geek_id + 1 = A.geek_id ) AS Delta: This part of the query computes the difference between the first occurrence of Beginner_table(A) geek_value column and value returned by subquery as aliases it as Delta.
4. Using Common Table Expressions (CTEs)
Other ways of computing delta between two columns on multiple rows is, by using common table expressions. CTEs are helpful in defining temporary result sets that can be referenced within a MySQL query.
Use the below command to compute the difference between two columns on multiple rows using CTE.
WITH previous_values AS (
SELECT
geek_id,
geek_value,
LAG(geek_value) OVER (ORDER BY geek_id) AS previous_value
FROM
Beginner_table
)
SELECT
geek_id,
geek_value,
geek_value - previous_value AS Delta
FROM
previous_values;
Output:
geek_id |
geek_value |
Delta |
---|---|---|
1 |
100 |
NULL |
2 |
200 |
100 |
3 |
300 |
100 |
4 |
400 |
100 |
5 |
50 |
100 |
Explanation:
- WITH previous_values AS (….): This part of the query defines a Common Table Expression (CTE) named previous_values. CTEs allow you to create temporary result sets that can be referenced within the same query.
- SELECT geek_id, geek_value, LAG(geek_value) OVER (ORDER BY geek_id) AS previous_value FROM Beginner_table: Inside the CTE, this query selects three columns (geek_id, geek_value, and value of previous_value) from the Beginner_table. The LAG function retrieves the value of geek_value from the previous row based on the ordering specified by geek_id.
- SELECT geek_id, geek_value, geek_value – previous_value AS Delta FROM previous_values: This part of the query selects columns geek_id and geek_value from the previous_values CTE, and calculates the difference (delta) between the current geek_value and the previous row’s geek_value, and aliases the result as Delta.
Let us look into an example use case of yearly sales data, where we compute the difference in sales between each year using the above-said concept. The table considered here is ‘sales_table‘ having columns for date and sales. Now we compute the difference (delta) between sales of each year naming the column as Delta.
The table considered below is sales_table,
Date |
sales |
---|---|
2019-02-11 |
3000 |
2020-03-14 |
2000 |
2021-01-24 |
6000 |
2022-04-05 |
3500 |
2023-06-18 |
5000 |
Now let us compute the difference between sales of each year as Delta using the window function LAG(),
SELECT
Date,
sales - LAG(sales) OVER ( ORDER BY Date ) AS Delta
FROM
sales_table;
Output:
Date |
Delta |
---|---|
2019-02-11 |
NULL |
2020-03-14 |
-1000 |
2021-01-24 |
4000 |
2022-04-05 |
-2500 |
2023-06-18 |
1500 |
Conclusion
Therefore, In MySQL, we can compute the difference (delta) between two columns on different rows using window functions, self-join, subqueries, and CTEs common table expressions. Each approach may vary in computation speed and methodology. So one can choose ones own approach according to the requirement.