List the Last 25% Rows in a Result Set in PostgreSQL

In PostgreSQL, extracting specific portions of a result set can be achieved using a variety of SQL techniques. One common requirement is to retrieve the last 25% of rows from a result set. This can be useful for various purposes, such as paginating results, performing analyses on a subset of data, or optimizing queries for performance.

List the Last 25% Rows in a Result Set in PostgreSQL

This article will guide you through different methods to list the last 25% of rows in a result set in PostgreSQL, including:

  1. Row Number with CTE
  2. Subquery with Row Number
  3. Subquery with offset and limit
  4. Dense Function with CTE

Set Up an Environment

For that first, we will create a table in pgAdmin. For the demo, we will make an employee table.

CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
date_of_birth DATE NOT NULL,
position VARCHAR(50) NOT NULL,
department VARCHAR(50) NOT NULL,
hire_date DATE NOT NULL,
salary NUMERIC(10, 2) NOT NULL
);

Inserting the value in the employee table:

INSERT INTO employees (first_name, last_name, date_of_birth, position, department, hire_date, salary)
VALUES
('John', 'Doe', '1985-01-15', 'Software Engineer', 'IT', '2020-05-01', 85000.00),
('Jane', 'Smith', '1990-03-22', 'Project Manager', 'IT', '2019-08-15', 95000.00),
('Emily', 'Jones', '1982-07-09', 'HR Manager', 'HR', '2018-11-01', 90000.00),
('Michael', 'Brown', '1975-02-28', 'Chief Executive Officer', 'Management', '2015-04-12', 150000.00),
('Sarah', 'Davis', '1995-05-17', 'Marketing Specialist', 'Marketing', '2021-01-25', 60000.00),
('David', 'Wilson', '1988-10-10', 'Financial Analyst', 'Finance', '2017-09-15', 78000.00),
('Laura', 'Martinez', '1979-11-22', 'Legal Counsel', 'Legal', '2016-03-30', 110000.00),
('Robert', 'Garcia', '1983-06-05', 'Operations Manager', 'Operations', '2019-12-01', 92000.00),
('Jessica', 'Taylor', '1992-08-14', 'Sales Representative', 'Sales', '2020-02-20', 65000.00),
('Daniel', 'Lee', '1987-09-25', 'Business Analyst', 'IT', '2018-06-10', 82000.00),
('Karen', 'Anderson', '1991-04-30', 'Product Manager', 'Product', '2021-07-18', 93000.00),
('Brian', 'Thomas', '1980-12-12', 'Chief Financial Officer', 'Finance', '2016-10-20', 140000.00);

Output:

Table Data

1. Row Number with CTE

Basically this is the simplest approach because in that we will give the index or id to the whole data set and we will get only the result set which is required. Which means using this we can also first 25% rows.

WITH total_count AS (
SELECT COUNT(*) AS cnt
FROM employees
),
last_25_percent AS (
SELECT *,
ROW_NUMBER() OVER (ORDER BY employee_id) AS row_num
FROM employees
)
SELECT *
FROM last_25_percent, total_count
WHERE row_num > (cnt* 3.0 / 4.0)

Output:

CTE with Row Number

Explanation: In first CTE we just take the total count of rows. While in second CTE we have given row number to all the rows and then in main query applied where condition in which only the rows more than 3/4th will be shown.

2. Subquery with Row Number

Now we will try to make the above query with subquery rather than CTE for that we will replace the CTE with subquery while all the other steps would be same.

SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY employee_id) AS row_num,
(SELECT COUNT(*) FROM employees) AS cnt
FROM employees
) sub
WHERE row_num > (cnt * 3.0 / 4.0)

Output:

Subquery with Row Number

Explanation: For above output we have taken two subquery from which the innermost query returns the cnt column which contains the total rows. While the row_num is coming from the second Subquery.

3. Subquery with offset and limit

Now we can also achieve this thing with Subquery without using Row Number but in that we will be using offset and limit. In which offset will be used to mark the row from which the reading of data will be started and Fetch will take the next and return it to the result set.

SELECT *
FROM employees
OFFSET (SELECT COUNT(*) * 3 / 4 FROM employees)
FETCH NEXT (SELECT COUNT(*) / 4 FROM employees) ROWS ONLY;

Output:

Subquery with Offset and Limit

Explanation: In this we have applied the same subquery twice over here. For offset it will assign the number to each of the data. And fetch will only fetch the required rows.

4. Dense Function with CTE

Dense function is nothing but a Dense_Rank() function is used to give the same rank to the rows if it contains the same value in result set. We will replace the Rank() from the approach 1.

WITH total_count AS (
SELECT COUNT(*) AS cnt
FROM employees
),
ranked_rows AS (
SELECT *,
DENSE_RANK() OVER (ORDER BY employee_id) AS dense_rank
FROM employees
)
SELECT *
FROM ranked_rows, total_count
WHERE dense_rank > (cnt * 3.0 / 4.0)

Output:

Dense Function with CTE

Explanation: Dense_rank is nothing but the rank given to the row and cnt will be the total count which is coming from the total_count CTE.

Conclusion

These are some commonly used approaches to list the last 25% of rows in a result set in PostgreSQL. By using techniques such as CTEs, subqueries, and window functions like ROW_NUMBER() and DENSE_RANK(), you can efficiently fetch specific portions of your data for analysis or other purposes. Experiment with these methods to find the one that best fits your use case and performance requirements.