How to group by in SQL
Learn how to group and aggregate data effectively using SQL’s GROUP BY
clause.
When working with databases, it’s often necessary to analyze and summarize data rather than just list it. Whether you’re building reports, dashboards, or performing ad-hoc data analysis, the SQL GROUP BY
clause is an indispensable tool. It enables you to split data into groups and apply aggregate functions to each group, allowing for powerful insights.
In this article, we will explain how GROUP BY
works, walk through various use cases, and provide examples to help you understand how and when to use it effectively.
Learn SQL online
What Is the GROUP BY Clause?
The GROUP BY
clause is used in conjunction with aggregate functions like:
COUNT()
– counts the number of rowsSUM()
– adds up numeric valuesAVG()
– calculates the averageMIN()
/MAX()
– finds the smallest/largest value
When you GROUP BY
one or more columns, SQL groups rows that have the same values in those columns, and then performs aggregate calculations for each group.
Example Table: SALES
For illustration, let’s assume we have a table called sales
with the following schema and data:
CREATE TABLE sales (
id INT,
product VARCHAR(50),
region VARCHAR(50),
quantity INT,
price DECIMAL(10, 2)
);
Sample data:
id | product | region | quantity | price |
---|---|---|---|---|
1 | Laptop | West | 3 | 900.00 |
2 | Monitor | East | 5 | 200.00 |
3 | Laptop | East | 2 | 950.00 |
4 | Keyboard | West | 7 | 50.00 |
5 | Monitor | West | 3 | 210.00 |
6 | Laptop | East | 1 | 970.00 |
Basic GROUP BY Usage
Let’s say we want to find the total number of items sold for each product.
SELECT product, SUM(quantity) AS total_quantity
FROM sales
GROUP BY product;
Explanation:
GROUP BY product
groups the rows by the product name.SUM(quantity)
computes the total quantity for each product.
Result:
product | total_quantity |
---|---|
Laptop | 6 |
Monitor | 8 |
Keyboard | 7 |
This shows us that:
6 laptops were sold,
8 monitors were sold,
and 7 keyboards were sold.
Grouping by Multiple Columns
We can also group by more than one column. For instance, if we want to know how many units of each product were sold in each region:
SELECT product, region, SUM(quantity) AS total_quantity
FROM sales
GROUP BY product, region;
Explanation:
The result will have a row for each unique combination of
product
andregion
.
Result:
product | region | total_quantity |
---|---|---|
Laptop | West | 3 |
Laptop | East | 3 |
Monitor | East | 5 |
Monitor | West | 3 |
Keyboard | West | 7 |
This more granular view is useful for regional breakdowns, common in reporting and dashboarding.
Counting Records Per Group
To find how many transactions were recorded for each product (regardless of how many units were sold), you can use COUNT(*)
:
SELECT product, COUNT(*) AS num_sales
FROM sales
GROUP BY product;
Result:
product | num_sales |
---|---|
Laptop | 3 |
Monitor | 2 |
Keyboard | 1 |
This tells us:
There were 3 laptop sales transactions,
2 monitor sales,
and 1 keyboard sale.
This is especially helpful for understanding the volume of activity for different product lines.
Filtering Grouped Results with HAVING
Unlike the WHERE
clause, which filters rows before grouping, the HAVING
clause filters the grouped results after aggregation.
For example, let’s say we only want to see products that sold more than 6 units in total:
SELECT product, SUM(quantity) AS total_quantity
FROM sales
GROUP BY product
HAVING SUM(quantity) > 6;
Result:
product | total_quantity |
---|---|
Monitor | 8 |
Keyboard | 7 |
This is useful for highlighting high-performing categories or outliers.
Combining GROUP BY with ORDER BY
You can sort the grouped results using ORDER BY
, which is useful for ranking or prioritizing outputs.
Example: Sort regions by total quantity sold in descending order.
SELECT region, SUM(quantity) AS total_sold
FROM sales
GROUP BY region
ORDER BY total_sold DESC;
Result:
region | total_sold |
---|---|
West | 13 |
East | 8 |
This provides insight into which regions are the most active or profitable.
GROUP BY Best Practices
Every selected column must be either in the
GROUP BY
clause or used in an aggregate function.
Otherwise, SQL will throw an error (or in some dialects, return unpredictable results).Avoid using
SELECT *
when usingGROUP BY
. Always specify columns clearly.Index the columns used in
GROUP BY
if working with large datasets to optimize performance.Use
HAVING
only when filtering aggregated values. UseWHERE
to filter raw data before grouping.
Summary Table
Clause | Purpose |
---|---|
GROUP BY | Groups rows sharing the same values into buckets |
SUM() / AVG() / COUNT() | Perform calculations on each group |
HAVING | Filters groups based on aggregate conditions |
ORDER BY | Sorts the grouped output |
Summary
Mastering GROUP BY
will transform the way you interact with SQL databases. It unlocks the ability to generate detailed reports, track performance metrics, and analyze complex datasets. Whether you’re monitoring sales by region, calculating user engagement by channel, or summarizing event logs by type—GROUP BY
is your go-to tool
Please see the other SQL video learning articles below as part of this learning SQL online course video.
If you would like an instructor lead SQL course that goes into more detail with worked examples then we also provide a video course to learn SQL online hosted on UDEMY.