In SQL, grouping data is a powerful operation that allows you to aggregate and analyse data effectively. This is especially useful when you want to generate aggregated data or perform calculations. You can also filter the returned aggregations based on the aggregated results rather than individual row details. This cannot be done with the WHERE clause, instead, we need to use the HAVING clause to filter grouped data. In this article, we will explore the SQL group by and SQL having clause.
Group by and having clause table of contents
Learn SQL online
SQL group by Syntax
Grouping data in SQL involves categorizing rows in a table into subsets based on the values in one or more columns. Once the data is grouped, you can perform aggregate functions like COUNT, SUM, AVG, MAX, and MIN on these subsets to retrieve meaningful information. You may for example want to COUNT the number of sales in a particular month, in which case you would need to group the data by month and then get the total count for each month.
The SQL query to group data uses the GROUP BY clause, which specifies the columns by which you want to group your data. For example:
SELECT column1, column2, COUNT(*)
FROM table_name
GROUP BY column1, column2;
This query groups the data in “table_name” by the values in “column1” and “column2” and calculates the count of rows in each group. Note that the GROUP BY clause must be placed after the WHERE clause as the row level data is filtered before the GROUP BY condition is applied. In addition you can only SELECT fields that are contained withing the GROUP BY clause or use aggregated functions. In the above example the COUNT function specifies the wildcard * that indicates you want to count the entire row. You can also specify individual columns to count. As well as the COUNT aggregate function you can also use the SUM, AVG, MIN, and MAX functions in the same query. Each column must be sepperated by a comma.
SQL group by examples
Let’s consider a practical example. Imagine we have a “sales” table that contains information about sales transactions, including the product ID, salesperson ID, and sale amount. We want to find the total sales for each salesperson. Here’s how we can do it:
SELECT salesperson_id, SUM(sale_amount) as total_sales
FROM sales
GROUP BY salesperson_id;
In this example, we group the data by “salesperson_id” and calculate the total sales for each salesperson. This gives us a list of salespeople and their total sales. Notice that we have also introduced a column alias “total_sales”. We can do this by specifying the keyword as and then giving the name that we want the column to have. This can be done with any column in the SELECT statement but you must use the original column name in the GROUP by clause.
Now, let’s say we want to filter this result to only show salespeople with total sales exceeding a specific threshold, such as £10,000. We can achieve this using the HAVING clause. The HAVING clause is placed after the GROUP BY clause and acts in a simular way to the WHERE clause but applies the filter to the entire group. To use the HAVING clause you need to specify the aggregation that you want to filter on. In the example below we have used the SUM aggregation but you can use any of the other aggregations available.
SELECT salesperson_id, SUM(sale_amount) as total_sales
FROM sales
GROUP BY salesperson_id
HAVING SUM(sale_amount) > 10000;
The HAVING clause is used to filter the grouped data after the aggregation is performed. In this case, it filters out salespeople with total sales greater than £10,000, giving us a list of high-performing salespeople.
Conclusion
Grouping data in SQL is a valuable skill when you need to analyze and summarize information within a database. By using the GROUP BY clause, you can categorize data into subsets, and with aggregate functions, you can calculate relevant statistics for each group. The HAVING clause is a powerful tool to further filter grouped data based on specific conditions, allowing you to extract meaningful insights from your data.
In this article, we’ve covered the theory of grouping data and provided practical examples of how to use the GROUP BY clause and the HAVING clause to achieve your data analysis goals. By mastering these SQL techniques, you can efficiently extract, summarize, and filter data in your database, gaining valuable insights for decision-making and reporting.
Please see the other SQL learning articles below as part of this learning SQL online course.
If you would like an instructor lead SQL course that goes into more detail with worked examples then we also provide a video course to learn SQL online hosted on UDEMY.