Today, I am going to explain how to do a group by clause in your SQL queries. SQL (Structured Query Language) is a very popular query language that is widely used to access and manipulate records stored in a database. The GROUP BY clause is also a SQL statement that is used to group identical values. Let’s first understand this clause what it does and why we use this clause in SQL queries.
The GROUP BY clause is used to group the same values and return a summary result. If a column that is used in the GROUP BY statement has NULL values, then this statement will consider them equal and group all NULL values in a single group.
We use this clause in with the SELECT statement and specify it after the WHERE clause. If you want to use ORDER BY statements in the same query, then you should place a GROUP BY statement before ORDER BY clauses. We also need to specify aggregate functions like COUNT, MIN, MAX, SUM, and AVG along with column names to get the desired output.
If you want to learn how to do a group by clause, then your next step should be to learn its syntax which is given below.
1 2 3 4 5 6 |
SELECT ColumnName, FunctionName FROM TableName WHERE Condition GROUP BY ColumnName |
- ColumnName is the name of the columns which you want to return in your output, or you want to specify in the GROUP BY statement to group the identical values
- FunctionName is the aggregate function that you can use in a SQL query like COUNT(), MIN(), MAX(), SUM(), and AVG()
- TableName is the name of the table from which you want to return the result set in your output
- Condition is the statement that is used in the WHERE clause to filter output data
Next, I will demonstrate how to do a group by clause with the help of a few use cases in the below section.
Use Cases
This section will explain various use cases of using group by clause. I will use a table “Resources” to demonstrate all these use cases. This table stores employee details like their name, id, city, designation, salary, and age. I will show you how to do a group by clause with several other SQL statements to get our desired results.
Let me show you the table content by running the below query so that you can cross-check and validate the output results returned by the various use cases.
1 2 3 4 |
SELECT * FROM [DemoDB].[dbo].[Resources] |
Here is the screenshot of the output returned by the above query.
Let’s start with its basic use case.
Basic use case
This section will help you understand its basic use. Here, I have returned the total number of employees based on their employment nature. There are two values of employment nature, one is permanent and another one is contractual.
I have executed the below query in which I have used the employment nature column “EmpNature” and an aggregated function COUNT() in the SELECT statement and then added a GROUP BY clause after FROM statement on the same column “EmpNature” to group its similar values.
1 2 3 4 5 |
SELECT EmpNature, COUNT(*) AS [No. of Resources] FROM [DemoDB].[dbo].[Resources] GROUP BY EmpNature |
I got the below result after running the above query. Here we can see the number of resources working in the company as permanent and on contract.
Let me show you how to do a group by clause with multiple columns. Let’s take the last example as a reference and add additional columns in the same query. I have added the column “City” in the SELECT statement and GROUP BY clause to display the number of employees based on their city and employment nature.
1 2 3 4 5 |
SELECT EmpNature, City, COUNT(*) AS [No. of Resources] FROM [DemoDB].[dbo].[Resources] GROUP BY EmpNature, City |
The output of the above query has returned in the below image. Here, we can see GROUP BY statement has clubbed all numbers of employees based on their city and employment nature. We can get those 3 employees are working in Mumbai as permanent whereas London has 1 permanent and 1 contractual employee.
We can also use multiple functions with this statement. You can see the below example where I have used two aggregate functions COUNT() and AVG(). This will return the total number of employees and their average age based on their employment nature.
1 2 3 4 5 |
SELECT EmpNature, COUNT(*) AS [No. of Resources], AVG(Age) AS [Average Age] FROM [DemoDB].[dbo].[Resources] GROUP BY EmpNature |
The output is showing that 2 contractual employees are working and their average age is 25 whereas the average age of all 4 permanent employees is 27 years.
Use case with the WHERE clause
Here, I will show you how to do a group by clause with a WHERE statement. We use the WHERE clause to apply condition-based filtering to return our desired output result. When we use the GROUP BY clause along with the WHERE clause then it first removes rows that do not meet the conditions in the WHERE clause and then grouping operation performed on the filtered values.
I have taken the first use case example as a reference and added a condition in the WHERE clause. You can have a look at the below SQL statement which will return the total number of employees based out of Mumbai and group them based on their employment nature.
1 2 3 4 5 6 |
SELECT EmpNature, COUNT(*) AS [No. of Resources] FROM [DemoDB].[dbo].[Resources] WHERE City = 'Mumbai' GROUP BY EmpNature |
The output of the above query can be seen in the below image. Here, we can see 3 permanent employees in Mumbai and no contractual employees working in the Mumbai office. You can get it validated by looking at the source table output given above.
Use case with the ORDER BY statement
We can also use group by clause with an ORDER BY statement. This section will demonstrate how to do a group by clause with an ORDER BY statement. Here, I will add the ORDER BY statement after the GROUP BY clause as it is showing in the below query.
Again, I have taken the same query that I used in the first example and added the ORDER BY statement.
1 2 3 4 5 6 |
SELECT EmpNature, COUNT(Name) AS [Resources] FROM [DemoDB].[dbo].[Resources] GROUP BY EmpNature ORDER BY COUNT(Name) DESC |
You can see the output; this time first row is displaying for permanent employment nature whereas the second row is showing for contractual employment nature. It was in reverse order where permanent employment nature was shown in the second row in the above examples where the ORDER BY statement was not specified.
Use case with the HAVING clause
Here, we will learn how to do a group by clause with a HAVING statement. You can see two sets of queries.
- The first set will return all cities and the average age of the employees.
- The second query will demonstrate the use case of the HAVING clause in which output will return only those cities and the average salary of all employees where the average age of its employees is less than 25 years. We will use the HAVING clause to get this output.
1 2 3 4 5 6 7 8 9 10 11 |
SELECT City, AVG(Age) AS [AvgAge] FROM [DemoDB].[dbo].[Resources] GROUP BY City GO SELECT City, AVG(Salary) AS [Avg Salary] FROM [DemoDB].[dbo].[Resources] GROUP BY City HAVING AVG(Age)<25 |
Let’s understand the output of the above queries with the help of the below image. Here, we can see two sets of output, one for each respective query.
- The first set of results is showing all the cities and the average age of all employees
- The second result is showing only those cities and their average salary where the average age of all employees is less than 25 years.
Conclusion
This article has explained how to do a group by clause to get your desired output. The GROUP BY statement is used to group identical values from the specified column to get the desired output. I have showcased various use cases in which I demonstrated how to do a group by clause with the WHERE clause, ORDER BY statement, HAVING statement, and several other examples.
Thank you for reading this article and I would encourage you all to write your feedback in the comment section so that we can improve in a better way.
- Configure backup storage redundancy for Azure Cosmos DB account - May 15, 2024
- Difference between SQL SELECT UNIQUE and SELECT DISTINCT - January 16, 2024
- How to do a group by clause - April 12, 2023