LINQ is actually a shortened form of Language Integrate Query. LINQ defines features that can be used to retrieve data from any type of data source. This is why LINQ is most important because data itself is the basic foundation of any program and using LINQ, data can be easily retrieve from different types of Data Sources.
How to Use Group by in Linq?
In this post, I will explain the GROUP BY clause in LINQ to SQL. Grouping is a powerful feature that organizes a collection into groups, where each group is associated with a key. If you're a beginner, don't worry, this post is for you.
I have created two tables: Employee and Department.
Employee Table
Id | Name | Country | Address | Salary | Department |
---|---|---|---|---|---|
1 | Du monde entier | France | 67, rue des Cinquante Otages | 5000 | 1 |
2 | Chop-suey Chinese | France | Hauptstr. 29 | 2500 | 2 |
3 | France restauration | USA | 54, rue Royale | 2000 | 3 |
4 | Howard Snyder | UK | 2732 Baker Blvd. | 2400 | 2 |
5 | Carlos Hernández | USA | Carrera 22 con Ave. Carlos Soublette #8-35 | 3500 | 1 |
6 | France restauration | UK | 54, rue Royale | 3200 | 1 |
7 | John Steel | USA | 2732 Baker Blvd. | 2800 | 2 |
8 | Renate Messner | UK | Carrera 22 con Ave. Carlos Soublette #8-35 | 2700 | 3 |
9 | Horst Kloss | UK | Rua da Panificadora, 12 | 2400 | 3 |
10 | Guillermo Fernández | France | Carrera 22 con Ave. Carlos Soublette #8-35 | 3500 | 1 |
11 | Georg Pipps | France | 54, rue Royale | 3200 | 3 |
12 | Isabel de Castro | USA | Av. del Libertador 900 | 2800 | 3 |
13 | Bernardo Batista | UK | Grenzacherweg 237 | 2700 | 1 |
Department
Id | Department |
---|---|
1 | Marketing |
2 | Research and Development |
3 | Accounting and Finance |
I want to calculate the total salary that I'm paying to employees by department. How can we achieve this using the GROUP BY clause?
If we were to do it manually, we would take each record for the Marketing Department and then add up the total salary. Similarly, we would do the same for each record in the Research and Development department. Essentially, we first group the records by department and then sum the salary column within each group.
That's why we are going to use the GROUP BY clause.
Sql Query:
Select Department, SUM(Salary) as TotalSalary
from Employee
Group by Department
Linq Query:
var results = from r in Employees
group r by r.Department into gp
select new
{
DepartmentId = gp.Key,
TotalSalary = gp.Sum(a => a.Salary)
};
foreach (var r in results)
{
Debug.Print(($@"""{r.DepartmentId}"" ""{r.TotalSalary}"""));
}
Output:
Department | TotalSalary |
---|---|
1 | 17900 |
2 | 7700 |
3 | 13100 |
3 | 13100 |
This LINQ query groups the employee records by department and computes the total salary for each department, returning the results as a collection of anonymous objects with the department ID and total salary.
C# Linq Group By on Multiple Columns
Let's extend our analysis to not only consider the department but also include the country. For instance, in France, we aim to calculate the total salary distributed across various departments such as Marketing, Research and Development, Accounting, and Finance. This requires grouping by multiple columns, specifically by Country and then by Department, to aggregate the total salary first by country and subsequently by department.
SQL Query:
Select Country, Department, SUM(Salary) as TotalSalary
from Employee
group by Country, Department
Linq Query:
var results = from r in Employees
group r by new { r.Department, r.Country } into gp
select new
{
DepartmentId = gp.Key.Department,
Country = gp.Key.Country,
TotalSalary = gp.Sum(a => a.Salary)
};
foreach (var r in results)
{
Debug.Print(($@"""{r.DepartmentId}"" ""{r.Country}"" ""{r.TotalSalary}"""));
}
Output:
Country | Department | TotalSalary |
---|---|---|
France | 1 | 8500 |
UK | 1 | 5900 |
USA | 1 | 3500 |
France | 2 | 2500 |
UK | 2 | 2400 |
USA | 2 | 2800 |
France | 3 | 3200 |
UK | 3 | 5100 |
USA | 3 | 4800 |
- `DepartmentId`: This is assigned the department from the key of the group.
- `Country`: This is assigned the country from the key of the group.
- `TotalSalary`: This calculates the sum of the salaries (`a.Salary`) within each group using the `Sum` method.
Above LINQ query groups the employee records by both department and country, then computes the total salary for each combination of department and country, returning the results as a collection of anonymous objects with department ID, country, and total salary.
Use Multiple aggregate functions with order by
Let's illustrate this with an example. We're displaying the total salary by country and department, and suppose we also want to determine the total number of employees.
To find the total number of employees, we can utilize the `Count()` function. For instance, if we examine the Employee table and find 13 rows, we can retrieve the total count by using `Employee.Count()`, resulting in a count of 13. Additionally, we will utilize the `ORDER BY` clause in conjunction with the `GROUP BY` clause.
Sql Query:
Select Country, Department, SUM(Salary) as TotalSalary,COUNT(Name) as TotalEmployees
from Employee
group by Country, Department
order by TotalSalary
Linq Query:
var results = (from r in employees
group r by new { r.Department, r.Country } into gp
select new
{
DepartmentId = gp.Key.Department,
Country = gp.Key.Country,
TotalSalary = gp.Sum(item => item.Salary),
TotalEmployees = gp.Count()
}).OrderBy(a=>a.TotalEmployees);
foreach (var r in results)
{
Debug.Print(($@"""{r.Country}"" ""{r.DepartmentId}"" ""{r.TotalSalary}"" ""{r.TotalEmployees}"""));
}
Output
Country | Department | TotalSalary | TotalEmployees |
---|---|---|---|
UK | 2 | 2400 | 1 |
France | 2 | 2500 | 1 |
USA | 2 | 2800 | 1 |
France | 3 | 3200 | 1 |
USA | 1 | 3500 | 1 |
USA | 3 | 4800 | 2 |
UK | 3 | 5100 | 2 |
UK | 1 | 5900 | 2 |
France | 1 | 8500 | 2 |
- from r in employees: This specifies the source of the data, which is the employees collection.
- group r by new { r.Department, r.Country } into gp: This groups the employee records (r) based on a composite key consisting of both department and country. The into gp part creates a new group variable named gp to store the grouped results.
- select new { DepartmentId = gp.Key.Department, Country = gp.Key.Country, TotalSalary = gp.Sum(item => item.Salary), TotalEmployees = gp.Count() }: This part of the query selects the result for each group. It creates a new anonymous object with four properties:
- DepartmentId: This is assigned the department from the key of the group.
- Country: This is assigned the country from the key of the group.
- TotalSalary: This calculates the sum of the salaries (item.Salary) within each group using the Sum method.
- TotalEmployees: This calculates the total number of employees within each group using the Count method.
- .OrderBy(a=>a.TotalEmployees): This part of the query orders the results by the total number of employees in ascending order.
How to apply the where clause with Group?
In the above scenario, we aim to filter the results obtained from the `GROUP BY` operation. Suppose we only want to consider the salaries of employees in the Research and Development department. To accomplish this filtering, we can utilize the `WHERE` clause, specifying that the department is equal to 2.
Sql Query:
Select Country, Department, SUM(Salary) as TotalSalary,COUNT(Name) as TotalEmployees
from Employee
where Department=2
group by Country, Department
order by TotalSalary
Linq Query:
var results = (from r in employees
where r.Department==2
group r by new { r.Department, r.Country } into gp
select new
{
DepartmentId = gp.Key.Department,
Country = gp.Key.Country,
TotalSalary = gp.Sum(item => item.Salary),
TotalEmployees = gp.Count()
}).OrderBy(a => a.TotalEmployees);
foreach (var r in results)
{
Debug.Print(($@"""{r.Country}"" ""{r.DepartmentId}"" ""{r.TotalSalary}"" ""{r.TotalEmployees}"""));
}
Output:
Country | Department | TotalSalary | TotalEmployees |
---|---|---|---|
UK | 2 | 2400 | 1 |
France | 2 | 2500 | 1 |
USA | 2 | 2800 | 1 |
Use Min and Max aggregate functions with Group By
Lets’ say, I want the min salary that I’m paying to employees by Department.
Sql Query:
Select Department, MIN(Salary) as TotalSalary
from Employee
Group by Department
Linq Query:
var results = (from r in employees
group r by r.Department into gp
select new
{
DepartmentId = gp.Key,
MinDepartmentSalary = gp.Min(item => item.Salary),
});
Output
Department | MinDepartmentSalary |
---|---|
1 | 2700 |
2 | 2400 |
3 | 2000 |
- from r in employees: This part starts the query by iterating over each element in the employees collection. Each element is referred to as r within the context of this query.
- group r by r.Department into gp: Here, the query groups the elements of the employees collection by the Department property. The result is a collection of groups, where each group (gp) represents a unique department and contains all employees belonging to that department. The gp.Key represents the unique department identifier for each group.
Lets’ say, I want the max salary that I’m paying to employees by Department.
Sql Query:
Select Department, Max(Salary) as MinDepartmentSalary
from Employee
Group by Department
Linq Query:
var results = (from r in employees
group r by r.Department into gp
select new
{
DepartmentId = gp.Key,
MaxDepartmentSalary = gp.Max(item => item.Salary),
});
foreach (var r in results)
{
Debug.Print(($@"""{r.DepartmentId}"" ""{r.MaxDepartmentSalary}"""));
}
Output
Department | MaxDepartmentSalary |
---|---|
1 | 5000 |
2 | 2800 |
3 | 3200 |
- from r in employees: This specifies the source of the data, which is the employees collection.
- group r by r.Department into gp: This groups the employee records (r) based on the department they belong to. The into gp part creates a new group variable named gp to store the grouped results.
- select new { DepartmentId = gp.Key, MaxDepartmentSalary = gp.Max(item => item.Salary) }: This part of the query selects the result for each group. It creates a new anonymous object with two properties:
- DepartmentId: This is assigned the department (represented by the key of the group).
- MaxDepartmentSalary: This calculates the maximum salary within each department using the Max method.
This LINQ query groups the employee records by department and calculates the maximum salary for each department, returning the results as a collection of anonymous objects with department ID and maximum department salary.