GROUP BY clause

The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual rows.

When you specify a GROUP BY clause, SQL divides the selected rows into groups such that the rows of each group have matching values in one or more columns or expressions. Next, SQL processes each group to produce a single-row result for the group. You can specify one or more columns or expressions in the GROUP BY clause to group the rows. The items you specify in the SELECT statement are properties of each group of rows, not properties of individual rows in a table or view.

Without a GROUP BY clause, the application of SQL aggregate functions returns one row. When GROUP BY is used, the function is applied to each group, thereby returning as many rows as there are groups.

For example, the CORPDATA.EMPLOYEE table has several sets of rows, and each set consists of rows describing members of a specific department. To find the average salary of people in each department, you can issue:

SELECT WORKDEPT, DECIMAL (AVG(SALARY),5,0)
       FROM CORPDATA.EMPLOYEE
       GROUP BY WORKDEPT

The result is several rows, one for each department.

WORKDEPT AVG-SALARY
A00 40850
B01 41250
C01 29722
D11 25147
D21 25668
E01 40175
E11 21020
E21 24086
Notes:
  1. Grouping the rows does not mean ordering them. Grouping puts each selected row in a group, which SQL then processes to derive characteristics of the group. Ordering the rows puts all the rows in the results table in ascending or descending collating sequence. Depending on the implementation selected by the database manager, the resulting groups may appear to be ordered.
  2. If there are null values in the column you specify in the GROUP BY clause, a single-row result is produced for the data in the rows with null values.
  3. If the grouping occurs over character, or UCS-2 or UTF-16 graphic columns, the sort sequence in effect when the query is run is applied to the grouping.

When you use GROUP BY, you list the columns or expressions you want SQL to use to group the rows. For example, suppose you want a list of the number of people working on each major project described in the CORPDATA.PROJECT table. You can issue:

SELECT SUM(PRSTAFF), MAJPROJ
       FROM CORPDATA.PROJECT
       GROUP BY MAJPROJ

The result is a list of the company's current major projects and the number of people working on each project:

SUM(PRSTAFF) MAJPROJ
6 AD3100
5 AD3110
10 MA2100
8 MA2110
5 OP1000
4 OP2000
3 OP2010
32.5 ?

You can also specify that you want the rows grouped by more than one column or expression. For example, you can issue a select-statement to find the average salary for men and women in each department, using the CORPDATA.EMPLOYEE table. To do this, you can issue:

SELECT WORKDEPT, SEX, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
       FROM CORPDATA.EMPLOYEE
       GROUP BY WORKDEPT, SEX

Results in:

WORKDEPT SEX AVG_WAGES
A00 F 49625
A00 M 35000
B01 M 41250
C01 F 29722
D11 F 25817
D11 M 24764
D21 F 26933
D21 M 24720
E01 M 40175
E11 F 22810
E11 M 16545
E21 F 25370
E21 M 23830

Because you did not include a WHERE clause in this example, SQL examines and process all rows in the CORPDATA.EMPLOYEE table. The rows are grouped first by department number and next (within each department) by sex before SQL derives the average SALARY value for each group.

Related concepts
Sort sequences and normalization in SQL
Related reference
ORDER BY clause