9.2. The GROUP BY ClauseGROUP BY is used in conjunction with aggregate functions to group data on the basis of the same values in a column. GROUP BY returns one row for each value of the column(s) that is grouped. You can use GROUP BY to group by one column or multiple columns. As an example of how to group by one column, the following statement shows how you can use the aggregate COUNT to extract the number of class groups (number of students in each class) from the Student table: SELECT class, COUNT(*) AS [count] FROM Student GROUP BY class This query produces the following five rows of output, which is grouped by one column, class: class count ----- ----------- NULL 10 1 11 2 10 3 7 4 10 (5 row(s) affected) This type of statement gives you a new way to retrieve and organize aggregate data. Other aggregate functions would have a similar syntax.
If a GROUP BY clause contains a two-column specification, the result is aggregated and grouped by two columns. For example, the following is COUNT of class and major from the Student table: SELECT class, major, COUNT(*) AS [count] FROM Student GROUP BY class, major This query produces the following output (24 rows), which is grouped by class within major: class major count ----- ----- ----------- NULL NULL 3 2 ACCT 1 4 ACCT 4 3 ART 1 3 CHEM 1 4 CHEM 1 NULL COSC 1 1 COSC 4 2 COSC 2 4 COSC 3 NULL ENGL 1 1 ENGL 3 2 ENGL 2 3 ENGL 4 NULL MATH 2 2 MATH 3 3 MATH 1 4 MATH 1 NULL POLY 2 1 POLY 3 2 POLY 2 4 POLY 1 NULL UNKN 1 1 UNKN 1 (24 row(s) affected) The sequence of the columns in a GROUP BY clause has the effect of ordering the output. If we change the order of the GROUP BY like this: SELECT class, major, COUNT(*) AS [count] FROM Student GROUP BY major, class our result will look like this: class major count ----- ----- ----------- NULL NULL 3 NULL COSC 1 NULL ENGL 1 NULL MATH 2 NULL POLY 2 NULL UNKN 1 1 COSC 4 1 ENGL 3 1 POLY 3 1 UNKN 1 2 ACCT 1 2 COSC 2 2 ENGL 2 2 MATH 3 2 POLY 2 3 ART 1 3 CHEM 1 3 ENGL 4 3 MATH 1 4 ACCT 4 4 CHEM 1 4 COSC 3 4 MATH 1 4 POLY 1 (24 row(s) affected) Here the output is grouped by major within class. A statement like the following will cause a syntax error, because it says that you are to count both class and major, but GROUP BY class only: SELECT class, major, COUNT(*) FROM Student GROUP BY class This query results in the following error message: Msg 8120, Level 16, State 1, Line 1 Column 'Student.MAJOR' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. To be syntactically and logically correct, you must have all the non aggregate columns of the result set in the GROUP BY clause. For example, let's take a look at the data of Table 9-1.
The following query would be improper, because you must GROUP BY "ohead" to SUM capacities for each ohead value: SELECT ohead, SUM(capacity) FROM Room
This query would produce an error message similar to what we saw previously: Msg 8120, Level 16, State 1, Line 1 Column 'Room.OHEAD' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. If you SELECT columns and use an aggregate function, you must GROUP BY the non aggregate attributes. The correct version of the last statement is as follows: SELECT ohead, SUM(capacity) AS [sum] FROM Room GROUP BY ohead which produces the following three rows of output: ohead sum ----- ----------- NULL 100 N 110 Y 142 Warning: Null value is eliminated by an aggregate or other SET operation. (3 row(s) affected) This is the sum of room capacities for rooms that have no overhead projectors (N), rooms that have overhead projectors (Y), and rooms in which the overhead projector capacity is unknown (null). Observe that in the Room table, some rooms have null values for ohead, and the null rows are summed and grouped along with the non-null rows. 9.2.1. GROUP BY and ORDER BYTo enhance the display of a GROUP BY clause, you can combine it with an ORDER BY clause. Consider the following example: SELECT class, major, COUNT(*) AS [count] FROM Student GROUP BY class, major The output for this query was presented earlier in the chapter. This result set can also be ordered by any other column from the result set using the ORDER BY. For instance, the following example orders the output in descending order by COUNT(*): SELECT class, major, COUNT(*) AS [count] FROM Student GROUP BY class, major ORDER BY COUNT(*) DESC This query produces the following output (24 rows): class major count ------ ----- ----------- 4 ACCT 4 1 COSC 4 3 ENGL 4 2 MATH 3 4 COSC 3 1 ENGL 3 NULL NULL 3 1 POLY 3 2 POLY 2 NULL POLY 2 2 COSC 2 2 ENGL 2 NULL MATH 2 3 MATH 1 4 MATH 1 NULL ENGL 1 2 ACCT 1 3 ART 1 3 CHEM 1 4 CHEM 1 NULL COSC 1 4 POLY 1 NULL UNKN 1 1 UNKN 1 (24 row(s) affected) 9.2.2. GROUP BY and DISTINCTWhen a SELECT clause includes all the columns specified in a GROUP BY clause, the use of the DISTINCT function is unnecessary and inefficient, because the GROUP BY clause groups rows in such a way that the column(s) that are grouped will not have duplicate values. ![]() |