9.3. The HAVING Clause
The GROUP BY and HAVING clauses are used together. The HAVING clause is used as a final filter (rather than as a conditional filter) on the aggregate column values in the result set of a SELECT statement. In other words, the query has to be grouped before the HAVING clause can be applied. For example, consider the following statement, which displays the count of students in various classes (classes of students = 1, 2, 3, 4, corresponding to freshman, sophomore, and so on):
SELECT class, COUNT(*) AS [count]
FROM Student
GROUP BY class
This query produces the following output:
class count
----- -----------
NULL 10
1 11
2 10
3 7
4 10
(5 row(s) affected)
If you are interested only in classes that have more than a certain number of students in them, you could use the following statement:
SELECT class, COUNT(*) AS [count]
FROM Student
GROUP BY class
HAVING COUNT(*) > 9
which produces the following four rows of output:
class count
----- -----------
NULL 10
1 11
2 10
4 10
(4 row(s) affected)
9.3.1. HAVING and WHERE
Whereas HAVING is a final filter in a SELECT statement, the WHERE clause, which excludes rows from a result set, is a conditional filter. HAVING is used to filter based on aggregate values, WHERE cannot do that. Consider the following two queries:
SELECT class, COUNT(*) AS [count]
FROM Student
GROUP BY class
HAVING class = 3
SELECT class, COUNT(*) AS [count]
FROM Student
WHERE class = 3
GROUP BY class
Both queries produce the following output:
class count
----- -----------
3 7
(1 row(s) affected)
In a typical implementation, the first of these two queries is less efficient because the query engine has to complete the query before removing rows WHERE class = 3 from the result. In the second version, the rows WHERE class = 3 are removed before the grouping takes place. WHERE is not always a substitute for HAVING, but when it can be used instead of HAVING, it should be. Notice that in the example:
SELECT class, COUNT(*) AS [count]
FROM Student
GROUP BY class
HAVING COUNT(*) > 9
HAVING and WHERE are not interchangeable because the grouping has to take place before the HAVING could have an effect. You cannot know in advance what the counts for each class are until they are counted.
Consider the following query, its meaning, and the processing that is required to finalize the result set:
SELECT class, major, COUNT(*) AS [count]
FROM Student
WHERE major = 'COSC'
GROUP BY class, major
HAVING COUNT(*) > 2
This query produces the following output:
class major count
----- ----- -----------
1 COSC 4
4 COSC 3
(2 row(s) affected)
In this example, all computer science (COSC) majors (per the WHERE clause) will be grouped and COUNTed and then displayed only if COUNT(*) > 2. The query might erroneously be interpreted as "Group and count all COSC majors by class, but only if there are more than two in a class." This interpretation is wrong, because SQL applies the WHERE, then applies the GROUP BY, and, finally, filters with the HAVING criterion.
|