Previous Page
Next Page

9.4. GROUP BY and HAVING: Aggregates of Aggregates

A "usual" GROUP BY has an aggregate and a column that are grouped like this:

SELECT      COUNT(stno) AS [count of student no], class
FROM        Student
GROUP BY class

This produces a result set of 5 rows of counts by class:

count of student no     class
-------------------     -----
10                      NULL
11                      1
10                      2
7                       3
10                      4
 
(5 row(s) affected)

Although you must have class or some other attribute in the GROUP BY, you do not have to have the class in the result set. Consider the following query, which generates the same numeric information as the previous query, but does not report the class in the result:

SELECT      COUNT(stno) AS [count of student no]
FROM        Student
GROUP BY class

This query produces the following five rows of output:

count of student no
-------------------
10
11
10
7
10
 
(5 row(s) affected)

This previous example may seem contradictory to the preceding discussion, but it is not. You must have all the non aggregate columns from the result set in the GROUP BY, but you do not have to have the columns in the result set that you are grouping. That example may prove useful when a grouped result is needed in a filter. For example, how would you find the class with the most students?

9.4.1. Aggregation and Grouping in SQL Server

SQL Server will not allow you to handle aggregation and grouping by nesting aggregates. For example, suppose you want to find the class with the minimum number of students. You might try the following query:

SELECT MIN(COUNT(stno))
FROM Student
GROUP BY class

Though it may seem logical, this query will not work in SQL Server . It will produce the following error message:

Msg 130, Level 15, State 1, Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a
subquery.

The MIN function is an aggregate function, and aggregate functions operate on rows within tables. In this case, the query is asking MIN to operate on a table of counted classes that have not yet been calculated. The point is that SQL Server does not handle this mismatch of aggregation and grouping.

This mismatch of aggregation and grouping can be handled by other SQL languages, such as Oracle.


To handle this mismatch of aggregation and grouping in SQL Server , you can use derived structures such as temporary tables, inline views, or regular views (derived structures are covered in Chapter 6). Using either a temporary table or an inline view is the most logical way to solve this problem, so only these two choices are described here.

9.4.1.1. Aggregation and grouping handled with a global temporary table

This section shows how we can handle the mismatch of aggregation and grouping (described earlier) using a global temporary table.

The following steps describe how to use a global temporary table to find the class with the minimum number of students:

  1. Display the counts of classes, grouped by class:

    SELECT      COUNT(stno) AS [count of students]
    FROM        Student
    GROUP BY class
    

    This query produces the following five rows of output:

    count       class
    ----------- ------
    10          NULL
    11          1
    10          2
    7           3
    10          4
     
    (5 row(s) affected)
    

  2. To find the minimum number of students in a class, count the students (you could use stno for student number) grouped by class, and put this result in ##Temp1 (a global temporary table)--shown by the first query following, and then find the minimum number of students in a class from the global temporary table, ##Temp1, with SELECT MIN(count) AS [MINIMUM COUNT] FROM ##Temp1, and then use this information in a subquery with a HAVING clause as follows: First type the query:

    SELECT (COUNT([stno])) AS [count], class INTO ##Temp1
    FROM Student
    GROUP BY [class]
    

    After executing the previous query, type:

    SELECT COUNT(stno) AS [count of stno], class
    FROM Student
    GROUP BY class
    HAVING COUNT(stno) =
    (SELECT MIN(count) AS [Minimum count]
    FROM ##Temp1)
    

This query produces the desired output (the class with the minimum number of students):

count of stno     class
-------------     -----
7                 3
 
(1 row(s) affected)

9.4.1.2. Aggregation and grouping handled with an inline view

As described in Chapter 6, you can put a query in the FROM clause of a SELECT statement to create an inline view. An inline view exists only during the execution of a query.

The following steps describe how to use an inline view to find the class with the minimum number of students:

  1. Count the stno in the FROM clause of the SELECT statement as follows:

    SELECT "Min of Count" = MIN(c)
    FROM (SELECT c = COUNT(stno)
    FROM Student
    GROUP BY class) AS in_view
    

    Because SQL Server cannot directly find aggregates of aggregates, in the previous query, we give a name to the COUNT in the inline view, c, to temporarily store the aggregate result in the inline view, in_view. We then operate on the inline view as though it were a table and find the minimum value for c.

    The previous query produces the following output:

    Min of Count
    ------------
    7
     
    (1 row(s) affected)
    

  2. To find out which class has the minimum count, you can write the final query using the previous query as a subquery with a HAVING clause in the outer part of the final query, as follows:

    SELECT class, "Count of Class" = COUNT(*)
    FROM Student
    GROUP BY class
    HAVING COUNT(*) =
    (SELECT MIN(c)
    FROM (SELECT COUNT(stno) AS [c]
    FROM Student
    GROUP BY class) AS in_view)
    

This query produces the desired output:

class     Count of Class
-----     --------------
3         7
 
(1 row(s) affected)

So, although SQL Server does not handle a mismatch of aggregation and HAVING, you can use your knowledge of temporary tables and inline views to work around the problem. This problem may also be solved using regular views. It is also noteworthy to see the process of query development in that some problems require using small queries and building from them to a final result.

Once again, Chapter 6 covers the advantages and disadvantages of using each one of the derived structures.



Previous Page
Next Page