9.5. Auditing in SubqueriesIn this section, we consider a potential problem of using aggregation with subqueries. As with Cartesian products and joins, aggregation hides details and should always be audited. The two tables that follow will be used to illustrate this problem. Table 9-2 is similar to the Grade_report table and contains a student section identifier (ssec), grades (gd), and student names (sname).
Table 9-3 is similar to the Section table and contains a section identifier (sec) and an instructor name (iname).
Now suppose that you want to find out how many As each instructor awarded. You might start with a join of the GG and SS tables. A normal equi-join would be as follows: SELECT * FROM GG, SS WHERE GG.ssec = SS.sec This query would produce the following output (nine rows): ssec gd sname sec iname ----------- ---- ------------ ----------- ------------ 100 A Brenda 100 Jones 110 B Brenda 110 Smith 120 A Brenda 120 Jones 200 A Brenda 200 Adams 210 A Brenda 210 Jones 100 A Richard 100 Jones 100 B Doug 100 Jones 200 A Richard 200 Adams 110 B Morris 110 Smith (9 row(s) affected) In addition, the following query tells you that there are six As in the GG table: SELECT COUNT(*) AS [Count of As] FROM GG WHERE gd = 'A' giving: Count of As ------------ 6 (1 row(s) affected) Now, if you want to find out which instructor gave the As, you would type this query: SELECT SS.iname FROM SS, GG WHERE SS.sec = GG.ssec AND GG.gd = 'A' You get the following six rows of output: iname ------------ Jones Jones Adams Jones Jones Adams (6 row(s) affected) Now, to find "how many" As each instructor gave, include a COUNT and GROUP BY as follows: SELECT SS.iname AS [iname], COUNT(*) AS [count] FROM SS, GG WHERE SS.sec = GG.ssec AND GG.gd = 'A' GROUP BY SS.iname This query produces the following output: iname count ------------ ----------- Adams 2 Jones 4 (2 row(s) affected) This shows that instructor Adams gave two As and instructor Jones gave four As. So far, so good. You should note that the final count/grouping has the same number of As as the original tablesthe sum of the counts equals 6. Now, if you had devised a COUNT query with a sub-SELECT, you could get an answer that looked correct but in fact was not. For example, consider the following subquery version of the preceding join query: SELECT SS.iname AS [iname], COUNT(*) AS [count] FROM SS WHERE SS.sec IN (SELECT GG.ssec FROM GG WHERE GG.gd = 'A') GROUP BY SS.iname This query produces the following output: iname count ------------ ----------- Adams 1 Jones 3 (2 row(s) affected) The reason that you get this output is that the second query is counting names of instructors and whether an A is present in the set of courses that this instructor teachesnot how many As are in the set, just whether any exist. The previous join query gives you all the As in the joined table and hence gives the correct answer to the question "How many As did each instructor award?" The sub-SELECTed query answers a different question: "In how many sections did the instructor award an A?" The point in this example is that if you are SELECTing and COUNTing, it is a very good idea to audit your results often. If you want to COUNT the number of As by instructor, begin by first counting how many As there are. Then, you can construct a query to join and count. You should be able to total and reconcile the number of As to the number of As by instructor. The fact that the result makes sense is very useful in determining (albeit not proving) correctness. ![]() |