Previous Page
Next Page

8.2. The Subquery as a Join

An alternative way to perform the preceding query would be to use a join instead of a subquery, as follows:

    SELECT  Student.sname, Student.stno
    FROM    Student, Grade_report gr
    WHERE   Student.stno = gr.student_number
    AND        (gr.grade = 'B' OR gr.grade = 'A')

This query produces 67 rows of output (of which we show the first 15 rows here):

    sname     stno
    --------- ------
    Lineas    2
    Lineas    2
    Lineas    2
    Lineas    2
    Mary      3
    Mary      3
    Mary      3
    Mary      3
    Mary      3
    Mary      3
    Brenda    8
    Brenda    8
    Brenda    8
    Richard   10
    Kelly     13
    .
    .
    .

    (67 row(s) affected)

Now, the question is why the join has 67 rows of output instead of 31 rows of output (produced by the subquery).

When the join version is used to combine tables, any Student-Grade_report row (tuple) that has equal student numbers and a grade of A or B is selected. Thus, you should expect many duplicate names in the output. To get the result without duplicates, add the qualifier DISTINCT to the join query as follows:

    SELECT DISTINCT Student.sname, Student.stno
    FROM    Student, Grade_report AS gr
    WHERE   Student.stno = gr.student_number
    AND       (gr.grade = 'B' OR gr.grade = 'A')

This query produces the following output (31 rows):

    sname                stno
    -------------------- ------
    Lineas               2
    Mary                 3
    Zelda                5
    Ken                  6
    Mario                7
    Brenda               8
    Richard              10
    Kelly                13
    Lujack               14
    Reva                 15
    Harley               19
    Donald               20
    Chris                24
    Lynette              34
    Susan                49
    Hillary              121
    Phoebe               122
    Holly                123
    Sadie                125
    Jessica              126
    Steve                127
    Cedric               129
    George               132
    Jerry                142
    Cramer               143
    Fraiser              144
    Francis              146
    Smithly              147
    Sebastian            148
    Lindsay              155
    Stephanie            157

    (31 row(s) affected)

When DISTINCT is used, internal sorting is performed before the result set is displayed. Such internal sorting may decrease response time for a query.

In the subquery version of the query, duplication of names does not occur in the output. This is so because you are setting up a set (the subquery) from which you will choose namesa given name is either in the subquery set or it is not. Remember that the student number (stno) is unique in the Student table.

Also, the question of which is more efficient, the join or the subquery, depends on which SQL language and database you are using. Without using extra tools, one way to test alternatives is to try the queries on the data or a subset of the data. Database systems such as SQL Server provide ways (tools) to find out how queries are executed.


Previous Page
Next Page