Previous Page
Next Page

7.7. The Union and the Join

In Chapter 4, we discussed joins. In this section, we discuss some differences between the two operations, the UNION and the JOIN. Although the UNION operation and the JOIN operation are similar in that they both combine two tables or sets of data, the approaches used by the two operations are different. We will first present an example of when a JOIN may be used versus when a UNION may be used, and then we will present some other differences between the UNION and the JOIN.

7.7.1. When a JOIN May Be Used Versus When a UNION May Be Used

A JOIN is very commonly used in queries. As we discussed previously (in Chapter 4), JOINs (specifically, equi-joins) involve a result set created based on tables where the tables are linked via some common column. The UNION operator is mostly used to combine two sets of information where the genesis of the information is not as straightforward as in a join. Consider the following two examples.

7.7.1.1. Example 1: A straightforward join operation

Suppose that we wanted to find the names of students who took accounting courses. This is a straightforward join example. This type of query would involve joining the Student, Section, and Course tables and selecting the student names from the result set. In this case though, we actually have to join the Student table to the Grade_report table first, and then join that result to the Section table, because we cannot directly join the Student table to the Section table. Then, we join that combined result to the Course tableso this ends up becoming a four-table join, with the Grade_report table acting like a bridge between Student and Section. The JOIN query would be:

    SELECT DISTINCT(sname)
    FROM Course c JOIN (Section se JOIN
    (Student s JOIN Grade_report g
    ON s.stno = g.student_number)
    ON se.section_id = g.section_id)
    ON c.course_number = se.course_num
    AND c.course_name LIKE 'ACC%'

This query would give the following 20 rows of output:

    sname
    ---------------
    Alan
    Bill
    Brad
    Brenda
    Cedric
    Chris
    Donald
    Hillary
    Holly
    Jessica
    Kelly
    Ken
    Mario
    Monica
    Phoebe
    Romona
    Sadie
    Steve
    Susan
    Zelda

    (20 row(s) affected)

Note that we had to use a DISTINCT in the previous query, as the result of a JOIN gives duplicates.

This example query could also be answered using subqueries, which are discussed later, but the point is that it is easy to see the relationship between the three (actually four) tables.

7.7.1.2. Example 2: A not-so-straightforward query

Suppose that we wanted to find something like the names of the students who take accounting courses and combine them with the names of students who also major in subjects that use overhead projectors in the courses they take. This could be done using a join with this database, but it would involve finding a join-path through most of the database. For a much larger database, it might be very impractical to consider such a large join. It would be easier to first find the set of names of students who take accounting courses (call this set A) and then find students who major in subjects that use projectors (set B), then union sets A and B. The UNION approach allows us to simplify the problem and check intermediate results, so we will present this problem using a UNION. Further, each part of the problem can be done with joins or subqueries as needed for efficiency and then the results finally unioned. Set operations allow us to create sets of results any way we can and then combine the result sets using set operations; UNION is a set operation.

Following, we present the UNION approach to doing this query. The first step is to do the parts individually. That is, first find the set of names of students who take accounting courses (this is the first half of the query before the UNION). Once this is done, then do the second part individually; that is, find the students who major in subjects that use projectors. Once you have the result for both parts, UNION the two results. We will not need the DISTINCT here, as UNION does not keep the duplicates. Here is a query that shows this approach:

    SELECT sname
    FROM Course c JOIN (Section se JOIN
    (Student s JOIN Grade_report g
    ON s.stno = g.student_number)
    ON se.section_id = g.section_id)
    ON c.course_number = se.course_num
    AND c.course_name LIKE 'ACC%'
    UNION
    SELECT sname
    FROM Student s JOIN
    (Department_to_major d
    JOIN (Course c JOIN
    (Room r JOIN Section se
    ON r.room = se.room)
    ON se.course_num = c.course_number)
    ON c.offering_dept = d.dcode)
    ON s.major = d.dcode
    AND r.ohead = 'Y'

This query produces 30 rows:

    sname
    --------------------
    Alan
    Bill
    Brad
    Brenda
    Cedric
    Chris
    Cramer
    Donald
    Elainie
    Hillary
    Holly
    Jake
    Jerry
    Jessica
    Kelly
    Ken
    Lineas
    Lujack
    Mario
    Mary
    Monica
    Phoebe
    Rachel
    Richard
    Romona
    Sadie
    Smithly
    Steve
    Susan
    Zelda

    (30 row(s) affected)

7.7.2. A Summary of the Other Differences Between the UNION and the JOIN

In this section, we summarize our JOIN/UNION discussion with three abstract tables containing three rows each of symbolic data. Relations or tables are sets of rows.

We will first show the union. Assume that we have the following two tables.

Table A

ColumnA

ColumnB

ColumnC

X1

Y1

Z1

X2

Y2

Z2

X3

Y3

Z3


Table B

ColumnA

ColumnB

ColumnC

X4

Y4

Z4

X5

Y5

Z5

X6

Y6

Z6


A SQL UNION can be shown would be:

    SELECT * FROM TableA
    UNION
    SELECT * FROM TableB

which produces the following table as a result:

Table C

ColumnA

ColumnB

ColumnC

X1

Y1

Z1

X2

Y2

Z2

X3

Y3

Z3

X4

Y4

Z4

X5

Y5

Z5

X6

Y6

Z6


Using a similar set of diagrams, the join operation could be shown as follows with the following two tables (joining TableA and TableD into TableE):

Table A

ColumnA

ColumnB

ColumnC

X1

Y1

Z1

X2

Y2

Z2

X3

Y3

Z3


Table D

ColumnA

ColumnD

ColumnE

X1

D1

E1

X2

D2

E2

X3

D3

E3


Now, a SQL JOIN would be:

    SELECT *
    FROM TableA a JOIN TableD d
    ON a.ColumnA = d.ColumnA

Giving the following table:

Table E

TableA.ColumnA

TableA.ColumnB

TableA.ColumnC

TableB.ColumnA

TableB.ColumnD

TableB.ColumnE

X1

Y1

Z1

X1

D1

E1

X2

Y2

Z2

X2

D2

E2

X3

Y3

Z3

X3

D3

E3


Following are the major differences between UNIONs and JOINs:

  • In a UNION, all the rows in the resulting tables (sets) being unioned have to be compatible; in a JOIN, only the joining columns of the tables being joined have to be compatiblethe other columns may be different.

  • In a UNION, no "new" columns can be added to the final result of the UNION; in a JOIN, new columns can be added to the result of the JOIN.

  • In a UNION, the number of columns in the result set has to be the same as the number of columns in the sets being unioned; in a JOIN, the number of columns in the result set may vary.


Previous Page
Next Page