7.7. The Union and the JoinIn Chapter 4, we discussed joins. In this section, we discuss some differences between the two operations, the UNION and the JOIN. Although the UNION operation and the JOIN operation are similar in that they both combine two tables or sets of data, the approaches used by the two operations are different. We will first present an example of when a JOIN may be used versus when a UNION may be used, and then we will present some other differences between the UNION and the JOIN. 7.7.1. When a JOIN May Be Used Versus When a UNION May Be UsedA JOIN is very commonly used in queries. As we discussed previously (in Chapter 4), JOINs (specifically, equi-joins) involve a result set created based on tables where the tables are linked via some common column. The UNION operator is mostly used to combine two sets of information where the genesis of the information is not as straightforward as in a join. Consider the following two examples. 7.7.1.1. Example 1: A straightforward join operationSuppose that we wanted to find the names of students who took accounting courses. This is a straightforward join example. This type of query would involve joining the Student, Section, and Course tables and selecting the student names from the result set. In this case though, we actually have to join the Student table to the Grade_report table first, and then join that result to the Section table, because we cannot directly join the Student table to the Section table. Then, we join that combined result to the Course tableso this ends up becoming a four-table join, with the Grade_report table acting like a bridge between Student and Section. The JOIN query would be:
SELECT DISTINCT(sname)
FROM Course c JOIN (Section se JOIN
(Student s JOIN Grade_report g
ON s.stno = g.student_number)
ON se.section_id = g.section_id)
ON c.course_number = se.course_num
AND c.course_name LIKE 'ACC%'
This query would give the following 20 rows of output:
sname
---------------
Alan
Bill
Brad
Brenda
Cedric
Chris
Donald
Hillary
Holly
Jessica
Kelly
Ken
Mario
Monica
Phoebe
Romona
Sadie
Steve
Susan
Zelda
(20 row(s) affected)
Note that we had to use a DISTINCT in the previous query, as the result of a JOIN gives duplicates. This example query could also be answered using subqueries, which are discussed later, but the point is that it is easy to see the relationship between the three (actually four) tables. 7.7.1.2. Example 2: A not-so-straightforward querySuppose that we wanted to find something like the names of the students who take accounting courses and combine them with the names of students who also major in subjects that use overhead projectors in the courses they take. This could be done using a join with this database, but it would involve finding a join-path through most of the database. For a much larger database, it might be very impractical to consider such a large join. It would be easier to first find the set of names of students who take accounting courses (call this set A) and then find students who major in subjects that use projectors (set B), then union sets A and B. The UNION approach allows us to simplify the problem and check intermediate results, so we will present this problem using a UNION. Further, each part of the problem can be done with joins or subqueries as needed for efficiency and then the results finally unioned. Set operations allow us to create sets of results any way we can and then combine the result sets using set operations; UNION is a set operation. Following, we present the UNION approach to doing this query. The first step is to do the parts individually. That is, first find the set of names of students who take accounting courses (this is the first half of the query before the UNION). Once this is done, then do the second part individually; that is, find the students who major in subjects that use projectors. Once you have the result for both parts, UNION the two results. We will not need the DISTINCT here, as UNION does not keep the duplicates. Here is a query that shows this approach:
SELECT sname
FROM Course c JOIN (Section se JOIN
(Student s JOIN Grade_report g
ON s.stno = g.student_number)
ON se.section_id = g.section_id)
ON c.course_number = se.course_num
AND c.course_name LIKE 'ACC%'
UNION
SELECT sname
FROM Student s JOIN
(Department_to_major d
JOIN (Course c JOIN
(Room r JOIN Section se
ON r.room = se.room)
ON se.course_num = c.course_number)
ON c.offering_dept = d.dcode)
ON s.major = d.dcode
AND r.ohead = 'Y'
This query produces 30 rows:
sname
--------------------
Alan
Bill
Brad
Brenda
Cedric
Chris
Cramer
Donald
Elainie
Hillary
Holly
Jake
Jerry
Jessica
Kelly
Ken
Lineas
Lujack
Mario
Mary
Monica
Phoebe
Rachel
Richard
Romona
Sadie
Smithly
Steve
Susan
Zelda
(30 row(s) affected)
7.7.2. A Summary of the Other Differences Between the UNION and the JOINIn this section, we summarize our JOIN/UNION discussion with three abstract tables containing three rows each of symbolic data. Relations or tables are sets of rows. We will first show the union. Assume that we have the following two tables. Table A
Table B
A SQL UNION can be shown would be:
SELECT * FROM TableA
UNION
SELECT * FROM TableB
which produces the following table as a result: Table C
Using a similar set of diagrams, the join operation could be shown as follows with the following two tables (joining TableA and TableD into TableE): Table A
Table D
Now, a SQL JOIN would be:
SELECT *
FROM TableA a JOIN TableD d
ON a.ColumnA = d.ColumnA
Giving the following table: Table E
Following are the major differences between UNIONs and JOINs:
|