Previous Page
Next Page

7.2. The UNION Operation

In SQL Server , a binary union is performed with the UNION set operation. A UNION takes the result sets from two (or more) queries and returns all rows from the results sets as a single result set (removing the duplicates). In this section, we illustrate how a UNION works; although there are other ways to retrieve this information, we are showing the UNION alternative.

Suppose that we want to find the names of all students who are computer science (COSC) majors, along with all students who are MATH majors from the Student table, we may write the following query that uses the UNION set operator:

    SELECT sname
    FROM Student
    WHERE major = 'COSC'
      UNION
    SELECT sname
    FROM Student
    WHERE major = 'MATH'

The two sets being unioned must have the same number of columns in the result sets of the SELECT clauses.


While executing the UNION, SQL first executes the first part of the query:

    SELECT sname
    FROM Student
    WHERE major = 'COSC'

This part virtually produces the following 10 rows of output:

    sname
    --------------------
    Mary
    Zelda
    Brenda
    Lujack
    Elainie
    Jake
    Hillary
    Brad
    Alan
    Jerry

    (10 row(s) affected)

Then SQL executes the second part of the query:

    SELECT sname
    FROM Student
    WHERE major = 'MATH'

This part virtually produces the following 7 rows of output:

    sname
    --------------------
    Mario
    Kelly
    Reva
    Monica
    Sadie
    Stephanie
    Jake

    (7 row(s) affected)

SQL then combines the two virtual sets of results (the UNION operation), which includes throwing out any duplicates (an extra "Jake," in this case), leaving us with the following 16 rows of output:

    sname
    --------------------
    Alan
    Brad
    Brenda
    Elainie
    Hillary
    Jake
    Jerry
    Kelly
    Lujack
    Mario
    Mary
    Monica
    Reva
    Sadie
    Stephanie
    Zelda

    (16 row(s) affected)

Prior to SQL Server 7, SQL Server always returned the result of a UNION in sorted order. This was so because the UNION eliminated duplicate rows using a sorting strategy. The ordering was simply a by-product of the sorting to eliminate duplicates. Newer versions of SQL Server, however, have several alternative strategies available for removing duplicates, so there is no guarantee of any particular order when you use UNION. If you would like to order the output, you should explicitly use ORDER BY at the end of your last SELECT statement.

The maximum number of rows possible when a UNION is used is the sum of the number of rows in the two result sets (or tables) in the two SELECT clauses.


7.2.1. Similar Columns in Unions

Earlier, we mentioned that for a union to be successful, there has to be union compatibility, and the two sets being unioned have to have similar columns. So what does similar columns mean?

If we wrote the earlier UNION example like this:

    SELECT major
    FROM Student
    WHERE major = 'COSC'
      UNION
    SELECT sname
    FROM Student
    WHERE major = 'MATH'

We would get a result set, but would the result set (output) be valid? The answer is no. You are trying to union majors and student names. These are not similar columns (though the data types of the two columns are compatible), and it does not make sense to union two different types of columns. So, before performing a union operation, you have to be very careful that you union like columns, and not "apples and oranges."

7.2.2. Unioning Constants or Variables

In SQL Server , a group of SELECT statements can also be used to union constants or variables:

    SELECT col1=100, col2=200
    UNION
    SELECT col1=400, col2=500
    UNION
    SELECT col1=100*3, col2=200*3
    UNION
    SELECT 900, 400

This query will produce:

    col1        col2
    ----------- -----------
    100         200
    300         600
    400         500
    900         400

    (4 row(s) affected)

Note that the output here happens to be sorted by the first column.


Previous Page
Next Page