Previous Page
Next Page

7.5. The IN and NOT..IN Predicates

Although SQL Server does not have the MINUS (difference) operator, it does have an IN predicate and its negation, the NOT..IN, which enables us to create differences. Let us look at this predicate from a set point of view. If we find the objects from set A that are not in set B, we have found the difference of set A and B (A - B).

7.5.1. Using IN

The following is a simple example of an IN predicate with constants in a SELECT statement:

    SELECT  sname, class
    FROM    Student
    WHERE   class IN (3,4)

In this example, IN (3,4) is called a subquery-set, where (3, 4) is the set in which we are testing membership. This query says: "Find all student names from the Student table where the class is in the set (3, 4)." It produces the following 17 rows of output:

    sname                class
    -------------------- ------
    Mary                 4
    Kelly                4
    Donald               4
    Chris                4
    Jake                 4
    Susan                3
    Monica               3
    Phoebe               3
    Holly                4
    Rachel               3
    Jerry                4
    Cramer               3
    Harrison             4
    Francis              4
    Losmith              3
    Gus                  3
    Benny                4

    (17 row(s) affected)

The preceding query produces the same output as the following query:

    SELECT  sname, class
    FROM    Student
    WHERE   class = 3 OR class = 4

In other words, the IN(3,4) means belonging to either set (3) OR set (4), as shown by the WHERE class = 3 OR class = 4.

7.5.1.1. Using IN as a subquery

We can expand the IN predicate's subquery-set part to be an actual query. For example, consider the following query:

    SELECT Student.sname
    FROM   Student
    WHERE  Student.stno IN
      (SELECT  g.student_number
       FROM    Grade_report g
       WHERE   g.grade = 'A')

Subqueries will be discussed at length in the next chapter.


Note the following about this query:

  • WHERE Student.stno references the name of the column in the Student table.

  • g.student_number is the column name in the Grade_report table.

  • stno in the Student table and student_number in the Grade_report table have the same domain.

Note also that you must retrieve the information from the same domains for purposes of union compatibility.

The preceding query produces the following 14 rows of output:

    sname
    --------------------
    Lineas
    Mary
    Brenda
    Richard
    Lujack
    Donald
    Lynette
    Susan
    Holly
    Sadie
    Jessica
    Steve
    Cedric
    Jerry

    (14 row(s) affected)

You could view the preceding query as a result derived from the intersection of the sets A and B, where set A is the set of student numbers in the student set (from the Student table) and set B is the set of student numbers in the grade set (from the Grade_report table) that have As.

To make this command behave like a set operator (as if it were an INTERSECT operator), you can add the qualifier DISTINCT to the result set as follows:

    SELECT DISTINCT (Student.sname)
    FROM  Student
    WHERE Student.stno IN
      (SELECT DISTINCT (g.student_number)
       FROM Grade_report g
       WHERE g.grade = 'A')

This query produces the following 14 rows of output:

    sname
    --------------------
    Brenda
    Cedric
    Donald
    Holly
    Jerry
    Jessica
    Lineas
    Lujack
    Lynette
    Mary
    Richard
    Sadie
    Steve
    Susan

    (14 row(s) affected)

Here, SQL Server sorts the results for you and does not return duplicates.

7.5.2. The INTERSECT Operator

From a set point of view, an INTERSECT means if we find objects from set A that are also in set B (and vice versa), we have found the intersection of sets A and B. SQL Server has an INTERSECT operator.

The following query is the previous query written using an INTERSECT (but we displayed student numbers instead of student names):

    SELECT s.stno
    FROM Student s
    INTERSECT
    SELECT g.student_number
    FROM Grade_report g
    WHERE g.grade = 'A'

This query gives the following 14 rows of output:

    stno
    -------------
    2
    3
    8
    10
    14
    20
    34
    49
    123
    125
    126
    127
    129
    142

    (14 row(s) affected)

In this query, we had to display student numbers (stno) instead of the student names (sname) because of the set compatibility issue discussed earlier. INTERSECT is a set operator, so the two sets being intersected have to have the same number of columns and the columns have to have compatible data types.

Another example of the use of the INTERSECT operator would be, for example, if we wanted to find all the students who had dependents, in which case we could type:

    SELECT s.stno
    FROM Student s
    INTERSECT
    SELECT d.pno
    FROM Dependent d

This query would give the following 19 rows of output:

    stno
    ------------------
    2
    10
    14
    17
    20
    34
    62
    123
    126
    128
    132
    142
    143
    144
    145
    146
    147
    153
    158

    (19 row(s) affected)

Though the INTERSECT operator gives us the right answer, in some ways the IN as a subquery (discussed earlier) is better to use, because when SQL Server performs the INTERSECT, it selects sets based on what is mentioned in the SELECT statements. So, for example, if we wanted the student names in addition to the student numbers, and we typed:

    SELECT s.stno, s.sname
    FROM Student s
    INTERSECT
    SELECT d.pno, relationship
    FROM Dependent d

The query would not work.

Here we would have to use an IN with a subquery as discussed earlier:

    SELECT s.stno, s.sname
    FROM Student AS s
    WHERE (s.stno IN
    (SELECT pno
    FROM Dependent AS d))

giving us the following 19 rows of output:

    stno    sname
    ----    ------------------
    2       Lineas
    10      Richard
    14      Lujack
    17      Elainie
    20      Donald
    34      Lynette
    62      Monica
    123     Holly
    126     Jessica
    128     Brad
    132     George
    142     Jerry
    143     Cramer
    144     Fraiser
    145     Harrison
    146     Francis
    147     Smithly
    153     Genevieve
    158     Thornton

    (19 row(s) affected)

7.5.3. Using NOT..IN

The NOT..IN is really a negated IN predicate. If you use the NOT..IN in your query, your query may perform poorly. The reason is that when NOT..IN is used, no indexing can be used, because the NOT..IN part of the query has to test the set with all values to find out what is not in the set. For smaller tables, no difference in performance will likely be detected. Nonetheless, we discuss how to use NOT..IN in this section, to demonstrate the logical negative of the IN predicate, which will help to complete your overall understanding of the SQL language. Instead of using NOT..IN, it is often preferable to use NOT EXISTS or outer join techniques, both of which are discussed later on.

Indexing is discussed in detail in Chapter 11.


Sometimes the NOT..IN may seem to more easily describe the desired outcome or may be used for a set difference. For a simple example, consider the following query:

    SELECT sname, class
    FROM   Student
    WHERE  class IN (1,3,4)

This query produces the following 28 rows of output:

    sname                class
    -------------------- ------
    Lineas               1
    Mary                 4
    Richard              1
    Kelly                4
    Lujack               1
    Elainie              1
    Donald               4
    Chris                4
    Jake                 4
    Lynette              1
    Susan                3
    Monica               3
    Hillary              1
    Phoebe               3
    Holly                4
    Steve                1
    Brad                 1
    Rachel               3
    George               1
    Jerry                4
    Cramer               3
    Fraiser              1
    Harrison             4
    Francis              4
    Losmith              3
    Lindsay              1
    Gus                  3
    Benny                4

    (28 row(s) affected)

Contrast the preceding query to the following query:

    SELECT sname, class
    FROM   Student
    WHERE  class NOT IN (2)

The output in this case is the same as the preceding output because the Student table only has classes 1, 2, 3, and 4. If counts (results) did not "add up," this would show that some value of class was not 1, 2, 3, or 4.

As another example, suppose that you want the names of students who are not computer science (COSC) or math (MATH) majors. The query would be:

    SELECT sname, major
    FROM   Student
    WHERE  major NOT IN ('COSC','MATH')

which produces the following output (28 rows):

    sname                major
    -------------------- -----
    Lineas               ENGL
    Ken                  POLY
    Romona               ENGL
    Richard              ENGL
    Harley               POLY
    Donald               ACCT
    Chris                ACCT
    Lynette              POLY
    Susan                ENGL
    Bill                 POLY
    Phoebe               ENGL
    Holly                POLY
    Jessica              POLY
    Steve                ENGL
    Cedric               ENGL
    Rachel               ENGL
    George               POLY
    Cramer               ENGL
    Fraiser              POLY
    Harrison             ACCT
    Francis              ACCT
    Smithly              ENGL
    Sebastian            ACCT
    Losmith              CHEM
    Genevieve            UNKN
    Lindsay              UNKN
    Gus                  ART
    Benny                CHEM

    (28 row(s) affected)

The example output gave all majors other than COSC and MATH. But you must be very careful with the NOT..IN predicate, because if nulls are present in the data, you may get odd answers with NOT..IN.

As an example, consider the following table called Stumajor:

    name                 major
    -------------------- --------------------
    Mary                 Biology
    Sam                  Chemistry
    Alice                Art
    Tom                  NULL

    (4 row(s) affected)

The table Stumajor has not been created for you in the Student_course database. You have to create it, insert the records shown, and then run the queries that follow.


If you perform the following query:

    SELECT *
    FROM Stumajor
    WHERE major IN ('Chemistry','Biology')

It produces the following output:

    name                 major
    -------------------- --------------------
    Mary                 Biology
    Sam                  Chemistry

    (2 row(s) affected)

If you perform the following query:

    SELECT *
    FROM Stumajor
    WHERE major NOT IN ('Chemistry','Biology')

It produces the following output:

    name                 major
    -------------------- --------------------
    Alice                Art

    (1 row(s) affected)

The value, null, is not equal to anything. You might expect that NOT..IN would give you <Tom,null>, but it does not. Why? Because nulls in the selection column (here, major) are not matched with a NOT..IN.

7.5.3.1. Using NOT..IN in a subquery

A NOT..IN can also be used in a subquery. For example, assume that we have another table called Instructor, as shown here:

    iname                teaches
    -------------------- --------------------
    Richard              COSC
    Subhash              MATH
    Tapan                BIOCHEM

    (3 row(s) affected)

The Instructor table has not been created for you in the Student_course database. You have to create it, insert the records shown, and then run the queries that follow.


Now, if we want to find all the departments that do not have instructors, we could type the following query:

    SELECT *
    FROM department_to_major d
    WHERE d.dcode NOT IN
      (SELECT dcode
      FROM department_to_major d, instructor i
      WHERE d.dcode=i.teaches)

This query produces the following output (6 rows):

    Dcode DNAME
    ----- --------------------
    ACCT  Accounting
    ART   Art
    CHEM  Chemistry
    ENGL  English
    POLY  Political Science
    UNKN  NULL

    (6 row(s) affected)

Note that in this case, the NOT..IN "behaved" correctly and reported the NULL value for DNAME!


Previous Page
Next Page