Previous Page
Next Page

4.2. The Cartesian Product

In a SQL statement, a Cartesian product is where every row of the first table in the FROM clause is joined with each and every row of the second table in the FROM clause. A Cartesian product is produced when the WHERE form of the JOIN is used without the WHERE. An example of a Cartesian product (join) would be:

SELECT *
FROM Course c, Prereq p

The preceding command combines all the data in both the tables and makes a new result set. All rows in the Course table are matched with all rows in the Prereq table (a Cartesian product). This produces 384 rows of output, of which we show the first 10 rows here:

COURSE_NAME          COURSE_NUMBER CREDIT_HOURS OFFERING_DEPT COURSE_NUMBER PREREQ
-------------------- ------------- ------------ ------------- ------------- -------
ACCOUNTING I         ACCT2020      3            ACCT          ACCT3333     ACCT2220
ACCOUNTING II        ACCT2220      3            ACCT          ACCT3333     ACCT2220
MANAGERIAL FINANCE   ACCT3333      3            ACCT          ACCT3333     ACCT2220
ACCOUNTING INFO SYST ACCT3464      3            ACCT          ACCT3333     ACCT2220
INTRO TO CHEMISTRY   CHEM2001      3            CHEM          ACCT3333     ACCT2220
ORGANIC CHEMISTRY    CHEM3001      3            CHEM          ACCT3333     ACCT2220
INTRO TO COMPUTER SC COSC1310      4            COSC          ACCT3333     ACCT2220
TURBO PASCAL         COSC2025      3            COSC          ACCT3333     ACCT2220
ADVANCED COBOL       COSC2303      3            COSC          ACCT3333     ACCT2220
DATA STRUCTURES      COSC3320      4            COSC          ACCT3333     ACCT2220
.
.
.
 (384 row(s) affected)

As we pointed out earlier, before combining tables, it is a good idea to get a count of the number of rows one might expect. This can be done by:

SELECT COUNT(*) AS [COUNT OF CARTESIAN]
FROM Course c, Prereq p

which produces the following output:

COUNT OF CARTESIAN
------------------
384
 
(1 row(s) affected)

From these results, we can see that the results of a Cartesian "join" will be a relation, say Q, which will have n*m rows (where n is the number of rows from the first relation, and m is the number of rows from the second relation). In the preceding example, the result set has 384 rows (32 times 12), with all possible combinations of rows from the Course table and the Prereq table. If we compare these results with the results of the earlier query (with the WHERE clause), we can see that both the results have the same structure, but the earlier one has been row-filtered by the WHERE clause to include only those rows where there is equality between Course.course_number and Prereq.course_number. Put another way, the earlier results make more sense because they present only those rows that correspond to one another. In this example, the Cartesian product produces extra, meaningless rows.

Oftentimes, the Cartesian product is the result of a user having forgotten to use an appropriate WHERE clause in the SELECT statement when formulating a join using the WHERE format. Note that if the JOIN or INNER JOIN syntax (ANSI JOIN syntax) is used, one cannot avoid the ON clause (no ON clause produces a syntax error). Hence, producing a Cartesian product inadvertently in SQL Server using the JOIN/INNER JOIN is much harder to do.

4.2.1. Uses of the Cartesian Product

Though the Cartesian product is generally regarded as not so useful in SQL per se, if harnessed properly, a Cartesian product can be used to produce exceptionally useful result sets, for example:

  • The Cartesian product can be used to generate sample or test data.

  • The simplest Cartesian product of two sets is a two-dimensional table or a cross-tabulation whose cells may be used to enter frequencies or to designate possibilities.

  • The Cartesian product is needed if you want a collection of all ordered n-tuples (rows with n columns) that can be formed so that they contain one element of the first set, one element of the second set, . . ., and one element of the nth set. For example, if set (or table) X is the 13-element set { A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3, 2} and set (or table) Y is the 4-element set {spades, hearts, diamonds, clubs}, then the Cartesian product of those two sets is the 52-element set { (A, spades), (K, spades), . . ., (2, spades), (A, hearts), . . ., (3, clubs), (2, clubs) }.

4.2.2. CROSS JOIN Used to Generate a Cartesian Product

In SQL Server, a CROSS JOIN can be used to return a Cartesian product of two tables. The form of the CROSS JOIN is:

SELECT *
FROM Table1 CROSS JOIN Table2

Using our database, Student_course, the following CROSS JOIN would produce the same result (Cartesian product) as the query (without the WHERE clause) used in the earlier section:

SELECT *
FROM Course CROSS JOIN Prereq p


Previous Page
Next Page