4.2. The Cartesian Product
In a SQL statement, a Cartesian product
is where every row of the first table in the FROM clause is joined with each and every row
of the second table in the FROM clause. A Cartesian product is produced when the
WHERE form of the JOIN is used without the WHERE. An example of a
Cartesian product (join) would be:
SELECT *
FROM Course c, Prereq p
The preceding command combines all the data in both the tables and makes a new result
set. All rows in the Course table are matched with all rows in the Prereq table (a
Cartesian product). This produces 384 rows of output, of which we show the first 10 rows here:
COURSE_NAME COURSE_NUMBER CREDIT_HOURS OFFERING_DEPT COURSE_NUMBER PREREQ
-------------------- ------------- ------------ ------------- ------------- -------
ACCOUNTING I ACCT2020 3 ACCT ACCT3333 ACCT2220
ACCOUNTING II ACCT2220 3 ACCT ACCT3333 ACCT2220
MANAGERIAL FINANCE ACCT3333 3 ACCT ACCT3333 ACCT2220
ACCOUNTING INFO SYST ACCT3464 3 ACCT ACCT3333 ACCT2220
INTRO TO CHEMISTRY CHEM2001 3 CHEM ACCT3333 ACCT2220
ORGANIC CHEMISTRY CHEM3001 3 CHEM ACCT3333 ACCT2220
INTRO TO COMPUTER SC COSC1310 4 COSC ACCT3333 ACCT2220
TURBO PASCAL COSC2025 3 COSC ACCT3333 ACCT2220
ADVANCED COBOL COSC2303 3 COSC ACCT3333 ACCT2220
DATA STRUCTURES COSC3320 4 COSC ACCT3333 ACCT2220
.
.
.
(384 row(s) affected)
As we pointed out earlier, before combining tables, it is a good idea to get a count
of the number of rows one might expect. This can be done by:
SELECT COUNT(*) AS [COUNT OF CARTESIAN]
FROM Course c, Prereq p
which produces the following output:
COUNT OF CARTESIAN
------------------
384
(1 row(s) affected)
From these results, we can see that the results of a Cartesian "join" will be a
relation, say Q, which will have n*m rows (where n is the number of rows from the first relation, and m is the number of rows from the second relation). In the preceding
example, the result set has 384 rows (32 times 12), with all possible combinations of rows from the
Course table and the Prereq table. If we compare these results with the results of
the earlier query (with the WHERE clause), we can see that both the results have the same
structure, but the earlier one has been row-filtered by the WHERE clause to include only
those rows where there is equality between Course.course_number and
Prereq.course_number. Put another way, the earlier results make more sense because they
present only those rows that correspond to one another. In this example, the Cartesian product
produces extra, meaningless rows.
Oftentimes, the Cartesian product is the result of a user having forgotten to use an
appropriate WHERE clause in the SELECT statement when formulating a join using the
WHERE format. Note that if the JOIN or INNER JOIN syntax (ANSI
JOIN syntax) is used, one cannot avoid the ON clause (no ON clause
produces a syntax error). Hence, producing a Cartesian product inadvertently in SQL Server using the
JOIN/INNER JOIN is much harder to do.
4.2.1. Uses of the Cartesian Product
Though the Cartesian product is generally regarded as not so useful in SQL per se, if
harnessed properly, a Cartesian product can be used to produce exceptionally useful result sets, for
example:
-
The Cartesian product can be used to generate sample or test data.
-
The simplest Cartesian product of two sets is a two-dimensional table or a
cross-tabulation whose cells may be used to enter frequencies or to designate possibilities.
-
The Cartesian product is needed if you want a collection of all ordered n-tuples (rows with n columns) that can be formed so
that they contain one element of the first set, one element of the second set, . . ., and
one element of the nth set. For example, if set (or table)
X is the 13-element set { A, K, Q, J, 10, 9, 8, 7, 6, 5, 4,
3, 2} and set (or table) Y is the 4-element set {spades,
hearts, diamonds, clubs}, then the Cartesian product of those two sets is the 52-element set
{ (A, spades), (K, spades), . . ., (2, spades), (A, hearts), . . ., (3, clubs), (2, clubs)
}.
4.2.2. CROSS JOIN Used to Generate a Cartesian Product
In SQL Server, a CROSS JOIN can be used to return a Cartesian product of two
tables. The form of the CROSS JOIN is:
SELECT *
FROM Table1 CROSS JOIN Table2
Using our database, Student_course, the following CROSS JOIN would
produce the same result (Cartesian product) as the query (without the WHERE clause) used in
the earlier section:
SELECT *
FROM Course CROSS JOIN Prereq p
|