5.3. Other Functions
This section discusses some other useful functions, such as TOP,
TOP with PERCENT, and DISTINCT. These functions help us in selecting rows
from a larger set of rows.
5.3.1. The TOP Function
This function returns a certain number of rows. Often, the TOP function is
used to display or return from a result set the rows that fall at the top of a range specified by an
ORDER BY clause. Suppose you want the names of the "top 2" (first two) employees with the
lowest wages from the Employee table (top 2 refers to the results in the first two
rows). You would type:
SELECT TOP 2 names, wage
FROM Employee
ORDER BY wage ASC
This query would produce the following output:
names wage
--------------- ------------
Ed Evans NULL
Sumon Bagui 10.00
(2 row(s) affected)
To get this output, first the wage column was ordered in ascending order,
and then the "top" two wages were selected from that ordered result set. The columns with the null
wages are placed first with the ascending (ASC) command.
With the TOP command, if you do not include the ORDER BY clause
(and the table has no primary key), the query will return rows based on the order in which the rows
appear in the table (probably, but not guaranteed to be, the order in which the rows were entered in
the table). For example, the following query does not include the ORDER BY clause:
SELECT TOP 2 names, wage
FROM Employee
And this query returns the following output:
names wage
--------------- ------------
Sumon Bagui 10.00
Sudip Bagui 15.00
(2 row(s) affected)
Remember that in relational database, you can never depend on where rows in a table
are. Tables are sets of rows and at times the database engines may insert rows in unoccupied
physical spaces. You should never count on retrieving rows in some order and always use ORDER
BY if you desire an ordering.
5.3.1.1. Handling the "BOTTOM"
Since there is only a TOP command, and no similar BOTTOM command,
if you want to get the "bottom" two employees meaning, the employees with the highest wages (the
values in the last two ordered rows) instead of the top two employees from the Employee
table, the top two employees (the highest wages) would have to be selected from the table ordered in
descending order, as follows:
SELECT TOP 2 names, wage
FROM Employee
ORDER BY wage DESC
This query would produce the following output:
names wage
--------------- ------------
Genny George 20.00
Priyashi Saha 18.00
(2 row(s) affected)
5.3.1.2. Handling a tie
This section answers an interesting questionwhat if there is a tie? For example, what
if you are looking for the top two wages, and two employees have the same amount in the
wage column? To handle ties, SQL Server has a WITH TIES option that can be used
with the TOP function.
To demonstrate WITH TIES, make one change in the data in your
Employee table, so that the value in the wage column of Sudip Bagui is
also 10, as shown here:
names wage hours
--------------- ------------ ------------
Sumon Bagui 10.0000 40
Sudip Bagui 10.0000 30
Priyashi Saha 18.0000 NULL
Ed Evans NULL 10
Genny George 20.0000 40
(5 row(s) affected)
You can use the following UPDATE statement to make the change in the
Employee table:
UPDATE Employee
SET WAGE = 10
WHERE names LIKE '%Sudip%'
 |
The LIKE operator is explained later in the chapter.
|
|
 |
You can also make this change in the Employee table
by right-clicking on the table from your Object Explorer and selecting Open
Table and changing the data.
|
|
Now type the following query:
SELECT TOP
2 WITH TIES names, wage
FROM Employee
ORDER BY wage ASC
Although you requested only the TOP 2 employees, this query produced three
rows, because there was a tie in the column that you were looking for (and you used with the
WITH TIES option), as shown by the following output:
names wage
--------------- ------------
Ed Evans NULL
Sumon Bagui 10.00
Sudip Bagui 10.00
(3 row(s) affected)
The WITH TIES option is not allowed without a corresponding ORDER
BY clause.
 |
Remember to change the data in your Employee table
back to its original state if you are doing the exercises as you read the
material.
|
|
5.3.2. The TOP Function with PERCENT
PERCENT returns a certain percentage of rows that fall at the top of a
specified range. For example, the following query returns the top 10 percent (by count) of the
student names from the Student table based on the order of names:
SELECT TOP 10 PERCENT sname
FROM Student
ORDER BY sname ASC
This query produces the following output:
sname
--------------------
Alan
Benny
Bill
Brad
Brenda
(5 row(s) affected)
Again, there is no BOTTOM PERCENT function, so in order to get the bottom 10
percent, you would have to order the sname column in descending order and then select the
top 10 percent, as follows:
SELECT TOP 10 PERCENT sname
FROM Student
ORDER BY sname DESC
This query would produce the following output:
sname
--------------------
Zelda
Thornton
Susan
Steve
Stephanie
(5 row(s) affected)
Note that the query can be used without the ORDER BY, but because the rows
are unordered, the result is simply a sample of the first 10 percent of the data drawn from the
table. Here is the same query without the use of the ORDER BY:
SELECT TOP 10 PERCENT sname
FROM Student
As output, this query returns the first 10 percent of the names based on the number
of rows. But, as the rows are unordered (and there is no primary key in this table), your output
would depend on where in the database these rows reside:
sname
--------------------
Lineas
Mary
Zelda
Ken
Mario
(5 row(s) affected)
Once again, ties in this section could be handled in the same way as they were
handled in the preceding section, with the WITH TIES option as shown:
SELECT TOP 10 PERCENT WITH TIES sname
FROM Student
ORDER BY sname DESC
 |
The WITH TIES option cannot be used without a
corresponding ORDER BY clause.
|
|
5.3.3. The DISTINCT Function
The DISTINCT function omits rows in the result set that contain duplicate
data in the selected columns. For example, to SELECT all grades from the
Grade_report table, you could type:
SELECT grade
FROM Grade_report
This query results in 209 rows, all the grades in the Grade_report table.
To SELECT all distinct grades from the
Grade_report table, you would type:
SELECT DISTINCT grade
FROM Grade_report
The result set would look like this:
grade
-----
NULL
A
B
C
D
F
(6 row(s) affected)
Observe that the syntax requires you to put the word DISTINCT first in the
string of attributes, because DISTINCT implies distinct rows in the result set. The
preceding statement also produces a row for null grades (regarded here as a DISTINCT
grade). Note also that the result set is sorted (ordered). The fact that the result set is sorted
could cause some response inefficiency in larger table queries.
5.3.3.1. Using DISTINCT with other aggregate functions
In SQL Server , DISTINCT can also be used as an option with aggregate
functions like COUNT, SUM and AVG. For example, to count the distinct
grades from the Grade_report table, we can type:
SELECT "Count of distinct grades" = COUNT(DISTINCT(grade))
FROM Grade_report
This query will give:
Count of distinct grades
-----------------------
5
Warning: Null value is eliminated by an aggregate or other SET operation.
(1 row(s) affected)
Because an aggregate function, COUNT, is being used here with an argument,
NULL values are not included in this result set.
As another example, to sum the distinct wages from the Employee table, we
can type:
SELECT "Sum of distinct wages" = SUM(DISTINCT(wage))
FROM Employee
This query will give:
Sum of distinct wages
---------------------
63.00
Warning: Null value is eliminated by an aggregate or other SET operation.
(1 row(s) affected)
|