Section 5.3. Other Functions

5.3. Other Functions

This section discusses some other useful functions, such as TOP, TOP with PERCENT, and DISTINCT. These functions help us in selecting rows from a larger set of rows.

5.3.1. The TOP Function

This function returns a certain number of rows. Often, the TOP function is used to display or return from a result set the rows that fall at the top of a range specified by an ORDER BY clause. Suppose you want the names of the "top 2" (first two) employees with the lowest wages from the Employee table (top 2 refers to the results in the first two rows). You would type:

    SELECT TOP 2 names, wage
    FROM   Employee
    ORDER  BY wage ASC

This query would produce the following output:

    names           wage
    --------------- ------------
    Ed Evans        NULL
    Sumon Bagui     10.00

    (2 row(s) affected)

To get this output, first the wage column was ordered in ascending order, and then the "top" two wages were selected from that ordered result set. The columns with the null wages are placed first with the ascending (ASC) command.

With the TOP command, if you do not include the ORDER BY clause (and the table has no primary key), the query will return rows based on the order in which the rows appear in the table (probably, but not guaranteed to be, the order in which the rows were entered in the table). For example, the following query does not include the ORDER BY clause:

    SELECT TOP 2 names, wage
    FROM   Employee

And this query returns the following output:

    names           wage
    --------------- ------------
    Sumon Bagui     10.00
    Sudip Bagui     15.00

    (2 row(s) affected)

Remember that in relational database, you can never depend on where rows in a table are. Tables are sets of rows and at times the database engines may insert rows in unoccupied physical spaces. You should never count on retrieving rows in some order and always use ORDER BY if you desire an ordering.

5.3.1.1. Handling the "BOTTOM"

Since there is only a TOP command, and no similar BOTTOM command, if you want to get the "bottom" two employees meaning, the employees with the highest wages (the values in the last two ordered rows) instead of the top two employees from the Employee table, the top two employees (the highest wages) would have to be selected from the table ordered in descending order, as follows:

    SELECT TOP 2 names, wage
    FROM   Employee
    ORDER  BY wage DESC

This query would produce the following output:

    names           wage
    --------------- ------------
    Genny George    20.00
    Priyashi Saha   18.00

    (2 row(s) affected)

5.3.1.2. Handling a tie

This section answers an interesting questionwhat if there is a tie? For example, what if you are looking for the top two wages, and two employees have the same amount in the wage column? To handle ties, SQL Server has a WITH TIES option that can be used with the TOP function.

To demonstrate WITH TIES, make one change in the data in your Employee table, so that the value in the wage column of Sudip Bagui is also 10, as shown here:

    names           wage         hours
    --------------- ------------ ------------
    Sumon Bagui     10.0000      40
    Sudip Bagui     10.0000      30
    Priyashi Saha   18.0000      NULL
    Ed Evans        NULL         10
    Genny George    20.0000      40

    (5 row(s) affected)

You can use the following UPDATE statement to make the change in the Employee table:

    UPDATE Employee
    SET WAGE = 10
    WHERE names LIKE '%Sudip%'

The LIKE operator is explained later in the chapter.

You can also make this change in the Employee table by right-clicking on the table from your Object Explorer and selecting Open Table and changing the data.

Now type the following query:

    SELECT TOP 
 2 WITH TIES names, wage
    FROM   Employee
    ORDER  BY wage ASC

Although you requested only the TOP 2 employees, this query produced three rows, because there was a tie in the column that you were looking for (and you used with the WITH TIES option), as shown by the following output:

    names           wage
    --------------- ------------
    Ed Evans        NULL
    Sumon Bagui     10.00
    Sudip Bagui     10.00

    (3 row(s) affected)

The WITH TIES option is not allowed without a corresponding ORDER BY clause.

Remember to change the data in your Employee table back to its original state if you are doing the exercises as you read the material.

5.3.2. The TOP Function with PERCENT

PERCENT returns a certain percentage of rows that fall at the top of a specified range. For example, the following query returns the top 10 percent (by count) of the student names from the Student table based on the order of names:

    SELECT TOP 10 PERCENT sname
    FROM   Student
    ORDER  BY sname ASC

This query produces the following output:

    sname
    --------------------
    Alan
    Benny
    Bill
    Brad
    Brenda

    (5 row(s) affected)

Again, there is no BOTTOM PERCENT function, so in order to get the bottom 10 percent, you would have to order the sname column in descending order and then select the top 10 percent, as follows:

    SELECT TOP 10 PERCENT sname
    FROM   Student
    ORDER  BY sname DESC

This query would produce the following output:

    sname
    --------------------
    Zelda
    Thornton
    Susan
    Steve
    Stephanie

    (5 row(s) affected)

Note that the query can be used without the ORDER BY, but because the rows are unordered, the result is simply a sample of the first 10 percent of the data drawn from the table. Here is the same query without the use of the ORDER BY:

    SELECT TOP 10 PERCENT sname
    FROM   Student

As output, this query returns the first 10 percent of the names based on the number of rows. But, as the rows are unordered (and there is no primary key in this table), your output would depend on where in the database these rows reside:

    sname
    --------------------
    Lineas
    Mary
    Zelda
    Ken
    Mario

    (5 row(s) affected)

Once again, ties in this section could be handled in the same way as they were handled in the preceding section, with the WITH TIES option as shown:

    SELECT TOP 10 PERCENT WITH TIES sname
    FROM   Student
    ORDER  BY sname DESC

The WITH TIES option cannot be used without a corresponding ORDER BY clause.

5.3.3. The DISTINCT Function

The DISTINCT function omits rows in the result set that contain duplicate data in the selected columns. For example, to SELECT all grades from the Grade_report table, you could type:

    SELECT grade
    FROM   Grade_report

This query results in 209 rows, all the grades in the Grade_report table.

To SELECT all distinct grades from the Grade_report table, you would type:

    SELECT DISTINCT grade
    FROM   Grade_report

The result set would look like this:

    grade
    -----
    NULL
    A
    B
    C
    D
    F

    (6 row(s) affected)

Observe that the syntax requires you to put the word DISTINCT first in the string of attributes, because DISTINCT implies distinct rows in the result set. The preceding statement also produces a row for null grades (regarded here as a DISTINCT grade). Note also that the result set is sorted (ordered). The fact that the result set is sorted could cause some response inefficiency in larger table queries.

5.3.3.1. Using DISTINCT with other aggregate functions

In SQL Server , DISTINCT can also be used as an option with aggregate functions like COUNT, SUM and AVG. For example, to count the distinct grades from the Grade_report table, we can type:

    SELECT "Count of distinct grades" = COUNT(DISTINCT(grade))
    FROM    Grade_report

This query will give:

    Count of distinct grades
    -----------------------
    5

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (1 row(s) affected)

Because an aggregate function, COUNT, is being used here with an argument, NULL values are not included in this result set.

As another example, to sum the distinct wages from the Employee table, we can type:

    SELECT "Sum of distinct wages" = SUM(DISTINCT(wage))
    FROM    Employee

This query will give:

    Sum of distinct wages
    ---------------------
    63.00

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (1 row(s) affected)