Previous Page
Next Page

5.4. String Functions

SQL Server has several functions that operate on strings; for example, functions for the extraction of part of a string, functions to find the length of a string, functions to find matching characters in strings, etc. In this section, we explore some of these common and useful string functions. String functions are not aggregatesthey are row-level functions, as they operate on one value in one row at a time. String functions are read-only functions and will not change the underlying data in the database unless UPDATEs are performed. We start our discussion of string functions with string concatenation .

5.4.1. String Concatenation

String manipulations often require concatenation, which means to connect things together. In this section we look at the string concatenation operator available in SQL Server , the +.

To see an example of concatenation, using the Employee table, we will first list the names of the employees using the following statement:

    SELECT names
    FROM   Employee

This query produces the following output:

    names
    ---------------
    Sumon Bagui
    Sudip Bagui
    Priyashi Saha
    Ed Evans
    Genny George

    (5 row(s) affected)

Now, suppose you would like to concatenate each of the names with ", Esq." Type the following:

    SELECT names + ', Esq.' AS [Employee Names]
    FROM   Employee

This query produces:

    Employee Names
    ---------------------
    Sumon Bagui, Esq.
    Sudip Bagui, Esq.
    Priyashi Saha, Esq.
    Ed Evans, Esq.
    Genny George, Esq.

    (5 row(s) affected)

As another example, suppose you want to add a series of dots (.....) to the left side of the names column. You would type:

    SELECT ('.....'+ names) AS [Employee Names]
    FROM   Employee

to produce the following result set:

    Employee Names
    --------------------
    .....Sumon Bagui
    .....Sudip Bagui
    .....Priyashi Saha
    .....Ed Evans
    .....Genny George

    (5 row(s) affected)

Similarly, to add ..... to the right side of names column, type:

    SELECT (names + '.....') AS [Employee Names]
    FROM    Employee

This query returns:

    Employee Names
    --------------------
    Sumon Bagui.....
    Sudip Bagui.....
    Priyashi Saha.....
    Ed Evans.....
    Genny George.....

    (5 row(s) affected)

5.4.2. String Extractors

SQL has several string extractor functions. This section briefly describes some of the more useful string extractors, like SUBSTRING, LEFT, RIGHT, LTRIM, RTRIM, and CHARINDEX. Now suppose (again) that the Employee table has the following data:

    names           wage         hours
    --------------- ------------ -----------
    Sumon Bagui     10.0000      40
    Sudip Bagui     15.0000      30
    Priyashi Saha   18.0000      NULL
    Ed Evans        NULL         10
    Genny George    20.0000      40

    (5 row(s) affected)

And suppose you want to display the names in the following format:

    Employee Names
    ------------------------
    Sumon, B.
    Sudip, B.
    Priyashi, S.
    Ed, E.
    Genny, G.

    (5 row(s) affected)

You can achieve this output by using a combination of the string functions to break down names into parts, re-assemble (concatenate) those parts, and then concatenate a comma and period in their respective (appropriate) locations. Before we completely solve this particular problem, in the next few sections we will explain the string functions that you will need to get this output. Then we will show you how to get this result.

5.4.2.1. The SUBSTRING function

The SUBSTRING function returns part of a string. Following is the format for the SUBSTRING function:

    SUBSTRING(stringexpression, startposition, length)

stringexpression is the column that we will be using, startposition tells SQL Server where in the stringexpression to start retrieving characters from, and length tells SQL Server how many characters to extract. All three parameters are required in SQL Server's SUBSTRING function. For example, type the following:

    SELECT names, SUBSTRING(names,2,4) AS [middle of names]
    FROM   Employee

This query returns:

    names           middle of names
    --------------- ---------------
    Sumon Bagui     umon
    Sudip Bagui     udip
    Priyashi Saha   riya
    Ed Evans        d Ev
    Genny George    enny

    (5 row(s) affected)

SUBSTRING(names,2,4) started from the second position in the column, names, and extracted four characters starting from position 2.

Strings in SQL Server are indexed from 1. If you start at position 0, the following query will show you what you will get:

    SELECT names, "first letter of names" = SUBSTRING(names,0,2)
    FROM   Employee

You will get:

    names           first letter of names
    --------------- ---------------------
    Sumon Bagui     S
    Sudip Bagui     S
    Priyashi Saha   P
    Ed Evans        E
    Genny George    G

    (5 row(s) affected)

In the previous output, we got the first letter of the names because the SUBSTRING function started extracting characters starting from position zero (the position before the first letter) and went two character positionswhich picked up the first letter of the names field.

We could have also achieved the same output with:

    SELECT names, "first letter of names" = SUBSTRING(names,1,1)
    FROM   Employee

Here the SUBSTRING function would start extracting characters starting from position 1 and go only one character position, hence ending up with only one characterwhich picks up the first letter of the names field.

SQL Server's SUBSTRING function actually allows you to start at a negative position relative to the string. For example, if you typed:

    SELECT names, "first letter of names" = SUBSTRING(names,-1,3)
    FROM   Employee

You would get the same output as the previous query also, because you are starting two positions before the first character of names, and going three character places, so you get the first letter of the name.

5.4.2.2. The LEFT and RIGHT functions

These functions return a portion of a string, starting from either the left or right side of stringexpression. Following are the general formats for the LEFT and RIGHT functions respectively:

    LEFT(stringexpression, n)

Or:

    RIGHT(stringexpression, n)

The LEFT function starts from the LEFT of the stringexpression or column and returns n characters, and the RIGHT function starts from the right of the stringexpression or column and returns n characters.

For example, to get the first three characters from the names column, type:

    SELECT names, LEFT(names,3) AS [left]
    FROM   Employee

This query produces:

    names           left
    --------------- ----
    Sumon Bagui     Sum
    Sudip Bagui     Sud
    Priyashi Saha   Pri
    Ed Evans        Ed
    Genny George    Gen

    (5 row(s) affected)

To get the last three characters from the names column (here the count will start from the right of the column, names), type:

    SELECT names, RIGHT(names,3) AS [right]
    FROM   Employee

This query produces:

    names           right
    --------------- -------
    Sumon Bagui     gui
    Sudip Bagui     gui
    Priyashi Saha   aha
    Ed Evans        ans
    Genny George    rge

    (5 row(s) affected)

5.4.2.3. The LTRIM and RTRIM functions

LTRIM removes blanks from the beginning (left) of a string. For example, if three blank spaces appear to the left of a string such as ' Ranu', you can remove the blank spaces with the following query:

    SELECT LTRIM('  Ranu') AS names

which produces:

    names
    -------
    Ranu

    (1 row(s) affected)

It does not matter how many blank spaces precede the non-blank character. All leading blanks will be excised.

Similarly, RTRIM removes blanks from the end (right) of a string. For example, if blank spaces appear to the right of Ranu in the names column, you could remove the blank spaces using the RTRIM, and then concatenate "Saha" with the + sign, as shown here:

    SELECT RTRIM('Ranu   ') + ' Saha' AS names

This query produces:

    names
    ------------
    Ranu Saha

    (1 row(s) affected)

5.4.2.4. The CHARINDEX function

The CHARINDEX function returns the starting position of a specified pattern. For example, if we wish to find the position of a space in the employee names in the Employee table, we could type:

    SELECT names, "Position of Space in Employee Names" = CHARINDEX(' ',names)
    FROM   Employee

This query would give:

    names           Position of Space in Employee Names
    --------------- -----------------------------------
    Sumon Bagui     6
    Sudip Bagui     6
    Priyashi Saha   9
    Ed Evans        3
    Genny George    6

    (5 row(s) affected)

In Oracle, CHARINDEX is called INSTR.


Now that you know how to use quite a few string extractor functions, you can combine them to produce the following output, which will require a nesting of string functions:

    Employee Names
    ------------------------
    Sumon, B.
    Sudip, B.
    Priyashi, S.
    Ed, E.
    Genny, G.

    (5 row(s) affected)

Following is the query to achieve the preceding output:

    SELECT "Employee Names" = SUBSTRING(names,1,CHARINDEX(' ',names)-1) + ', ' +
    SUBSTRING(names, CHARINDEX(' ',names)+1,1) + '.'
    FROM    Employee

In this query, we get the first name with the SUBSTRING(names,1,CHARINDEX(' ',names)-1) portion. SUBSTRING begins in the first position of names. CHARINDEX(' ',names) finds the first space. We need only the characters up to the first space, so we use CHARINDEX(' ',names) -1. We then concatenate the comma and a space with + (', ' ). Then, to extract the first character after the first space in the original names column, we use SUBSTRING(names, CHARINDEX(' ',names)+1,1), followed by concatenation of a period.

To display the names in a more useful mannerthat is, the last name, comma, and then the first initialwe would have to use the following query:

    SELECT "Employee Names" = SUBSTRING(names, (CHARINDEX(' ',names)+1 ), (CHARINDEX(' ',
    names))) + ', ' + SUBSTRING(names,1,1) + '.'
    FROM    Employee

which would produce the following output:

    Employee Names
    ------------------------
    Bagui, S.
    Bagui, S.
    Saha, P.
    Eva, E.
    George, G.

    (5 row(s) affected)

In this query, we get the last name with SUBSTRING(names, (CHARINDEX(' ',names)+1 ), (CHARINDEX(' ', names))). The SUBSTRING begins at the space and picks up the rest of the characters after the space. Then a comma and a space are concatenated, and then the first letter of the first name and a period are concatenated.

5.4.3. The UPPER and LOWER Functions

To produce all the fields in the result set (output) in uppercase or in lowercase, you can use the UPPER or LOWER functions. For example, to produce all the names in the Employee table in uppercase, type:

    SELECT UPPER(names) AS [NAMES IN CAPS]
    FROM   Employee

This query produces the following output:

    NAMES IN CAPS
    ------------------------
    SUMON BAGUI
    SUDIP BAGUI
    PRIYASHI SAHA
    ED EVANS
    GENNY GEORGE

    (5 row(s) affected)

To produce all the names in lowercase, you would type:

    SELECT LOWER(names) AS [NAMES IN SMALL]
    FROM   Employee

To further illustrate the nesting of functions, and to produce, in all uppercase, the first name followed by the first letter of the last name, type:

    SELECT "Employee Names" = UPPER(SUBSTRING(names,1,CHARINDEX(' ',names)-1)) + ', ' +
    SUBSTRING(names,CHARINDEX(' ',names)+1,1) + '.'
    FROM    Employee

This query produces the following output:

    Employee Names
    -----------------------------------
    SUMON, B.
    SUDIP, B.
    PRIYASHI, S.
    ED, E.
    GENNY, G.

    (5 row(s) affected)

5.4.4. The LEN Function

The LEN function returns the length (number of characters) of a desired string excluding trailing blanks. For example, to list the lengths of the full names (including any spaces) in the Employee table, type:

    SELECT names, LEN(names) AS [Length of Names]
    FROM   Employee

This query produces the following output:

    names           Length of Names
    --------------- ---------------
    Sumon Bagui     11
    Sudip Bagui     11
    Priyashi Saha   13
    Ed Evans        8
    Genny George    12

    (5 row(s) affected)

5.4.5. Matching Substrings Using LIKE

Often we want to use part of a string as a condition in a query. For example, consider the Section table (from our Student_course database), which has the following data:

    SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG   ROOM
    ---------- ---------- -------- ---- ---------- ------ ------
    85         MATH2410   FALL     98   KING       36     123
    86         MATH5501   FALL     98   EMERSON    36     123
    87         ENGL3401   FALL     98   HILLARY    13     101
    .
    .
    .

We might want to know something about Math coursescourses with the prefix MATH. In this situation, we need an operator that can determine whether a substring exists in an attribute. Although we have seen how to handle this type of question with both the SUBSTRING and CHARINDEX functions, another common way to handle this situation in a WHERE clause is by using the LIKE function.

Using LIKE as an "existence" match entails finding whether a character string exists in a string or valueif the string exists, the row is SELECTed for inclusion in the result set. Again of course, we could use SUBSTRING and/or CHARINDEX for this, but LIKE is a powerful, common and flexible alternative. This existence-type of the LIKE query is useful when the position of the character string sought may be in various places in the substring. SQL Server uses the wildcard character, %, at the beginning or end of a LIKE-string, when looking for the existence of substrings. For example, suppose we want to find all names that have "Smith" in our Student table, type the following:

    SELECT *
    FROM   Student
    WHERE  sname = 'SMITH'

which produces the following output:

    STNO  SNAME        MAJOR  CLASS  BDATE
    ----- ----------- ------  -----  -------------------------------
    88    Smith        NULL   NULL   10/15/1979 12:00:00 AM

    (1 row(s) affected)

Note that the case (upper or lower) in the statement WHERE sname = 'SMITH' does not matter, because SQL Server is handled as if it is all uppercase (this is by default, and can be changed), although it is displayed in mixed case (and even if it had been entered in mixed case). In other words, we can say that data in SQL Server is not case-sensitive by default.

To count how many people have a name of "Smith," type:

    SELECT COUNT(*) AS Count
    FROM   Student
    WHERE  sname = 'Smith'

which produces:

    Count
    -----------
    1

    (1 row(s) affected)

5.4.5.1. Using the wildcard character with LIKE

The percentage sign (%) is SQL Server's wildcard character. For example, if we wanted to find all the names that had some form of "Smith" in their names from the Student table, we would use % on both ends of "Smith," as shown here:

    SELECT *
    FROM   Student
    WHERE  sname LIKE '%Smith%'

This query produces the following output, showing any "Smith" pattern in sname:

    STNO   SNAME                MAJOR CLASS  BDATE
    ------ -------------------- ----- ------ -----------------------
    88     Smith                NULL  NULL   1979-10-15 00:00:00
    147    Smithly              ENGL  2      1980-05-13 00:00:00
    151    Losmith              CHEM  3      1981-01-15 00:00:00

    (3 row(s) affected)

To find any pattern starting with "Smith" from the Student table, you would type:

    SELECT *
    FROM   Student
    WHERE  sname LIKE 'Smith%'

This query would produce:

    STNO   SNAME                MAJOR CLASS  BDATE
    ------ -------------------- ----- ------ -----------------------
    88     Smith                NULL  NULL   1979-10-15 00:00:00
    147    Smithly              ENGL  2      1980-05-13 00:00:00

    (2 row(s) affected)

By default, it is not necessary to use UPPER or LOWER before sname in the previous query since data in SQL Server 2005 is not case sensitive. You can change this however, by changing SQL Server's database configurations.


To find the Math courses (any course_num starting with MATH) from the Section table, you could pose a wildcard match with a LIKE as follows:

    SELECT *
    FROM   Section
    WHERE  course_num LIKE 'MATH%'

This query would produce the following output:

    SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG  ROOM
    ---------- ---------- -------- ---- ---------- ----- -----
    85         MATH2410   FALL     98   KING       36    123
    86         MATH5501   FALL     98   EMERSON    36    123
    107        MATH2333   SPRING   00   CHANG      36    123
    109        MATH5501   FALL     99   CHANG      36    123
    112        MATH2410   FALL     99   CHANG      36    123
    158        MATH2410   SPRING   98   NULL       36    123

    (6 row(s) affected)

5.4.5.2. Finding a range of characters

SQL Server allows some POSIX-compliant regular expression patterns in LIKE clauses. We will illustrate some of these extensions for pattern matching.

LIKE can be used to find a range of characters. For example, to find all grades between C and F in the Grade_report table, type:

    SELECT DISTINCT student_number, grade
    FROM   Grade_report
    WHERE  grade LIKE '[c-f]'
    AND    student_number > 100

This query produces 15 rows of output:

    student_number grade
    -------------- -----
    125            C
    126            C
    127            C
    128            F
    130            C
    131            C
    145            F
    147            C
    148            C
    151            C
    153            C
    158            C
    160            C
    161            C
    163            C

    (15 row(s) affected)

By default, note that LIKE is also case-insensitive. You can change this, however, by changing SQL Server's database configurations.


To find all grades from the Grade_report table that are not between C and F, we use a caret (^) before the range we do not want to find:

    SELECT DISTINCT student_number, grade
    FROM   Grade_report
    WHERE  grade LIKE '[^c-f]'
    AND    student_number > 100

This query produces the following 21 rows of output:

    student_number grade
    -------------- -----
    121            B
    122            B
    123            A
    123            B
    125            A
    125            B
    126            A
    126            B
    127            A
    127            B
    129            A
    129            B
    132            B
    142            A
    143            B
    144            B
    146            B
    147            B
    148            B
    155            B
    157            B

    (21 row(s) affected)

As another example, to find all the courses from the Section table that start with "C," but do not have "h" as the second character, we could type:

    SELECT *
    FROM   Section
    WHERE  course_num LIKE 'C[^h]%'

This query would give the following 10 rows of output:

    SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG  ROOM
    ---------- ---------- -------- ---- ---------- ----- -----
    90         COSC3380   SPRING   99   HARDESTY   79    179
    91         COSC3701   FALL     98   NULL       79    179
    92         COSC1310   FALL     98   ANDERSON   79    179
    93         COSC1310   SPRING   99   RAFAELT    79    179
    96         COSC2025   FALL     98   RAFAELT    79    179
    98         COSC3380   FALL     99   HARDESTY   79    179
    102        COSC3320   SPRING   99   KNUTH      79    179
    119        COSC1310   FALL     99   ANDERSON   79    179
    135        COSC3380   FALL     99   STONE      79    179
    145        COSC1310   SPRING   99   JONES      79    179

    (10 row(s) affected)

5.4.5.3. Finding a particular character

To find a particular character using LIKE, we would place the character in square brackets []. For example, to find all the names from the Student table that begin with a B or G and end in "ill," we could type:

    SELECT sname
    FROM    Student
    WHERE sname LIKE '[BG]ill'

We would get:

    sname
    --------------------
    Bill

    (1 row(s) affected)

5.4.5.4. Finding a single character or single digitthe underscore wildcard character

A single character or digit can be found in a particular position in a string by using an underscore, _, for the wildcard in that position in the string. For example, to find all students with student_numbers in the 130s (130...139) range from the Student table, type:

    SELECT DISTINCT student_number, grade
    FROM   Grade_report
    WHERE  student_number LIKE '13_'

This query would produce the following:

    student_number grade
    -------------- -----
    130            C
    131            C
    132            B

    (3 row(s) affected)

5.4.5.5. Using NOT LIKE

In SQL Server , the LIKE operator can be negated with the NOT. For example, to get a listing of the non math courses and the courses that do not start in "C" from the Section table, we would type:

    SELECT *
    FROM   Section
    WHERE  course_num NOT LIKE 'MATH%'
    AND    Course_num NOT LIKE 'C%'

This query would give the following 14 rows of output:

    SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG   ROOM
    ---------- ---------- -------- ---- ---------- ------ ------
    87         ENGL3401   FALL     98   HILLARY    13     101
    88         ENGL3520   FALL     99   HILLARY    13     101
    89         ENGL3520   SPRING   99   HILLARY    13     101
    94         ACCT3464   FALL     98   RODRIGUEZ  74     NULL
    95         ACCT2220   SPRING   99   RODRIQUEZ  74     NULL
    97         ACCT3333   FALL     99   RODRIQUEZ  74     NULL
    99         ENGL3401   FALL     99   HILLARY    13     101
    100        POLY1201   FALL     99   SCHMIDT    NULL   NULL
    101        POLY2103   SPRING   00   SCHMIDT    NULL   NULL
    104        POLY4103   SPRING   00   SCHMIDT    NULL   NULL
    126        ENGL1010   FALL     98   HERMANO    13     101
    127        ENGL1011   SPRING   99   HERMANO    13     101
    133        ENGL1010   FALL     99   HERMANO    13     101
    134        ENGL1011   SPRING   00   HERMANO    13     101

    (14 row(s) affected)


Previous Page
Next Page