5.4. String Functions
SQL Server has several functions that operate on strings; for example, functions
for the extraction of part of a string, functions to find the length of a string, functions to find
matching characters in strings, etc. In this section, we explore some of these common and useful
string functions. String functions are not aggregatesthey are row-level functions, as they operate
on one value in one row at a time. String functions are read-only functions and will not change the
underlying data in the database unless UPDATEs are performed. We start our discussion of
string functions with string concatenation
.
5.4.1. String Concatenation
String manipulations often require concatenation,
which means to connect things together. In this section we look at the string concatenation operator
available in SQL Server , the +.
To see an example of concatenation, using the Employee table, we will first
list the names of the employees using the following statement:
SELECT names
FROM Employee
This query produces the following output:
names
---------------
Sumon Bagui
Sudip Bagui
Priyashi Saha
Ed Evans
Genny George
(5 row(s) affected)
Now, suppose you would like to concatenate each of the names with ", Esq." Type the
following:
SELECT names + ', Esq.' AS [Employee Names]
FROM Employee
This query produces:
Employee Names
---------------------
Sumon Bagui, Esq.
Sudip Bagui, Esq.
Priyashi Saha, Esq.
Ed Evans, Esq.
Genny George, Esq.
(5 row(s) affected)
As another example, suppose you want to add a series of dots (.....) to the
left side of the names column. You would type:
SELECT ('.....'+ names) AS [Employee Names]
FROM Employee
to produce the following result set:
Employee Names
--------------------
.....Sumon Bagui
.....Sudip Bagui
.....Priyashi Saha
.....Ed Evans
.....Genny George
(5 row(s) affected)
Similarly, to add ..... to the right side of names column, type:
SELECT (names + '.....') AS [Employee Names]
FROM Employee
This query returns:
Employee Names
--------------------
Sumon Bagui.....
Sudip Bagui.....
Priyashi Saha.....
Ed Evans.....
Genny George.....
(5 row(s) affected)
5.4.2. String Extractors
SQL has several string extractor functions. This section briefly describes some of
the more useful string extractors, like SUBSTRING,
LEFT, RIGHT, LTRIM, RTRIM, and CHARINDEX. Now suppose
(again) that the Employee table has the following data:
names wage hours
--------------- ------------ -----------
Sumon Bagui 10.0000 40
Sudip Bagui 15.0000 30
Priyashi Saha 18.0000 NULL
Ed Evans NULL 10
Genny George 20.0000 40
(5 row(s) affected)
And suppose you want to display the names in the following format:
Employee Names
------------------------
Sumon, B.
Sudip, B.
Priyashi, S.
Ed, E.
Genny, G.
(5 row(s) affected)
You can achieve this output by using a combination of the string functions to break
down names into parts, re-assemble (concatenate) those parts, and then concatenate a comma
and period in their respective (appropriate) locations. Before we completely solve this particular
problem, in the next few sections we will explain the string functions that you will need to get
this output. Then we will show you how to get this result.
5.4.2.1. The SUBSTRING function
The SUBSTRING function returns part of a string. Following is the format for
the SUBSTRING function:
SUBSTRING(stringexpression, startposition, length)
stringexpression is the column that we will be using,
startposition tells SQL Server where in the stringexpression to
start retrieving characters from, and length tells SQL Server how many characters to
extract. All three parameters are required in SQL Server's
SUBSTRING function. For example, type the following:
SELECT names, SUBSTRING(names,2,4) AS [middle of names]
FROM Employee
This query returns:
names middle of names
--------------- ---------------
Sumon Bagui umon
Sudip Bagui udip
Priyashi Saha riya
Ed Evans d Ev
Genny George enny
(5 row(s) affected)
SUBSTRING(names,2,4) started from the second position in the column,
names, and extracted four characters starting from position 2.
Strings in SQL Server are indexed from 1. If you start at position 0, the
following query will show you what you will get:
SELECT names, "first letter of names" = SUBSTRING(names,0,2)
FROM Employee
You will get:
names first letter of names
--------------- ---------------------
Sumon Bagui S
Sudip Bagui S
Priyashi Saha P
Ed Evans E
Genny George G
(5 row(s) affected)
In the previous output, we got the first letter of the names because the
SUBSTRING function started extracting characters starting from position zero (the position
before the first letter) and went two character positionswhich picked up the first letter of the
names field.
We could have also achieved the same output with:
SELECT names, "first letter of names" = SUBSTRING(names,1,1)
FROM Employee
Here the SUBSTRING function would start extracting characters starting from
position 1 and go only one character position, hence ending up with only one characterwhich picks up
the first letter of the names field.
SQL Server's SUBSTRING function actually allows you to start at a negative
position relative to the string. For example, if you typed:
SELECT names, "first letter of names" = SUBSTRING(names,-1,3)
FROM Employee
You would get the same output as the previous query also, because you are starting
two positions before the first character of names, and going three character places, so you
get the first letter of the name.
5.4.2.2. The LEFT and RIGHT functions
These functions return a portion of a string, starting from either the left or right
side of stringexpression. Following are the general formats for the LEFT
and RIGHT functions respectively:
LEFT(stringexpression, n)
Or:
RIGHT(stringexpression, n)
The LEFT function starts from the LEFT of the
stringexpression or column and returns n characters, and the
RIGHT function starts from the right of the stringexpression or column and
returns n characters.
For example, to get the first three characters from the names column, type:
SELECT names, LEFT(names,3) AS [left]
FROM Employee
This query produces:
names left
--------------- ----
Sumon Bagui Sum
Sudip Bagui Sud
Priyashi Saha Pri
Ed Evans Ed
Genny George Gen
(5 row(s) affected)
To get the last three characters from the names column (here the count will
start from the right of the column, names), type:
SELECT names, RIGHT(names,3) AS [right]
FROM Employee
This query produces:
names right
--------------- -------
Sumon Bagui gui
Sudip Bagui gui
Priyashi Saha aha
Ed Evans ans
Genny George rge
(5 row(s) affected)
5.4.2.3. The LTRIM and RTRIM functions
LTRIM removes blanks from the beginning (left) of a string. For example, if
three blank spaces appear to the left of a string such as ' Ranu', you can remove the blank
spaces with the following query:
SELECT LTRIM(' Ranu') AS names
which produces:
names
-------
Ranu
(1 row(s) affected)
It does not matter how many blank spaces precede the non-blank character. All leading
blanks will be excised.
Similarly, RTRIM removes blanks from the end (right) of a string. For
example, if blank spaces appear to the right of Ranu in the names column, you could remove
the blank spaces using the RTRIM, and then concatenate "Saha" with the + sign, as shown
here:
SELECT RTRIM('Ranu ') + ' Saha' AS names
This query produces:
names
------------
Ranu Saha
(1 row(s) affected)
5.4.2.4. The CHARINDEX function
The CHARINDEX function returns the starting position of a specified pattern.
For example, if we wish to find the position of a space in the employee names in the
Employee table, we could type:
SELECT names, "Position of Space in Employee Names" = CHARINDEX(' ',names)
FROM Employee
This query would give:
names Position of Space in Employee Names
--------------- -----------------------------------
Sumon Bagui 6
Sudip Bagui 6
Priyashi Saha 9
Ed Evans 3
Genny George 6
(5 row(s) affected)
 |
In Oracle, CHARINDEX is called INSTR.
|
|
Now that you know how to use quite a few string extractor functions, you can combine
them to produce the following output, which will require a nesting of string functions:
Employee Names
------------------------
Sumon, B.
Sudip, B.
Priyashi, S.
Ed, E.
Genny, G.
(5 row(s) affected)
Following is the query to achieve the preceding output:
SELECT "Employee Names" = SUBSTRING(names,1,CHARINDEX(' ',names)-1) + ', ' +
SUBSTRING(names, CHARINDEX(' ',names)+1,1) + '.'
FROM Employee
In this query, we get the first name with the SUBSTRING(names,1,CHARINDEX('
',names)-1) portion. SUBSTRING begins in the first position of names.
CHARINDEX(' ',names) finds the first space. We need only the characters up to the first
space, so we use CHARINDEX(' ',names) -1. We then concatenate the comma and a space with
+ (', ' ). Then, to extract the first character after the first space in the original
names column, we use SUBSTRING(names, CHARINDEX(' ',names)+1,1), followed by
concatenation of a period.
To display the names in a more useful mannerthat is, the last name, comma, and then
the first initialwe would have to use the following query:
SELECT "Employee Names" = SUBSTRING(names, (CHARINDEX(' ',names)+1 ), (CHARINDEX(' ',
names))) + ', ' + SUBSTRING(names,1,1) + '.'
FROM Employee
which would produce the following output:
Employee Names
------------------------
Bagui, S.
Bagui, S.
Saha, P.
Eva, E.
George, G.
(5 row(s) affected)
In this query, we get the last name with SUBSTRING(names, (CHARINDEX(' ',names)+1
), (CHARINDEX(' ', names))). The SUBSTRING begins at the space and picks
up the rest of the characters after the space. Then a comma and a space are concatenated, and then
the first letter of the first name and a period are concatenated.
5.4.3. The UPPER and LOWER Functions
To produce all the fields in the result set (output) in uppercase or in lowercase,
you can use the UPPER or LOWER functions. For example, to produce all the names in
the Employee table in uppercase, type:
SELECT UPPER(names) AS [NAMES IN CAPS]
FROM Employee
This query produces the following output:
NAMES IN CAPS
------------------------
SUMON BAGUI
SUDIP BAGUI
PRIYASHI SAHA
ED EVANS
GENNY GEORGE
(5 row(s) affected)
To produce all the names in lowercase, you would type:
SELECT LOWER(names) AS [NAMES IN SMALL]
FROM Employee
To further illustrate the nesting of functions, and to produce, in all uppercase, the
first name followed by the first letter of the last name, type:
SELECT "Employee Names" = UPPER(SUBSTRING(names,1,CHARINDEX(' ',names)-1)) + ', ' +
SUBSTRING(names,CHARINDEX(' ',names)+1,1) + '.'
FROM Employee
This query produces the following output:
Employee Names
-----------------------------------
SUMON, B.
SUDIP, B.
PRIYASHI, S.
ED, E.
GENNY, G.
(5 row(s) affected)
5.4.4. The LEN Function
The LEN function returns the length (number of characters) of a desired
string excluding trailing blanks. For example, to list the lengths of the full names (including any
spaces) in the Employee table, type:
SELECT names, LEN(names) AS [Length of Names]
FROM Employee
This query produces the following output:
names Length of Names
--------------- ---------------
Sumon Bagui 11
Sudip Bagui 11
Priyashi Saha 13
Ed Evans 8
Genny George 12
(5 row(s) affected)
5.4.5. Matching Substrings Using LIKE
Often we want to use part of a string as a condition in a query. For example,
consider the Section table (from our Student_course database), which has the
following data:
SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG ROOM
---------- ---------- -------- ---- ---------- ------ ------
85 MATH2410 FALL 98 KING 36 123
86 MATH5501 FALL 98 EMERSON 36 123
87 ENGL3401 FALL 98 HILLARY 13 101
.
.
.
We might want to know something about Math coursescourses with the prefix
MATH. In this situation, we need an operator that can determine whether a substring exists
in an attribute. Although we have seen how to handle this type of question with both the
SUBSTRING and CHARINDEX functions, another common way to handle this situation in
a WHERE clause is by using the LIKE function.
Using LIKE as an "existence" match entails finding whether a character
string exists in a string or valueif the string exists, the row is SELECTed for inclusion
in the result set. Again of course, we could use SUBSTRING and/or CHARINDEX for
this, but LIKE is a powerful, common and flexible alternative. This existence-type of the
LIKE query is useful when the position of the character string sought may be in various
places in the substring. SQL Server uses the wildcard character, %, at the beginning
or end of a LIKE-string, when looking for the existence of substrings. For example, suppose
we want to find all names that have "Smith" in our Student table, type the following:
SELECT *
FROM Student
WHERE sname = 'SMITH'
which produces the following output:
STNO SNAME MAJOR CLASS BDATE
----- ----------- ------ ----- -------------------------------
88 Smith NULL NULL 10/15/1979 12:00:00 AM
(1 row(s) affected)
Note that the case (upper or lower) in the statement WHERE sname = 'SMITH'
does not matter, because SQL Server is handled as if it is all uppercase (this is by default,
and can be changed), although it is displayed in mixed case (and even if it had been entered in
mixed case). In other words, we can say that data in SQL Server is not case-sensitive by default.
To count how many people have a name of "Smith," type:
SELECT COUNT(*) AS Count
FROM Student
WHERE sname = 'Smith'
which produces:
Count
-----------
1
(1 row(s) affected)
5.4.5.1. Using the wildcard character with LIKE
The percentage sign (%) is SQL Server's wildcard character. For example, if
we wanted to find all the names that had some form of "Smith" in their names from the Student table,
we would use % on both ends of "Smith," as shown here:
SELECT *
FROM Student
WHERE sname LIKE '%Smith%'
This query produces the following output, showing any "Smith" pattern in
sname:
STNO SNAME MAJOR CLASS BDATE
------ -------------------- ----- ------ -----------------------
88 Smith NULL NULL 1979-10-15 00:00:00
147 Smithly ENGL 2 1980-05-13 00:00:00
151 Losmith CHEM 3 1981-01-15 00:00:00
(3 row(s) affected)
To find any pattern starting with "Smith" from the Student table, you would
type:
SELECT *
FROM Student
WHERE sname LIKE 'Smith%'
This query would produce:
STNO SNAME MAJOR CLASS BDATE
------ -------------------- ----- ------ -----------------------
88 Smith NULL NULL 1979-10-15 00:00:00
147 Smithly ENGL 2 1980-05-13 00:00:00
(2 row(s) affected)
 |
By default, it is not necessary to use UPPER or
LOWER before sname in the previous query since data in SQL
Server 2005 is not case sensitive. You can change this however, by changing
SQL Server's database configurations.
|
|
To find the Math courses (any course_num starting with MATH) from
the Section table, you could pose a wildcard match with a LIKE as follows:
SELECT *
FROM Section
WHERE course_num LIKE 'MATH%'
This query would produce the following output:
SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG ROOM
---------- ---------- -------- ---- ---------- ----- -----
85 MATH2410 FALL 98 KING 36 123
86 MATH5501 FALL 98 EMERSON 36 123
107 MATH2333 SPRING 00 CHANG 36 123
109 MATH5501 FALL 99 CHANG 36 123
112 MATH2410 FALL 99 CHANG 36 123
158 MATH2410 SPRING 98 NULL 36 123
(6 row(s) affected)
5.4.5.2. Finding a range of characters
SQL Server allows some POSIX-compliant regular expression patterns in
LIKE clauses. We will illustrate some of these extensions for pattern matching.
LIKE can be used to find a range of characters. For example, to find all
grades between C and F in the Grade_report table, type:
SELECT DISTINCT student_number, grade
FROM Grade_report
WHERE grade LIKE '[c-f]'
AND student_number > 100
This query produces 15 rows of output:
student_number grade
-------------- -----
125 C
126 C
127 C
128 F
130 C
131 C
145 F
147 C
148 C
151 C
153 C
158 C
160 C
161 C
163 C
(15 row(s) affected)
 |
By default, note that LIKE is also case-insensitive.
You can change this, however, by changing SQL Server's database
configurations.
|
|
To find all grades from the Grade_report table that are not between C and F, we use a caret (^) before the range we
do not want to find:
SELECT DISTINCT student_number, grade
FROM Grade_report
WHERE grade LIKE '[^c-f]'
AND student_number > 100
This query produces the following 21 rows of output:
student_number grade
-------------- -----
121 B
122 B
123 A
123 B
125 A
125 B
126 A
126 B
127 A
127 B
129 A
129 B
132 B
142 A
143 B
144 B
146 B
147 B
148 B
155 B
157 B
(21 row(s) affected)
As another example, to find all the courses from the Section table that
start with "C," but do not have "h" as the second character, we could type:
SELECT *
FROM Section
WHERE course_num LIKE 'C[^h]%'
This query would give the following 10 rows of output:
SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG ROOM
---------- ---------- -------- ---- ---------- ----- -----
90 COSC3380 SPRING 99 HARDESTY 79 179
91 COSC3701 FALL 98 NULL 79 179
92 COSC1310 FALL 98 ANDERSON 79 179
93 COSC1310 SPRING 99 RAFAELT 79 179
96 COSC2025 FALL 98 RAFAELT 79 179
98 COSC3380 FALL 99 HARDESTY 79 179
102 COSC3320 SPRING 99 KNUTH 79 179
119 COSC1310 FALL 99 ANDERSON 79 179
135 COSC3380 FALL 99 STONE 79 179
145 COSC1310 SPRING 99 JONES 79 179
(10 row(s) affected)
5.4.5.3. Finding a particular character
To find a particular character using LIKE, we would place the character in
square brackets []. For example, to find all the names from the Student table that
begin with a B or G and end in "ill," we could type:
SELECT sname
FROM Student
WHERE sname LIKE '[BG]ill'
We would get:
sname
--------------------
Bill
(1 row(s) affected)
5.4.5.4. Finding a single character or single digitthe
underscore wildcard character
A single character or digit can be found in a particular position in a string by
using an underscore, _, for the wildcard in that position in the string. For example, to
find all students with student_numbers in the 130s (130...139) range from the
Student table, type:
SELECT DISTINCT student_number, grade
FROM Grade_report
WHERE student_number LIKE '13_'
This query would produce the following:
student_number grade
-------------- -----
130 C
131 C
132 B
(3 row(s) affected)
5.4.5.5. Using NOT LIKE
In SQL Server , the LIKE operator can be negated with the NOT.
For example, to get a listing of the non math courses and the courses that do not start in "C" from
the Section table, we would type:
SELECT *
FROM Section
WHERE course_num NOT LIKE 'MATH%'
AND Course_num NOT LIKE 'C%'
This query would give the following 14 rows of output:
SECTION_ID COURSE_NUM SEMESTER YEAR INSTRUCTOR BLDG ROOM
---------- ---------- -------- ---- ---------- ------ ------
87 ENGL3401 FALL 98 HILLARY 13 101
88 ENGL3520 FALL 99 HILLARY 13 101
89 ENGL3520 SPRING 99 HILLARY 13 101
94 ACCT3464 FALL 98 RODRIGUEZ 74 NULL
95 ACCT2220 SPRING 99 RODRIQUEZ 74 NULL
97 ACCT3333 FALL 99 RODRIQUEZ 74 NULL
99 ENGL3401 FALL 99 HILLARY 13 101
100 POLY1201 FALL 99 SCHMIDT NULL NULL
101 POLY2103 SPRING 00 SCHMIDT NULL NULL
104 POLY4103 SPRING 00 SCHMIDT NULL NULL
126 ENGL1010 FALL 98 HERMANO 13 101
127 ENGL1011 SPRING 99 HERMANO 13 101
133 ENGL1010 FALL 99 HERMANO 13 101
134 ENGL1011 SPRING 00 HERMANO 13 101
(14 row(s) affected)
|