a multitude of uses for your Transact-SQL programming, allowing for string cleanup, conversion
between ASCII and regular characters, pattern searches, removing trailing blanks, and much more.
lists the different string functions available in SQL Server 2005.
String Functions
Function Name | Description |
---|---|
ASCII and CHAR | The ASCII function takes the leftmost character of a character expression and returns the ASCII code. The CHAR function converts an integer value for an ASCII code to a character value instead. |
CHARINDEX and PATINDEX | The CHARINDEX function is used to return the starting position of a string within another string. The PATINDEX function is similar to CHARINDEX, except that PATINDEX allows the use of wildcards when specifying the string for which to search. |
DIFFERENCE and SOUNDEX | The two functions DIFFERENCE and SOUNDEX both work with character strings to evaluate those that sound similar. SOUNDEX assigns a string a four-digit code, and DIFFERENCE evaluates the level of similarity between the SOUNDEX outputs for two separate strings. |
LEFT and RIGHT | The LEFT function returns a part of a character string, beginning at the specified number of characters from the left. The RIGHT function is like the LEFT function, only it returns a part of a character string beginning at the specified number of characters from the right. |
LEN and DATALENGTH | The LEN function returns the number of characters in a string expression, excluding any blanks after the last character (trailing blanks). DATALENGTH, on the other hand, returns the number of bytes used for an expression. |
LOWER and UPPER | The LOWER function returns a character expression in lowercase and the UPPER function returns a character expression in uppercase. |
LTRIM and RTRIM | The LTRIM function removes leading blanks and the RTRIM function removes trailing blanks. |
NCHAR and UNICODE | The UNICODE function returns the Unicode integer value for the first character of the character or input expression. The NCHAR function takes an integer value designating a Unicode character and converts it to its character equivalent. |
QUOTENAME | The QUOTENAME function adds delimiters to a Unicode input string in order to make it a valid delimited identifier. |
REPLACE | The REPLACE function replaces all instances of a provided string within a specified string, and replaces it with a new string. |
REPLICATE | The REPLICATE function repeats a given character expression a designated number of times. |
REVERSE | The REVERSE function takes a character expression and outputs the expression with each character position displayed in reverse order. |
SPACE | The SPACE function returns a string of repeated blank spaces, based on the integer you designate for the input parameter. |
STR | The STR function converts numeric data into character data. |
STUFF | The STUFF function deletes a specified length of characters and inserts a designated string at the specified starting point. |
SUBSTRING | The SUBSTRING function returns a defined chunk of a specified expression. |
Converting a Character Value to ASCII and Back to Character
The ASCII function takes the leftmost character of a character expression and returns the ASCII code, while the CHAR function converts an integer value for an ASCII code to a character value instead. Again, it should be stressed that ASCII only uses the first character of the string. If the string is empty or NULL, ASCII will return a NULL value (although a blank value returns 32). This first example demonstrates how to convert characters into the integer ASCII value:
SELECT ASCII('H'), ASCII('e'), ASCII('l'), ASCII('l'), ASCII('o')
This returns:
------------------------
72 101 108 108 111
Next, the CHAR function is used to convert the integer values back into characters again:
SELECT CHAR(72), CHAR(101), CHAR(108), CHAR(108), CHAR(111)
This returns:
-------------------
H e l l o
How It Works ?
In this recipe, the word “Hello” was deconstructed one character at a time and then converted into the numeric ASCII value, using the ASCII function. In the second T-SQL statement, the ASCII value was reversed back into character form using the CHAR function.
Returning Integer and Character Unicode Values
The UNICODE function returns the Unicode integer value for the first character of the character or input expression. The NCHAR function takes an integer value designating a Unicode character and converts it to its character equivalent. These functions are useful if you need to exchange data with external processes using the Unicode standard.
This first example converts single characters into an integer value representing the Unicode
standard character code:
SELECT UNICODE('G'), UNICODE('o'), UNICODE('o'), UNICODE('d'), UNICODE('!')
This returns:
------------------------
71 111 111 100 33
Next, the Unicode integer values are converted back into characters:
SELECT NCHAR(71), NCHAR(111), NCHAR(111), NCHAR(100), NCHAR(33)
This returns:
----------------------
G o o d !
How It Works ?
In this recipe, the word “Good!” was deconstructed one character at a time and then converted into an integer value using the UNICODE function. In the second T-SQL statement, the integer value was reversed back into character form using the NCHAR function.
Finding the Start Position of a String Within Another String
The CHARINDEX function is used to return the starting position of a string within another string. The
syntax is as follows:
CHARINDEX ( expression1 ,expression2 [ , start_location ] )
The expression1 argument is the string to be searched for. The expresssion2 argument is the
string in which you are searching. The optional start_location value indicates the character position
where you wish to begin looking.
This example demonstrates how to find the starting position of a string within another string:
SELECT CHARINDEX('String to Find', 'This is the bigger string to find something in.')
This returns:
-------------
20
How It Works ?
This function returned the starting character position, in this case the 20th character, where the first argument expression was found in the second expression. You can’t use wildcards with this function. Also, note that search matches are based on the rules of your SQL Server instance’s collation.
Finding the Start Position of a String Within Another String ?
Using Wildcards
The PATINDEX function is similar to CHARINDEX, except that PATINDEX allows the use of wildcards in
the string you are searching for. The syntax for PATINDEX is:
PATINDEX ( '%pattern%' ,expression )
PATINDEX returns the start position of the first occurrence of the search pattern, but unlike
CHARINDEX, it doesn’t have a starting position option.
In this example, rows are returned from Person.Address where AddressLine1 contains the word
fragment “olive”:
SELECT AddressID,
AddressLine1
FROM Person.Address
WHERE PATINDEX('%olive%', AddressLine1) > 0
This returns the following abridged results:
AddressID | AddressLine1
-------------------------------
29048 1201 | Olive Hill
11768 1201 | Olive Hill
15417 1206 | Olive St
24480 1480 | Oliveria Road
...
How It Works ?
This example returned any row where the AddressLine1 column contained the word “Olive.” With the wild card % on both the left and right of the word (without spaces between), the word “Olive” could also have been embedded within another word. The pattern can use different wildcard characters too. For a full review of wildcards, see Chapter 1 and the recipe “Using Wildcards with LIKE.”
Determining the Similarity of Strings ?
The two functions, DIFFERENCE and SOUNDEX, both work with character strings in order to evaluate
those that sound similar, based on English phonetic rules. SOUNDEX assigns a string a four-digit code,
and then DIFFERENCE evaluates the level of similarity between the SOUNDEX outputs for two separate
strings.
DIFFERENCE returns a value of zero to four, with four indicating the closest match in similarity.
This example demonstrates how to identify strings that sound similar—first by evaluating
strings individually, and then comparing them in pairs:
SELECT SOUNDEX('Fleas'),
SOUNDEX('Fleece'),
SOUNDEX('Peace'),
SOUNDEX('Peas')
This returns:
------------------------
F420 F420 P200 P200
Next, string pairs are compared using DIFFERENCE:
SELECT DIFFERENCE ( 'Fleas', 'Fleece')
This returns:
-----------------
4
Next, another string pair is compared:
SELECT DIFFERENCE ( 'Fleece', 'Peace')
This returns:
-----------
2
How It Works ?
In the first example, SOUNDEX was used to evaluate four similar sounding words. The query results
showed four codes, with “Fleas” and “Fleece” equal to F420, and “Peace” and “Peas” equal to P200.
In the second example, DIFFERENCE was used to evaluate “Fleas” and “Fleece” and “Fleece” and
“Peace.” “Fleas” and “Fleece” were shown to be more similar with a value of 4 than “Fleece” and
“Peace” which had a comparison value of 2.
Taking the Leftmost or Rightmost Part of a String
The LEFT function returns a part of a character string, beginning at the specified number of characters
from the left. The RIGHT function is like the LEFT function, only it returns a part of a character string
beginning at the specified number of characters from the right.
This recipe demonstrates how to return a subset of the leftmost and rightmost parts of a string.
Also, a common string padding trick is demonstrated using these functions.
In the first example, the leftmost 10 characters are taken from a string:
SELECT LEFT('I only want the leftmost 10 characters.', 10)
This returns:
-------------
I only wan
Next, the rightmost characters of a string:
SELECT RIGHT('I only want the rightmost 10 characters.', 10)
This returns:
---------------
haracters.
This next example demonstrates zero-padding the ListPrice column’s value:
-- Padding a number for business purposes
SELECT TOP 3
ProductID, RIGHT('0000000000' + CONVERT(varchar(10), ListPrice),10)
FROM Production.Product
WHERE ListPrice > 0
This returns:
---------------------------------
ProductID | (No column name)
--------------------------------
514 | 0000133.34
515 | 0000147.14
516 | 0000196.92
How It Works ?
This recipe demonstrated three examples of using LEFT and RIGHT. The first two examples demonstrated
returning the leftmost or the rightmost characters of a string value. The third example demonstrated the padding of a string in order to conform to some expected business format. When presenting data to end-users or exporting data to external systems, you may sometimes need to preserve or add leading values, such as leading zeros to fixed length numbers. ListPrice was zero-padded by first concatenating ten zeros in a string to the converted varchar(10) value of the ListPrice. Then, outside of this concatenation, RIGHT was used to grab the last 10 characters
of the concatenated string (thus taking leading zeros from the left side with it, when the ListPrice
fell short of ten digits):
RIGHT('0000000000' + CONVERT(varchar(10), ListPrice),10)
Determining the Number of Characters or Bytes in a String
The LEN function returns the number of characters in a string expression, excluding any blanks after
the last character (trailing blanks). DATALENGTH, on the other hand, returns the number of bytes used
for an expression. In this recipe, I’ll demonstrate how to measure the number of characters and
bytes in a string.
This first example returns the number of characters in the string:
SELECT LEN(N'She sells sea shells by the sea shore.')
This returns:
-------
38
This next example returns the number of bytes in the string.
SELECT DATALENGTH(N'She sells sea shells by the sea shore.')
This returns:
------------
76
How It Works ?
This recipe used a Unicode string, which is defined by prefixing the string with an N as follows:
N'She sells sea shells by the sea shore.' The number of characters for this string is 38 according to LEN, but since it is a Unicode string, DATALENGTH returns 76 bytes. Unicode data takes two bytes for each character, whereas non-Unicode takes only one.
Replacing a Part of a String with Another String
The REPLACE function replaces all instances of a provided string within a specified string, and replaces it with a new string. One real strength of REPLACE is, unlike PATINDEX and CHARINDEX which return a specific location where a pattern is found, REPLACE can find multiple instances of a pattern within a specific character string.
The syntax for REPLACE is:
REPLACE ( 'string_expression1' , 'string_expression2' , 'string_expression3' )
The first string expression argument is the string that will be modified. The second string expression is the string to be removed from the first string argument. The third string expression is the string to insert into the first argument.
This example demonstrates how to replace all instances of a provided string with a new string:
SELECT REPLACE('Zenon is our major profit center. Zenon leads the way.','Zenon','Xerxes')
This returns:
---------------------------------------------------------
Xerxes is our major profit center. Xerxes leads the way.
How It Works ?
In this recipe, the first string expression was the string to be searched, “Zenon is our major profit center. Zenon leads the way.” The second expression was the expression to replace (“Zenon”), and the third expression was the value to substitute “Zenon” with “Xerxes.”
Stuffing a String into a String
The STUFF function deletes a specified length of characters and inserts a designated string at the
specified starting point. The syntax is:
STUFF ( character_expression, start, length, character_expression )
The first argument of this function is the character expression to be modified. The second
argument is the starting position of the inserted string. The length is the number of characters to
delete within the character expression. The fourth argument is the actual character expression that
you want to insert.
This example replaces a part of a string and inserts a new expression into the string body:
SELECT STUFF ( 'My cat's name is X. Have you met him?',18,1,'Edgar' )
This returns:
----------------------------------------------------
My cat's name is Edgar. Have you met him?
How It Works
The character expression in this recipe was “My cat’s name is X’ING Have you met him?”. The start value was 18, which means that the replacement will occur at the 18th position within the string (which is X, in this case). The length value was 1, meaning only one character at position 18 would be deleted. The last character expression was Edgar which is the value to stuff into the string.
Changing Character Values to Lower,Upper, and Proper Case
The LOWER function returns a character expression in lowercase and the UPPER function returns a character expression in uppercase. There isn’t a built-in proper case function, so a user-defined function will be demonstrated in this recipe instead.
Before showing the different functions in action, the following query I’ve presented will show
the value of DocumentSummary for a specific row in the Production.Document table:
SELECT DocumentSummary
FROM Production.Document
WHERE DocumentID = 4
This returns the following sentence case value:
----------------------------------------------------------------------------------------------
Detailed instructions for replacing pedals with Adventure Works Cycles replacement
pedals. Instructions are applicable to all Adventure Works Cycles bicycle models and
replacement pedals. Use only Adventure Works Cycles parts when replacing worn or broken
components.
This first example demonstrates setting values to lowercase:
SELECT LOWER(DocumentSummary)
FROM Production.Document
WHERE DocumentID = 4
This returns:
---------------------------------------------------------------------------------------
detailed instructions for replacing pedals with adventure works cycles replacement
pedals. instructions are applicable to all adventure works cycles bicycle models and
replacement pedals. use only adventure works cycles parts when replacing worn or broken
components.
Now for uppercase:
SELECT UPPER(DocumentSummary)
FROM Production.Document
WHERE DocumentID = 4
This returns:
----------------------------------------------------------------------------------------------------------
DETAILED INSTRUCTIONS FOR REPLACING PEDALS WITH ADVENTURE WORKS CYCLES REPLACEMENT PEDALS. INSTRUCTIONS ARE APPLICABLE TO ALL ADVENTURE WORKS CYCLES BICYCLE MODELS AND REPLACEMENT PEDALS. USE ONLY ADVENTURE WORKS CYCLES PARTS WHEN REPLACING WORN OR BROKEN COMPONENTS.
In order to set a string to proper case (capitalizing the letter of each word in the string), a userdefined
function can be used
CREATE FUNCTION udf_ProperCase(@UnCased varchar(max))
RETURNS varchar(max)
AS
BEGIN
SET @UnCased = LOWER(@UnCased)
DECLARE @C int
SET @C = ASCII('a')
WHILE @C <= ASCII('z')
BEGIN
SET @UnCased = REPLACE( @UnCased, ' ' + CHAR(@C), ' ' + CHAR(@C-32))
SET @C = @C + 1
END
SET @UnCased = CHAR(ASCII(LEFT(@UnCased, 1))-32) + RIGHT(@UnCased,
LEN(@UnCased)-1)
RETURN @UnCased
END
GO
Once the user-defined function is created, the string to modify to proper case can be used as
the function parameter:
SELECT dbo.udf_ProperCase(DocumentSummary)
FROM Production.Document
WHERE DocumentID = 4
This returns:
------------------------------------------------------------------------------------------------------
Detailed Instructions For Replacing Pedals With Adventure Works Cycles Replacement
Pedals. Instructions Are Applicable To All Adventure Works Cycles Bicycle Models And
Replacement Pedals. Use Only Adventure Works Cycles Parts When Replacing Worn Or Broken
Components.
How It Works ?
The first example demonstrated the LOWER function, which returned a character expression in lowercase.
The second example demonstrated the UPPER function, which returned a character expression
in uppercase.
There isn’t a built-in proper case function, so a user-defined function was created in this recipe
instead. The first line of the CREATE FUNCTION definition defines the name and parameter expected—in
this case a varchar(max) data type parameter:
CREATE FUNCTION udf_ProperCase(@UnCased varchar(max))
The RETURNS keyword defined what data type would be returned by the function after the logic
has been applied:
RETURNS varchar(max)
AS
BEGIN
Next, the variable passed to the function was first modified to lowercase using the LOWER function:
SET @UnCased = LOWER(@UnCased)
A new integer local variable @C was set to the ASCII value of the letter ‘a’:
DECLARE @C int
SET @C = ASCII('a')
A WHILE loop was initiated to go through every letter in the alphabet, and for each, search for
a space preceding that letter, and then replace each occurrence of a letter preceded by a space with
the uppercase version of the character:
WHILE @C <= ASCII('z')
BEGIN
SET @UnCased = REPLACE( @UnCased, ' ' + CHAR(@C), ' ' + CHAR(@C-32))
SET @C = @C + 1
END
The conversion to uppercase is performed by subtracting 32 from the ASCII integer value of the
lowercase character. For example, the ASCII value for a lowercase “a” is 97, while the uppercase A is 65.
SET @UnCased = CHAR(ASCII(LEFT(@UnCased, 1))-32) + RIGHT(@UnCased, LEN(@UnCased)-1)
The final proper case string value of @UnCased is then returned from the function:
RETURN @UnCased
END
GO
Removing Leading and Trailing Blanks
The LTRIM function removes leading blanks and the RTRIM function removes trailing blanks. This first example demonstrates removing leading blanks from a string:
SELECT LTRIM(' String with leading blanks.')
This returns:
---------------------------------------------------
String with leading blanks.
This second example demonstrates removing trailing blanks from a string:
SELECT RTRIM('"' + 'String with trailing blanks ') + '"'
This returns:
--------------------------------------
"String with trailing blanks"
How It Works ?
Both LTRIM and RTRIM take a single argument—a character expression that trims the leading or trailing
blanks. Note that there isn’t a TRIM function (as seen in other programming languages) that can
be used to remove both leading and trailing characters. To do this, you must use both LTRIM and
RTRIM in the same expression.
Repeating an Expression N Number of Times
The REPLICATE function repeats a given character expression a designated number of times.
The syntax is:
REPLICATE ( character_expression ,integer_expression )
The first argument is the character expression to be repeated. The second argument is the integer
value of the number of times the character expression is to be repeated.
This example demonstrates how to use the REPLICATE function to repeat a character expression
a set number of times:
SELECT REPLICATE ('Z', 30)
This returns:
----------------------------------------------------
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
How It Works ?
In this recipe’s example, the letter Z in the character expression was repeated 30 times. Use REPLICATE
to repeat values rather than having to code the characters manually. The maximum return value is 8,000 bytes.
Repeating a Blank Space N Number of Times
The SPACE function returns a string of repeated blank spaces, based on the integer you designate for the input parameter.
This example demonstrates how to repeat a blank space a defined number of times:
SELECT 'Give me some' + SPACE(6) + 'space.'
This returns:
-------------------------------------------
Give me some space.
How It Works ?
In this recipe, six blank spaces were concatenated in the middle of two strings. The maximum return value is 8,000 bytes.
Outputting an Expression in Reverse Order
The REVERSE function takes a character expression, and outputs the expression with each character position displayed in reverse order. This example demonstrates how to reverse a string expression:
SELECT TOP 1
GroupName,
REVERSE(GroupName) GroupNameReversed
FROM HumanResources.Department
ORDER BY GroupName
This returns:
------------------------------------------
GroupName GroupNameReversed
Executive General and Administration noitartsinimdA dna lareneG evitucexE
How It Works ?
This recipe demonstrated using the REVERSE function to output a string’s characters in reverse order.
Returning a Chunk of an Expression
The SUBSTRING function returns a defined chunk of a specified expression.
The syntax is as follows:
SUBSTRING ( expression, start, length )
The first argument of this function is the character expression that you should use to return
a defined chunk. The second argument defines the character starting position of the chunk. The
third argument is the length of the character chunk that you want to extract.
In this example, assume your application receives a bank account number from a customer. It
is your company’s policy to only store a masked representation of the bank number, retaining the
middle four digits only:
DECLARE @BankAccountNumber char(14)
SET @BankAccountNumber = '1424-2342-3536'
SELECT 'XXXX-' + SUBSTRING(@BankAccountNumber, 6,4) + '-XXXX' Masked_BankAccountNumber
This returns:
----------------------------------------
Masked_BankAccountNumber
XXXX-2342-XXXX
How It Works ?
In this recipe, the SUBSTRING function was used to get the middle four digits from a longer bank
account number. The expression in the SUBSTRING function call was the bank account number, followed
by the starting position, and then the number of characters to extract.
No comments:
Post a Comment