As part of an ongoing training session I am doing for some of my colleagues on the basics of TSQL, I wanted to store this content somewhere. Where better than on my blog.
So today we will be starting with looking at some basic string manipulation functions, how they work and why you might want to use them. Then we will look at how we can use them together to perform a common dynamic name separation technique. All of these examples are using a table which is constructed using the Microsoft AdventureWorks database (this version is using 2012). To build this table you will need to download a version of AdventureWorks and attach the database (alternatively you could use the RedGate Azure AdventureWorks – You can find that here).
Before we continue please run the following TSQL to build the table. There are many intentional errors in the data – I have hoped to simulate normal user activity and errors.
USE AdventureWorks2012; GO BEGIN TRY DROP TABLE StringMinipulationExample; END TRY BEGIN CATCH PRINT 'Table didnt exist' END CATCH GO CREATE TABLE StringMinipulationExample ( FullName Varchar(250) , Address VARCHAR(1000) ); INSERT INTO StringMinipulationExample SELECT ' ' + COALESCE(Title, '') + ' ' + FirstName + ' ' + LastName + ' ' AS 'FullName' , COALESCE(AddressLine1, '') + ' ' + COALESCE(AddressLine2, '') + ' ' + COALESCE(City, '') + ' ' + COALESCE(PostalCode, '') AS 'Address' FROM Person.BusinessEntityAddress INNER JOIN Person.Person ON Person.BusinessEntityID = BusinessEntityAddress.BusinessEntityID INNER JOIN Person.Address ON Address.AddressID = BusinessEntityAddress.AddressID SELECT FullName, Address FROM dbo.StringMinipulationExample
When you’re working with strings you will no doubt want to join one sting with another, this is known as concatenation. The process is a simple one. Take one string and combine it with another.
SELECT 'Hello' + ' ' + 'world!'
We are using a + to concatenate. You will see in the middle the ‘ ‘ this is to add a space between the two words. Sounds simple and it is, however it can quickly become confusing when you start to concatenate with NULLS or different data types. NULLS in SQL Server can cause some grief but they have a purpose. When you concatenate two strings you get a new string – Hello World! However when you concatenate a string and a NULL you get a NULL. to avoid this we need to handle those NULLS and replace them with something else. Remember ” <> NULL. An empty space will still concatenate as it is not NULL. To handle those NULLS we can use conditional logic (CASE WHEN ELSE END) or we can use ISNULL() or COALESCE(). Concatenation needs to be to the same data type. For example when you write 1+1 do you want 2 or 11? 1 + 1 = 2, ‘1’ + ‘1’ = ’11’. INT + INT = INT, CHAR + CHAR = CHARCHAR, INT + CHAR = ERROR. To avoid this error we need to first convert the INT to a CHAR using either CAST or CONVERT.
ISNULL & COALESCE
ISNULL ( check_expression , replacement_value )
COALESCE ( check_expression , replacement_value )
Both of these functions are NULL replacement functions. There are differences in how the engine processes and also on ISNULL is TSQL only and not ANSI standard.
SELECT ISNULL(NULL, '') SELECT COALESCE(NULL, '')
CAST ( expression AS data_type [ ( length ) ] )
CONVERT ( data_type [ ( length ) ] , expression [ , style ] )
Both of these functions help to change the data type in your query. A lot comes down to personal preference.
SELECT CAST( 1 AS CHAR(1)) SELECT CONVERT(CHAR(1), 1)
Both will return a char. You cannot convert all datatypes from one type to another. Refer to this image for reference.
LEFT & RIGHT
LEFT ( character_expression , integer_expression )
RIGHT ( character_expression , integer_expression )
When you want to take the left most character from a string. Character_expression is your string and integer_expression is how many characters do you want. This include blank spaces.
SELECT FullName, LEFT(FullName, 5) AS 'Left5_FullName' FROM dbo.StringMinipulationExample SELECT FullName, RIGHT(FullName, 5) AS 'Left5_FullName' FROM dbo.StringMinipulationExample
You can see that the left or right 5 characters have been selected. It might look odd as there are less then 5 characters visible on most rows and this is because there is preceeding and trailing spaces – keep reading to learn how to handle those.
SUBSTRING ( expression ,start , length )
When you want the middle x amount of characters from a string. In the example below I have used the programmers favourite Hello World!. Hello world! Has 12 characters including the space and the exclamation mark. If we want to take just the world we need to position our start on the w and take 5 spaces. The grey squares in the table below indicate which characters we are selecting. For those of you who are familiar with Excel this function is similar to the MID function.
SELECT 'Hello World!', SUBSTRING('Hello World', 7, 5)
UPPER & LOWER
UPPER ( character_expression )
LOWER ( character_expression )
When you want to cast a character as either uppercase or lower case. This function is similar to Ucase and Lcase in Excel and VB.
SELECT FullName, UPPER(FullName) AS 'Upper_FullName', LOWER(FullName) AS 'Lower_FullName' FROM dbo.StringMinipulationExample
CHARINDEX ( expressionToFind ,expressionToSearch [ , start_location ] )
When you want to know the position of a character in a string of characters. This is a really helpful function. If you think most names are FirstName SPACE LastName if you want to split the first and last name you can locate the space and take the LEFT and the SUBSTRING. The way this function is structured is a little backwards. We are looking at what we are trying to find, where we are trying to find it and how many character should be skipped before we search. The reason the latter is there is in the event that there are multiple spaces, if you know the first 2 characters are always a space you might want to add ,2 to start from the 3rd character.
SELECT FullName, CHARINDEX(' ', FullName, 3) FROM dbo.StringMinipulationExample
I have included a ,3 to the end of the function as I know I have 2 spaces at the start. I want to skip these. We will use a function to remove these in a moment. This function is similar to FIND() in Excel.
REPLACE ( string_expression , string_pattern , string_replacement )
When you want to replace a character or set of characters with a different character or set of characters. Useful when you need to find a character and replace it.
SELECT FullName, REPLACE(FullName, ' ', '_') FROM dbo.StringMinipulationExample
In this example I am replacing all the spaces with underscores.
REVERSE ( string_expression )
Does what is says on the tin! Reverses a string.
You might be asking well why would I want to do that? Because it is easy to find the first special character in a string, it is not as easy to find the last. How do you make the last the first? REVERSE it.
REVERSE(‘Hello World’) = ‘dlroW olleH’
RTIM & LTRIM
RTRIM ( character_expression )
LTRIM ( character_expression )
When you want to remove the trailing spaces from the left, right or both sides of a string.
SELECT FullName , LTRIM(FullName) AS 'LeftTrim' , RTRIM(FullName) AS 'RightTrim' , RTRIM(LTRIM(FullName)) AS 'BothTrim' FROM dbo.StringMinipulationExample
Combining the RTRIM and LTRIM allows you to remove chars from both sides.
Bringing it all together
So we have looked at a lot of the string manipulation function in SQL Server but to get the most out of these functions we need to using them together.
Lets look at how we can separate the Title, FirstName, LastName.
; WITH CleanedStrings AS ( SELECT RTRIM(LTRIM(FullName)) AS 'FullName' , FullName AS 'UncleanFullName' , Address FROM dbo.StringMinipulationExample ) SELECT CleanedStrings.UncleanFullName , CleanedStrings.FullName , REPLACE(LEFT(FullName, CHARINDEX('.', FullName)), '.', '') AS 'Title' , LEFT(LTRIM(REPLACE(CleanedStrings.FullName, LEFT(FullName, CHARINDEX('.', FullName)), '')), CHARINDEX(' ', LTRIM(REPLACE(CleanedStrings.FullName, LEFT(FullName, CHARINDEX('.', FullName)), '')))) AS 'FirstName' , REVERSE(LEFT(REVERSE(CleanedStrings.FullName), CHARINDEX(' ', REVERSE(CleanedStrings.FullName)))) AS 'LastName' FROM CleanedStrings;
First we have a CTE to remove the left and right padded strings. Then the title can be found as it always is followed by a ‘.’.
First name was a little tricky. I had to remove the title then find the space and separate. I could have found the first space and the last and use SUBSTRING.
For the LastName I have reversed the string, found where the first space is in the reversed string, taken a left of that and the reversed the whole statement back to how it should have been.
Thank you for reading