Sun Modelling for Data Warehouses. Visio template


Sun modelling is a technique for requirement gathering for a multidimensional data warehouse modelling. I have used or read about many different DW modelling techniques, however none is quite as simple and elegant as Sun Modelling. Sun Modelling was designed by Prof Mark Whitehorn (Dundee University). I won’t go into the details of how you execute a sun model today, however I will write about this in the future.

For the moment I have tried to simplify the process of building a sun model in visio. You can download a visio stencil which contains the main elements required to make a Sun Model. Also included is a sample visio document.

Download it here.


Exeter’s SQL South West UG – Scripts and slides


Thanks for the massive turnout last night at SQL South West, you guys rule!

I presented a session called TSQL(L): Tricky SQL, Quandaries and Lessons Learnt.

Thanks again for attending. I had a lot of fun with this session. In the session we talked about a few websites:

Ask SQL Server Central:
Gaps & Islands:

See you all next month.

You can download the solution file Download scripts.


Tally/Number tables in SQL Server #TSQL2sday


SqlTuesday As part of #TSQL2sday Mickey Stuewe (b|t) has proposed a question looking at best practices when someone has made some bad design decisions, generally when I am facing bad design problems I am the one who put them there (:P). This tip is not necessarily a “how to fix an issue” but more a tip on how to work with dates.

As a BI developer I am always working with dates and numbers. A user will come and ask for a query relating to what happened last month – What they typically mean is what happened per day? How much did we sell on each day? How many customer visited our website each day? How many members of my team were absent in October?

When you’re working with HR data you commonly want to see where you have absences over a period of time. This might be a chart per day or per month. So this is what we want to do. We want to see each day for October how many people were absent on each day – and here we start to see a common issue. This is also an issue that might be neglected where you commonly have sales per day, when you have no sales on an edge-case day it might get missed, and you really don’t want to miss that!

Here is some sample data to illustrate the point. The sample data is my team and their absences. We have Timmy and Bobby both have had a bad month and have been ill on multiple occasions (man-flu), they have a few consecutive periods of illness and the odd one day. What I want to see is 31 rows (one for every day in October) returned showing any zeros where there were no absences. However our HR system only logs out a row for data logged and not a row for those where there has been nothing logged. So a query like this (which is what the end-user asked for) looks like this:

DECLARE @Example TABLE (ID INT IDENTITY, StaffMember VARCHAR(50), AbsenceDate DATE, AbsenceReason VARCHAR(50))
INSERT INTO @Example ( StaffMember, AbsenceDate, AbsenceReason )
('Bobby', '20151001', 'Sickness')
,('Bobby', '20151002', 'Sickness')
,('Bobby', '20151003', 'Sickness')
,('Bobby', '20151009', 'Sickness')
,('Bobby', '20151022', 'Sickness')
,('Bobby', '20151021', 'Sickness')
,('Bobby', '20151023', 'Sickness')
,('Bobby', '20151024', 'Sickness')
,('Sammy', '20151001', 'Sickness')
,('Sammy', '20151003', 'Sickness')
,('Sammy', '20151022', 'Sickness')
,('Sammy', '20151025', 'Sickness')

When you run the following:

SELECT AbsenceDate, COUNT(*) AS 'AbsenceCount' FROM @Example WHERE AbsenceDate >= '20151001' AND AbsenceDate < '20151101' GROUP BY AbsenceDate


But we are missing most of the days. If this data is to be used for a visualisation in Excel or SSRS then we want to see this per day, however we will only have pointers for those where the data has been logged, and that is an issue. But that isn’t at the fault of the developers, why would you write out more than you need? It doesn’t make sense. Ok we have an issue, but it doesn’t stop there! Our user now wants to know “what days were there no absences?”. How can we look for data that does not exist?

We obviously need something to help. We need a tally table. A tally table or commonly known as a numbers table is a simple table which holds numbers or dates. In a data warehouse you might have a date dimension – this is similar. A tally is a table that has a load of numbers in it. 1 to 100 million or however many you need, it is up to you. For this example we want one row in a table for every date. The script below will build you a table and insert as many rows as you want. For this example I have used 365 days from 01-01-2015 this will gives a row for every day in 2015. Alter the @j variable to extend this and alter the @startDate to move the start date.

After running this script you will have a full table and now we have something that we can use to start answering the questions we have been asked. So lets look at our questions again.

Question number one. Create a list of all absences in October (including where we have no absences)

; WITH TallyDates AS
(SELECT * FROM dbo.TallyTbl WHERE TallyDate &gt;= '20151001' AND TallyDate &lt; '20151101')
, AbsenceData AS 
(SELECT AbsenceDate, COUNT(*) AS 'AbsenceCount' FROM @Example GROUP BY AbsenceDate)
SELECT TallyDates.TallyDate, COALESCE(AbsenceData.AbsenceCount,0) AS 'AbsenceCount' FROM TallyDates
LEFT JOIN AbsenceData ON TallyDates.TallyDate = AbsenceData.AbsenceDate


Now we have a full list of all the days including those where we have had no absences. Ideal!

Question number two: In October what dates were no absences? An to make it a little harder, which days were working days – wondering why there was an TallyWeekend column? It was for this.

; WITH TallyDates AS
(SELECT TallyDate, TallyDateWeekend FROM dbo.TallyTbl WHERE TallyDate &gt;= '20151001' AND TallyDate &lt; '20151101')
SELECT TallyDates.TallyDate AS 'DateWithNoAbsence' FROM TallyDates WHERE TallyDates.TallyDate NOT IN (SELECT AbsenceDate FROM @Example)
AND TallyDates.TallyDateWeekend = 0


So something that seemed really difficult before is now really simple. So not a bad design solution but a way to manage dates when you need to look for missing dates.

If you do not want to make a tally table or you cannot alter the database then you can do this all in a different way. Itzik Ben-Gan developed a little script that can be used to generate millions or rows in a zero IO. You can read about this here:



SQL Server string manipulation functions


Hi everyone,

As part of an ongoing training session I am doing for some of my colleagues on the basics of TSQL, I wanted to store this content somewhere. Where better than on my blog.

So today we will be starting with looking at some basic string manipulation functions, how they work and why you might want to use them. Then we will look at how we can use them together to perform a common dynamic name separation technique. All of these examples are using a table which is constructed using the Microsoft AdventureWorks database (this version is using 2012). To build this table you will need to download a version of AdventureWorks and attach the database (alternatively you could use the RedGate Azure AdventureWorks – You can find that here).

Before we continue please run the following TSQL to build the table. There are many intentional errors in the data – I have hoped to simulate normal user activity and errors.

USE AdventureWorks2012;
DROP TABLE StringMinipulationExample;
	PRINT 'Table didnt exist'
CREATE TABLE StringMinipulationExample
FullName Varchar(250)
, Address VARCHAR(1000)
INSERT INTO StringMinipulationExample
	' ' + COALESCE(Title, '') + ' ' + FirstName + ' ' + LastName + '   ' AS 'FullName'
	, COALESCE(AddressLine1, '') + ' ' + COALESCE(AddressLine2, '') + ' ' + COALESCE(City, '') + ' ' + COALESCE(PostalCode, '') AS 'Address'
FROM Person.BusinessEntityAddress 
INNER JOIN Person.Person ON	Person.BusinessEntityID = BusinessEntityAddress.BusinessEntityID
INNER JOIN Person.Address ON Address.AddressID = BusinessEntityAddress.AddressID
SELECT FullName, Address FROM dbo.StringMinipulationExample


When you’re working with strings you will no doubt want to join one sting with another, this is known as concatenation. The process is a simple one. Take one string and combine it with another.

SELECT 'Hello' + ' ' + 'world!'

We are using a + to concatenate. You will see in the middle the ‘ ‘ this is to add a space between the two words. Sounds simple and it is, however it can quickly become confusing when you start to concatenate with NULLS or different data types. NULLS in SQL Server can cause some grief but they have a purpose. When you concatenate two strings you get a new string – Hello World! However when you concatenate a string and a NULL you get a NULL. to avoid this we need to handle those NULLS and replace them with something else. Remember ” <> NULL. An empty space will still concatenate as it is not NULL. To handle those NULLS we can use conditional logic (CASE WHEN ELSE END) or we can use ISNULL() or COALESCE(). Concatenation needs to be to the same data type. For example when you write 1+1 do you want 2 or 11? 1 + 1 = 2, ‘1’ + ‘1’ = ’11’. INT + INT = INT, CHAR + CHAR = CHARCHAR, INT + CHAR = ERROR. To avoid this error we need to first convert the INT to a CHAR using either CAST or CONVERT.


ISNULL ( check_expression , replacement_value )

COALESCE ( check_expression , replacement_value )

Both of these functions are NULL replacement functions. There are differences in how the engine processes and also on ISNULL is TSQL only and not ANSI standard.



CAST ( expression AS data_type [ ( length ) ] )

CONVERT ( data_type [ ( length ) ] , expression [ , style ] )

Both of these functions help to change the data type in your query. A lot comes down to personal preference.


Both will return a char. You cannot convert all datatypes from one type to another. Refer to this image for reference.


LEFT ( character_expression , integer_expression )

RIGHT ( character_expression , integer_expression )

When you want to take the left most character from a string. Character_expression is your string and integer_expression is how many characters do you want. This include blank spaces.

SELECT FullName, LEFT(FullName, 5) AS 'Left5_FullName' FROM dbo.StringMinipulationExample
SELECT FullName, RIGHT(FullName, 5) AS 'Left5_FullName' FROM dbo.StringMinipulationExample

You can see that the left or right 5 characters have been selected. It might look odd as there are less then 5 characters visible on most rows and this is because there is preceeding and trailing spaces – keep reading to learn how to handle those.


SUBSTRING ( expression ,start , length )

When you want the middle x amount of characters from a string. In the example below I have used the programmers favourite Hello World!. Hello world! Has 12 characters including the space and the exclamation mark. If we want to take just the world we need to position our start on the w and take 5 spaces. The grey squares in the table below indicate which characters we are selecting. For those of you who are familiar with Excel this function is similar to the MID function.

SELECT 'Hello World!', SUBSTRING('Hello World', 7, 5)


1 2 3 4 5 6 7 8 9 10 11 12
H e l l o W o r l d !



UPPER ( character_expression )

LOWER ( character_expression )

When you want to cast a character as either uppercase or lower case. This function is similar to Ucase and Lcase in Excel and VB.

SELECT FullName, UPPER(FullName) AS 'Upper_FullName', LOWER(FullName) AS 'Lower_FullName' FROM dbo.StringMinipulationExample


CHARINDEX ( expressionToFind ,expressionToSearch [ , start_location ] )

When you want to know the position of a character in a string of characters. This is a really helpful function. If you think most names are FirstName SPACE LastName if you want to split the first and last name you can locate the space and take the LEFT and the SUBSTRING. The way this function is structured is a little backwards. We are looking at what we are trying to find, where we are trying to find it and how many character should be skipped before we search. The reason the latter is there is in the event that there are multiple spaces, if you know the first 2 characters are always a space you might want to add ,2 to start from the 3rd character.

SELECT FullName, CHARINDEX(' ', FullName, 3) FROM dbo.StringMinipulationExample

I have included a ,3 to the end of the function as I know I have 2 spaces at the start. I want to skip these. We will use a function to remove these in a moment. This function is similar to FIND() in Excel.


REPLACE ( string_expression , string_pattern , string_replacement )

When you want to replace a character or set of characters with a different character or set of characters. Useful when you need to find a character and replace it.

SELECT FullName, REPLACE(FullName, ' ', '_') FROM dbo.StringMinipulationExample

In this example I am replacing all the spaces with underscores.


REVERSE ( string_expression )

Does what is says on the tin! Reverses a string.
You might be asking well why would I want to do that? Because it is easy to find the first special character in a string, it is not as easy to find the last. How do you make the last the first? REVERSE it.

REVERSE('Hello World')

REVERSE(‘Hello World’) = ‘dlroW olleH’


RTRIM ( character_expression )

LTRIM ( character_expression )

When you want to remove the trailing spaces from the left, right or both sides of a string.

	, LTRIM(FullName) AS 'LeftTrim'
	, RTRIM(FullName) AS 'RightTrim'
	, RTRIM(LTRIM(FullName)) AS 'BothTrim'
FROM dbo.StringMinipulationExample

Combining the RTRIM and LTRIM allows you to remove chars from both sides.

Bringing it all together

So we have looked at a lot of the string manipulation function in SQL Server but to get the most out of these functions we need to using them together.
Lets look at how we can separate the Title, FirstName, LastName.

; WITH CleanedStrings AS 
	RTRIM(LTRIM(FullName)) AS 'FullName'
	, FullName AS 'UncleanFullName'
	, Address
FROM dbo.StringMinipulationExample
	, CleanedStrings.FullName 
	, REPLACE(LEFT(FullName, CHARINDEX('.', FullName)), '.', '') AS 'Title'
	, LEFT(LTRIM(REPLACE(CleanedStrings.FullName, LEFT(FullName, CHARINDEX('.', FullName)), '')), CHARINDEX(' ', LTRIM(REPLACE(CleanedStrings.FullName, LEFT(FullName, CHARINDEX('.', FullName)), '')))) AS 'FirstName'
	, REVERSE(LEFT(REVERSE(CleanedStrings.FullName), CHARINDEX(' ', REVERSE(CleanedStrings.FullName)))) AS 'LastName'
FROM CleanedStrings;

First we have a CTE to remove the left and right padded strings. Then the title can be found as it always is followed by a ‘.’.
First name was a little tricky. I had to remove the title then find the space and separate. I could have found the first space and the last and use SUBSTRING.

For the LastName I have reversed the string, found where the first space is in the reversed string, taken a left of that and the reversed the whole statement back to how it should have been.

Thank you for reading

SSRSDB – Automated contact with users when reports are not used



Hi everyone,

This is the 4th part of the SSRSDB blog series. In this part we will be looking at how we can use data that we are collecting to automate contact to our users. Have you ever been asked to create a new dashboard that replace current reports, then asked if you can remove those replaced reports and been told “no, we still need those”. This happens a lot. The problem it causes is repetition of reporting, when there is repetition it can lead to multiple reports with slightly different business rules, this leads to confusion and misleading information.

How do we avoid this happening? One method is to analyse how frequency of the report use and compare it to a predefined limit of usage. In this section I want to look at this approach.

In the last section we looked at the additional metadata which holds additional data to support monitoring the report, this includes who owns the report and their email address. The metadata also contains the following additional columns:

Inactive Period – The amount of days that is allowed to elapse without usage. This needs to be set per report, the reason for this is that you will likely have some reports that are run daily and other that are only ran one every quarter.

Inactive Action – There are 2 outcomes at the stage:

NOTHING – Do Nothing
EMAIL – Send a friendly email to the user.

Final Period – The amount of days that can elapse for the final action.

Final Action – This is the action that takes place once the inactive period has elapsed. At current you’re limited to the following actions:

NOTHING – Do nothing
HIDE – Hide the report*
DELETE – Trigger the ReportServer to delete the report*

*Microsoft do not want you to do this, this is making a change to the ReportServer without using the ReportServer, this is a big no-no. However the process to hide and the process to delete is triggering the same stored procedures that the ReportServer GUI uses.

A typical set-up might be to set a report to email after 7 days, the report is then still available to users if they want to run the report, after a further 7 days the report is set to either hide or it is deleted. If you’re working in an environment that does not source control your reports then you can still get a version of the report back from dim_catalog in the event it is deleted in error – On a side note you should be source controlling your reports, use an SVN and source control solution.

With this data we can start looking at a process to automate contact with our users. Ok let’s look at the SQL to do this. What we have is a query to check if a user needs to be contacted and then we are going to loop over these with a cursor and send them a friendly email to say “Hey! You are not using the report I worked on!”.

To allow this to work you need to have configured database mail on your server. You can find a short guide on configuring database mail using this link – Once you have this configured then you can apply the following code to contact your users*.

*If you have not updated your metadata then this will not make contact.


Notify the user

DECLARE DeprecatedReportsCursor CURSOR FOR 
WITH LastExecution AS 
			, MAX([ExecutionStartDate]) maxdate
	, M.Item_Owner
	, CASE	WHEN maxdate IS NULL THEN 999
	END AS 'Days_Since_Last_Execution'
	,CASE WHEN (m.Item_Final_Period - DATEDIFF(DAY,MAXDATE,GETDATE())) < 0 THEN 0 ELSE m.Item_Final_Period - DATEDIFF(DAY,MAXDATE,GETDATE()) END AS 'Days_To_Remove'
	,M.Item_Owner_Email AS 'Email_Address'
	dbo.Dim_Catalog C
	LEFT JOIN LastExecution LE ON LE.[item_skey] = C.item_Skey
	LEFT JOIN Support.Item_Metadata M ON C.Item_BusinessKey = M.Item_BusinessKey
	c.SCD_Is_Current = 1
	AND c.Item_Type = 2	
	AND c.Item_Is_Hidden = 0
	AND M.Item_Type = 'Main'
	AND M.Item_In_Active_Action = 'EMAIL'
DECLARE @Date varchar(100)
DECLARE @MainBody varchar(MAX)
DECLARE @Link varchar(500)
DECLARE @ReportName VARCHAR(500)
DECLARE @ExecutionDate VARCHAR(500)
DECLARE @ReportRequestedBy VARCHAR(500)
DECLARE @InActivePeriod VARCHAR(5) 
DECLARE @EmailSubject VARCHAR(500)
DECLARE @OwnerEmailAddress VARCHAR(100)
OPEN DeprecatedReportsCursor
FETCH NEXT FROM DeprecatedReportsCursor INTO @ReportName, @ExecutionDate, @ReportRequestedBy, @DSE, @Link,@InActivePeriod,@DaysToRemove, @OwnerEmailAddress
SET @MainBody = '
	<title>Notification Email</title>
	<table width="800" border="0" cellpadding="0" cellspacing="0" bgcolor="#ffffff" style="font-size: 13">
			<tr bgcolor="#008a9f" width="734px"><td><font color="#008a9f">h</font></td></tr>
			<tr bgcolor="#00c3e1" width="734px"> <td><font color="#00c3e1">h</font></td> </tr>
			<tr bgcolor="#FFFFFF" width="734px"> <td><font color="#FFFFFF">h</font></td> </tr>
			<tr valign="top" align="center">
				<td style="border-width : 0px;"><img src="" alt="" title="" border="0" height="150" />
			<tr bgcolor="#FFFFFF" width="734px"> <td><font color="#FFFFFF">h</font></td> </tr>
			<tr valign="top">
				<td style="border-width : 0px; font-family:Arial;">
					<h1 style="color:#E33030; text-align: center; ">Report due for depreciation. <u>Act now!</u></h1>
					<p><b>' + @ReportName + '</b> - Has not been used for more than ' + @InActivePeriod + ' days. </p>
					<p>This report is now scheduled for depreciation and will be removed <strong>' + CASE WHEN @DaysToRemove = 0 THEN 'Today' ELSE  'in ' + @DaysToRemove + ' days' END + '</strong>.</p>
					Please click the link below to avoid this action.  <br />
					<a href="' + @Link + '">' + @ReportName + ' </a>  <br />
					<p><i>Management Information Team.</i></p>
					<tr bgcolor="#ffffff" width="734px"> <td><font color="#ffffff">h</font></td> </tr>
					<tr bgcolor="#00c3e1" width="734px"> <td><font color="#00c3e1">h</font></td> </tr>
					<tr bgcolor="#008a9f" width="734px"><td><font color="#008a9f">h</font></td></tr>
					<tr bgcolor="#000000"><td style="padding: 10px; font-family:Arial"><font size="1" color="#FFFFFF">
					<p style="font-size: 12; font-family:Arial"></br>
					Please send any feedback to <a style="text-decoration: none; color: #FFFFFF" href=""></a>
					<p style="font-size: 12; font-family:Arial">
					This email was auto generated. Please do not reply directly. 
SET @EmailSubject = 'Important: Report due for depreciation - ' + @ReportName;
SET @ReportRequestedBy = @OwnerEmailAddress
		EXEC msdb.dbo.sp_send_dbmail
		@profile_name = 'hyperbi',
		@recipients = @OwnerEmailAddress,
		@subject = @EmailSubject,
		@body = @MainBody,
		@body_format = 'HTML'
		PRINT @MainBody
		FETCH NEXT FROM DeprecatedReportsCursor INTO @ReportName, @ExecutionDate, @ReportRequestedBy, @DSE, @Link,@InActivePeriod,@DaysToRemove, @OwnerEmailAddress
CLOSE DeprecatedReportsCursor	
DEALLOCATE DeprecatedReportsCursor

Working our way from the top we have a common table expressions which is calculating when a report was last accessed (successfully – non-successful runs are not stored in the ExectionStorageLog table). This is fed in to a query which is comparing the report with the metadata to calculate if the user should be contacted by email – this must be expressed in the metadata. This is all loaded in to the a CURSOR statement (typically I would say that CURSORs are evil and should be avoided at all costs, however much like all questions related to database administration it depends, looping to send emails is a great reason to use a CURSOR).

From here we load a set of variables from the cursor, these variables are loaded in to an HTML block which is used for the emails. You will notice that I have used quite an old style of HTML, this isnt because I am old school, but because these emails are typically going to Outlook, which has very basic HTML processing. Once the HMTL is built it is being passed over to database mail, EXEC msdb.sp_send_dbmail handles this for us. Once executed this will fire an email for every report which has not been accessed beyond the in active period. The user will receive an email that looks like this:


This email is completely customisable with a little knowledge of HTML. It features your company’s logo (1), a “call to action” prompting the user that they need to decide what to do with their report, all the details about their report are listed (3), a link to the report is included (4)(this link allows the user to click on the link and run the report, running the report will reset the counter) and lastly it contains all the contact information for your report writing team.

Personally I have an agent job scheduled to run once a week on a Monday morning.


Automatic expiration.

Sending an email is all well and good however what happens if no one responds, or the only person who used that report left the company months ago. We need an automated solution to remove/hide these based on the final action. To do that we have another agent job.

DECLARE DeprecatedReportsCursor CURSOR FOR 
WITH LastExecution AS 
			, MAX([ExecutionStartDate]) maxdate
	dbo.Dim_Catalog C
	LEFT JOIN LastExecution LE ON LE.[item_skey] = C.item_Skey
	LEFT JOIN Support.Item_Metadata M ON C.Item_BusinessKey = M.Item_BusinessKey
	c.SCD_Is_Current = 1
	AND c.Item_Type = 2	
	AND M.Item_Type = 'Main'
DECLARE @Item_ID varchar(100)
DECLARE @FinalAction VARCHAR(50)
OPEN DeprecatedReportsCursor
FETCH NEXT FROM DeprecatedReportsCursor INTO @Item_ID, @FinalAction, @Item_Path
IF @FinalAction = 'HIDE'
	SET @SQL = '
	UPDATE c SET Hidden = 1
	FROM reportserver.dbo.CATALOG c
	WHERE ItemID = ''' + @Item_ID + ''''
	PRINT 'Report Hidden'
IF @FinalAction = 'DELETE'
	SET @sql = 'EXEC ReportServer.dbo.DeleteObject @Path=N''' + @Item_Path +  ''',@AuthType=1,@Prefix=N''' + @Item_Path + '%'''
	PRINT 'Report deleted'
IF @FinalAction = 'NOTHING'
	PRINT 'nothing'
EXEC (@sql)
FETCH NEXT FROM DeprecatedReportsCursor INTO @Item_ID, @FinalAction, @Item_Path
CLOSE DeprecatedReportsCursor	
DEALLOCATE DeprecatedReportsCursor

This process is very similar to the process above, we have a CURSOR that loops over all the reports that are past the final notice period and either does nothing, hides or deletes the report.

As stated above this is all done in the same way as the report server to avoid causing any internal issues (use this at your own risk).


I have this as a separate job which runs the evening of the first job.


With that you have an automated report expiration process for SSRS.


Thanks for reading.


SSRSDB – Loading the SSRS data warehouse report monitor


Hi everyone,

This is the 3rd instalment to my series looking at reporting on SQL Server Reporting Services. In the last section we looked at the data warehouse. In this section we will look at loading the DW with all the data we need to effectivly monitor our SSRS environment.

The ETL process is formed from 2 steps.

Step 1 – Back up the existing ReportServer and restore it to a new database “ReportServer_Clone”. There are several reasons why you might want to do this, the main one is to remove any strain from your production server. Microsoft does not advise that you make alterations to the ReportServer database so we are goon on that front too. The final stage is to add a view and persist that view to a table. This is a very expensive view which is why it is persisted. This view is used for most of the load process so persisting to a table enables the DW to load faster.

Step 2 – The ETL phase. The ETL has 5 phase.
Phase 1 – Load variables. The SSIS packages is designed to be as dynamic as possible and fit to your environment. When you create the SSRSDB there is a table called “Configuration” under the “Support” schema. This holds all the connection and dynamic variables.


Phase 2 – Load the dimensions.
You will notice that there are less dimensions being loaded than were created in the build process. This is because some are enumerator tables and will not change over the life of your versions of SSRS, for that reason there is no need to load them daily.


For all other dimensions these are loaded using TSQL MERGE, to enable this process we need a staging table to make the comparison. Essentially what a TSQL MERGE does is an UPSERT, an UPDATE and INSERT. The dimensions are all type 2 slowly changing dimensions. So when a change is registered we are closing off the old row and adding a new one.

Each sequence container contains a truncate staging EXECUTE SQL task followed by a DATA TRANSFER task to move the data from the ReportServer_Clone database to the staging tables. There is a degree of transformation that also happens, mainly around converting INT tables to expanded results. Example, IsHidden is a boolean value BIT value (0 OR 1) when filtering this in a report it is not obvious what 0 and 1 relate to, so this is transformed to be “Hidden” and “Not Hidden”.

The dimensions being loaded here are Dim_Catalog – report details, Dim_Format – Format of the report, Dim_User – who has been using the report, Dim_Role – what role does the use have and finally Dim_Dataset.


Once the data has been loaded in to staging it passed in to the TSQL MERGE.

Phase 3 – Fact loading. The facts are loaded using the LOOKUP design pattern. When you set up the SSRSDB you will have a set of historic data, when this is first loaded you want all that historic data. After that has been loaded there is no point reloading all the data as this is just a waste of IO. Instead as part of the configuration once the ETL has been ran once it will set the “full load” option to “Incremental”. Once this has happened only the data loaded that day will be removed.

Three fact tables are being loaded Fact_transaction_Execution, Fact_Factless_PUR and Fact_Factless_Dataset_fields.


The first step is to either remove today’s data or all the data (dependant on whether this is the first time you’re loading the SRSDB). Once the data is removed we move the data from the ReportSerer_Clone database to the SSRSDB.

(1) A query to extract the data and transform it, (2) look up the current Item_skey, (3) look up the current Formay_skey, (4) look up the current User_skey, (5) look up the current Action_skey, (6) look up the current Source_skey and finally load to SSRSDB.


The same process happens for the other 3 fact tables.

Phase 4 – Loading and updating the metadata. Part of the power of SSRSDB is the extended metadata. This is achieved using a simple Excel spreadsheet. When SSRSDB is created a table is also created which hold the metadata, Support.Item_Metadata. The columns in this table mirror that of the Excel Spread sheet.


The metadata allows you to store additional data about the report which will be used to automate communicating changes/usage of reports. The metadata stores the style (theme management blog coming soon), date it was created, who requested it, what type of report is it (Main or drilldown), the report owners email (again for automation) In active period (how many days can a report go without being used, what happens after that time, final period and what happens.

When you first use the SSRSDB this will be blank, rather than manually populating this there is a process to read the file and write out the existing contents of the ReportServer, any changes after this will be updated.

The final phase is to read the ReportServer logs. This is a for each loop to loop over each log and load it in to the SSRSDB.

In the download attached you will also find a set of jobs that you can create. Once these have been created and scheduled you have a fully automated ReportServer data warehouse. I personally have this scheduled to run every 30 minutes.

Thanks for reading.

Power BI & ODI Node Devon

2015-06-26 10.18.47

Hi everyone,

Over the last 2 days I have been partaking in lots of data related fun! On Friday 26th June I attended the ODI Devon’s first conference “Beyond the smart city” #btsc15. It was a great event. As part of this Myself and the SQL South West team were asked to do some technical sessions on Saturday (27th). I love getting out there and meeting new people so jumped at the opportunity. The session was aimed at individuals who might not know the MS BI stack very well, or someone who knew what there was but possibly not how to use it – No deep dives, etc.

My session was about exploring “Open data”. To do this I used a variety of Microsoft tools (Excel, PowerPivot, Azure, Azure SQL Database, SSIS and Power BI). I wanted to use this blog post the walk you through how I went about preparing for this session and as a reference for those who attended the session. I had one goal, to demonstrate how anyone can quickly “Ask questions about their data”.

So what did we do?

  1. We looked at where we can get open data
  2. We analysed that data in Excel using pivot tables, answering basic questions
  3. We wanted more data and somewhere to store it so we created an Azure SQL database
  4. We loaded the data into Azure using SSIS
  5. We analysed the data further in PowerPivot and built a model, but it wasn’t good enough
  6. We imported out PowerPivot model in to Power BI and started asking questions of our data

Let’s look at that in a bit more detail.

To start off I looked at where to find open data. For my examples I used the open data provided by Exeter council. On Exeter council’s website there is a lot of interesting rich datasets. This included property listings owned by ECC and also their spend over the last 2 years. I used both of these datasets as source data. Much like most open data, you get what you pay for. There was a small issue with the data, it in separate files – there were 15 files for the last 15 months.

Sure you can use Excel, and copy and paste each file in to one long list, but what if you want that stored somewhere you can quickly get to, or somewhere you can share that data with colleagues? You guessed it, we need a database to store this data. This session was designed for anyone to attend and the replicate at home. I didn’t want attendees to have to download SQL Express, and a lot of what I am demonstrating is cloud based, so Azure was the obvious choice.

So we needed a database in Azure (for this demo I am using the newer portal V2). Head over to and create an account. Once there complete the following steps to spin up a brand new SQL database.

  1. Open the market place and search for SQL Database
  2. Hit create
  3. Give your database a name and configure the settings
    • Server – Create a new server as required.
    • Choose you pricing model
    • Configure further as required
    • Hit create

Within 40 seconds you have a working database in the cloud. If you have trouble logging in to the database you may need to alter the firewall settings to allow the IP address you’re connecting from.

Now we have our functioning database we can begin to load it with data. To do this we first need to create a few tables to store the data.

First we create the spend data:

CREATE TABLE [dbo].[SpendData](
[Body Name] [varchar](50) NULL,
[Trans Date] [datetime2](7) NULL,
[Supplier Name] [varchar](50) NULL,
[Trans Ref] [int] NULL,
[Net Amount] [decimal](15, 2) NULL,
[Expense Description] [varchar](50) NULL,
[Expense Code] [int] NULL,
[Service Code] [varchar](50) NULL,
[BVACOP] [varchar](50) NULL,
[Category] [int] NULL

Then the building location data

CREATE TABLE [dbo].[BuildingLocations](
[Uprn] [varchar](50) NULL,
[AssetRef] [varchar](50) NULL,
[AssetName] [varchar](1000) NULL,
[Postcode] [varchar](50) NULL,
[Easting] [decimal](18, 0) NULL,
[Northing] [decimal](18, 0) NULL,
[InterestType] [varchar](50) NULL,
[AssetType] [varchar](150) NULL,
[AssetUse] [varchar](150) NULL

I also used a script from codeproject to dynamically create a date tally table (These are so important. If you don’t have a numbers table and a dates tally then go and make one now) Now we have 3 tables, 2 empty and 1 with 20 years of dates. Let’s load the missing data.


To do this I have used SQL Server Integration services – SSIS. SSIS is Microsoft’s ETL tool for moving and transforming data. The package is designed to be run several times to enable this the first step is to TRUNCATE the existing tables. After that we want to loop over all the files (I have simply pulled all the data from the ECC website) then a final transfer for the building data. The more data you have the more that will be imported – just keep adding the spend data to the folder. This SSIS package is available to download at the bottom of the page.


Once the data was loaded we headed over to Excel again a loaded the data in to a PowerPivot model. I am using Excel 2013 so PowerPivot is already installed but not enabled. For those using 2010 go and grab the add-in from here. Once installed enable the add-in and open PowerPivot. On the Home ribbon we looked at how to import the data from Azure.

Click on From database
Select SQL Server
Fill in credentials
Select tables.


Now we have our data we build our basic model by using the diagram view to connect the 2 tables.


Now we can ask more complicated questions of our data such as which quarter is the most money spent, we can analyse by month, year and any other column in our date tally. But really this isn’t good enough! Our end user needs to know a lot about Excel to really get what they need. We are creating something for someone who really has no idea about BI tools, maybe you managing director. So to really begin to ask questions of our data we need Power BI, in particular the Power BI Q & A.

At the time of writing, Power BI is still in preview. There is no cost structure in place and you need a corporate account to use it (You can’t use a gmail or live account). Head over to and register yourself.


What we want to do is save our PowerPivot model and suck that up in to Power BI.

  1. Click on the big yellow “Get Data” button in the bottom left to begin
  2. From there we want to get data from files
  3. Select “Local File”
  4. Navigate to the PowerPivot model
  5. Hit open

You should see an “Importing data” window


Once that is complete we can start literally asking questions of our data. Hit the pin icon when you’re happy to pin the item to your dashboard. You can start creating interactive reports that link from your dashboard and offer a more detailed view of your data.


That is a very high level introduction to working with data in Power BI. I will do some more blogs on Power BI in the coming weeks, once I get my head around the Power BI API :S

Thanks for reading

(Download files here)

SQL Saturday #411 Cambridge


WOW! What a week. In less than 7 days I have been accepted to speak at 3 SQL Saturday conferences! The schedule currently looks like this:

July – SQL Saturday Manchester
August – SQL Saturday Oslo
September – SQL Saturday Cambridge

SQL Saturday in the UK has become something of a regular occurrence. In 2015 alone have 4 (Exeter, Manchester, Cambridge and Southampton), but it all started in Cambridge with the first UK based SQL Saturday back in 2012. I missed that event but went to the following event in 2013. I had a blast! The only downside to the conference was that I didn’t spend enough time in Cambridge! I am hoping to change that this time and take in some of the sights. What I remember from last time were bikes! Bikes everywhere! I don’t have the best luck with bicycles – Ask me when you see me – I will show you the scars….

This will be the 3rd time that Mark and the team have organised SQL Saturday Cambridge and they have certainly gone big! 2 full days of training and 6 tracks on the Saturday! That is bigger than big! That is MASSIVE! In 2013 SQL Saturday Cambridge tried something different by making the even half SQL half SharePoint. I don’t use SharePoint, but like many I administer it from time-to-time. In 2013 I went to a SharePoint session and really enjoyed it. I will do the same in September.

Looking forward to seeing you there. Register now:

Thursday 10th September 2015

Penelope Coventry – Automating business processes with SharePoint, Office 365 and Azure

Benjamin Nevarez – SQL Server Query Tuning & Optimization

Niko Neugebauer – A Day with Columnstore Indexes

Friday 11th September 2015

Rasmus Reinholdt Nielsen – Building tomorrows ETL architecture with BIML and MDS today

Kennie Nybo Pontoppidan – SQL Anti-Patterns

Mark Broadbent & Regis Baccaro – Implementing Business Intelligence & SharePoint High Availability Solutions

Microsoft – Building a Data Driven Strategy

SQL Saturday #418 Manchester


A few days ago I posted that I had been selected to speak at SQL Saturday in Oslo. You know what they say, “SQL Saturday’s are like buses. You wait for one and two come along at the same time”. I am really happy to announce that I will be speaking at SQL Saturday in Manchester UK. On the 24th and 25th of July Manchester will play host to a SQL Saturday!

Not only is this the first time there has been a SQL Saturday in Manchester, this is the first time I will have been to Manchester! I live in the Southwest of the UK in Exeter and I never have a great need to head further north than Bristol (83 miles or 133 KM – terrible I know) so I am really looking forward to this event!

**Edited 10/06/2015**

When I originally wrote this post there was not a great deal of information about the training days on offer at SQL Saturday Manchester. Well the team have pushed an update out and they are great (So great I had to come back and edit the post). Manchester has 6 training day session.

1. Building Big Data Solutions with Microsoft Azure HDInsight – Graeme Malcolm | Microsoft
2. Machine Learning without a PhD in statistics – Andrew Fryer | Microsoft
3. SQL Server Business Intelligence – Chris Testa-O’Neill
4. SQL Server 2014 In-memory technologies – Tony Rogerson
5. Practical R: A hands-on training day – Steph Locke
6. Data Management Lifecycle by Redgate

I am hoping to extend my visit in Manchester and see Mr Fryer’s session on Machine Learning. If I could split myself I would go and see Steph Locke’s session. Steph did a similar session in Exeter and it was a great success!

Make sure you check out the event/schedule and I will see you on July 25th in Manchester!

SQL Saturday Oslo #414


At home my wife and I have a world map with all the places we would like to visit, slowly we are crossing these off (there is a lot of places we really want to visit). Oslo has been on there since last year’s SQL Saturday #317, there was a lot of buzz on Twitter up and on the day. I heard from those who attended that it was a great event! I knew I wanted to attend the next one as an attendee, however I have recently been attempting to improve my presenting skills and also to give back some of the time I have benefited from to the SQL community. Long story short I submitted and was selected  – YAY! I was really pleased to have been 1 of the staggering 151 sessions submitted – The SQL community is truly one-of-a-kind!

So, I am hoping you will join me and the other great speakers in Oslo, Norway on August 29th for my session SSRS Inception: Reporting on Reporting Services. You can read more about this project here.

Don’t forget that there is so much more on offer than just the Saturday. Head over to the SQL Saturday website for more information or follow the #SQLSatOslo hashtag for all the updates! One last thing. If you’re heading to Oslo why not extend your visit and see one of the great training days the Oslo team have put on. They both sound like fantastic sessions!

Friday, August 28th A full day of paid pre-conference workshops will be held on Friday, August 28th:

Margarita NaumovaA Practical Approach for Troubleshooting Performance Problems (Work Smarter not Harder) by Margarita Naumova
Level 300 – Advanced
Finding the source of a performance problem is one of the most difficult tasks for every DBA. Where to start, what to check, how to find what’s going on, and of course how to fix, those are usually the most common and important questions every DBA asks. And those questions usually come when the time is the most critical resource.
Eventbrite - A Practical Approach for Troubleshooting Performance Problems (Work Smarter not Harder) by Margarita Naumova


David Peter HansenUnderstanding SSIS Internals and Performance Tuning by David Peter Hansen
Level 300 – Advanced
You have worked with SQL Server Integration Services (SSIS) for a few years, but more often than not, you see performance problems with your packages. Some packages are running slow, while others are taking up more memory and CPU than before.

Eventbrite - Understanding SSIS Internals and Performance Tuning by David Peter Hansen


Head over to the #SQLSatOslo and register now.