This is the third instalment in my series on reporting on SQL Server Reporting Services. In the last section we looked at the data warehouse; in this section we will look at loading the DW with all the data we need to effectively monitor our SSRS environment.
The ETL process is made up of two steps.
Step 1 – Back up the existing ReportServer database and restore it to a new database, "ReportServer_Clone". There are several reasons why you might want to do this; the main one is to remove any strain from your production server. Microsoft also advises against making alterations to the ReportServer database, so we are good on that front too. The final stage is to add a view and persist that view to a table. The view is very expensive, which is why it is persisted: it is used for most of the load process, so persisting it to a table enables the DW to load faster.
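A minimal sketch of this step in T-SQL. The file paths, logical file names and the view/table names here are assumptions for illustration, not the actual object names in the download:

```sql
-- Take a copy-only backup of the production ReportServer database
-- (path is an assumption)
BACKUP DATABASE ReportServer
TO DISK = N'D:\Backups\ReportServer.bak'
WITH INIT, COPY_ONLY;

-- Restore it as the clone we will report against
-- (logical file names are assumptions; check them with RESTORE FILELISTONLY)
RESTORE DATABASE ReportServer_Clone
FROM DISK = N'D:\Backups\ReportServer.bak'
WITH REPLACE,
     MOVE 'ReportServer'     TO N'D:\Data\ReportServer_Clone.mdf',
     MOVE 'ReportServer_log' TO N'D:\Data\ReportServer_Clone_log.ldf';

-- Persist the expensive view to a table so the load reads it cheaply
-- (vw_ExecutionDetail / ExecutionDetail are illustrative names)
IF OBJECT_ID('ReportServer_Clone.dbo.ExecutionDetail') IS NOT NULL
    DROP TABLE ReportServer_Clone.dbo.ExecutionDetail;

SELECT *
INTO   ReportServer_Clone.dbo.ExecutionDetail
FROM   ReportServer_Clone.dbo.vw_ExecutionDetail;
```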
Step 2 – The ETL phase. The ETL has five phases.
Phase 1 – Load variables. The SSIS package is designed to be as dynamic as possible and to fit your environment. When you create the SSRSDB, a table called "Configuration" is created under the "Support" schema. This holds all the connection and dynamic variables.
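As a rough illustration, the variable load boils down to the package reading key/value pairs out of that table at run time (the column names here are assumptions):

```sql
-- Read the dynamic variables the SSIS package uses at run time
-- (Configuration_Key / Configuration_Value are illustrative column names)
SELECT Configuration_Key,
       Configuration_Value
FROM   SSRSDB.Support.Configuration;
```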
Phase 2 – Load the dimensions.
You will notice that fewer dimensions are loaded here than were created in the build process. This is because some are enumerator tables whose contents will not change over the life of your version of SSRS, so there is no need to load them daily.
All other dimensions are loaded using a T-SQL MERGE; to enable this we need a staging table to make the comparison against. Essentially, a T-SQL MERGE performs an UPSERT: an UPDATE and an INSERT in one statement. The dimensions are all type 2 slowly changing dimensions, so when a change is registered we close off the old row and add a new one.
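To make that concrete, here is a hedged sketch of a type 2 merge for one dimension, using the well-known "INSERT over MERGE" pattern. The column names (User_ID, User_Name, Row_Start_Date, Row_End_Date, Is_Current) are assumptions, not the actual SSRSDB column names:

```sql
-- Close off changed current rows, then re-insert them as new versions.
-- The MERGE closes changed rows and inserts brand-new members; the OUTPUT
-- clause feeds the changed rows back to the outer INSERT as new versions.
INSERT INTO dbo.Dim_User (User_ID, User_Name, Row_Start_Date, Is_Current)
SELECT User_ID, User_Name, GETDATE(), 1
FROM (
    MERGE dbo.Dim_User AS d
    USING Staging.Dim_User AS s
        ON d.User_ID = s.User_ID
       AND d.Is_Current = 1
    WHEN MATCHED AND s.User_Name <> d.User_Name THEN   -- tracked attribute changed
        UPDATE SET d.Row_End_Date = GETDATE(),
                   d.Is_Current   = 0
    WHEN NOT MATCHED BY TARGET THEN                    -- brand new member
        INSERT (User_ID, User_Name, Row_Start_Date, Is_Current)
        VALUES (s.User_ID, s.User_Name, GETDATE(), 1)
    OUTPUT $action, s.User_ID, s.User_Name
) AS changes (Merge_Action, User_ID, User_Name)
WHERE Merge_Action = 'UPDATE';   -- changed rows come back in as new current rows
```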
Each sequence container holds a truncate-staging Execute SQL task followed by a Data Flow task to move the data from the ReportServer_Clone database to the staging tables. A degree of transformation also happens here, mainly converting coded values into readable results. For example, IsHidden is a Boolean BIT value (0 or 1); when filtering on this in a report it is not obvious what 0 and 1 relate to, so it is transformed into "Hidden" and "Not Hidden".
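That transformation is essentially a CASE expression in the staging query, along these lines (the exact query in the package may differ):

```sql
-- Decode the BIT flag into a value that reads well in a report
SELECT Name,
       CASE Hidden
            WHEN 1 THEN 'Hidden'
            ELSE 'Not Hidden'
       END AS Hidden_Description
FROM   ReportServer_Clone.dbo.Catalog;
```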
The dimensions being loaded here are Dim_Catalog (report details), Dim_Format (format of the report), Dim_User (who has been using the report), Dim_Role (what role the user has) and finally Dim_Dataset.
Once the data has been loaded into staging it is passed into the T-SQL MERGE.
Phase 3 – Fact loading. The facts are loaded using the LOOKUP design pattern. When you set up the SSRSDB you will have a set of historic data, and on the first load you want all of it. After that there is no point reloading everything, as that is just a waste of IO. Instead, as part of the configuration, once the ETL has been run once it sets the "full load" option to "Incremental". From then on, only the data loaded that day is removed and reloaded.
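The switch from full to incremental can be pictured as a single update to the configuration table at the end of a successful first run (the key and column names are assumptions):

```sql
-- Flip the load type after the first successful full load
UPDATE SSRSDB.Support.Configuration
SET    Configuration_Value = 'Incremental'
WHERE  Configuration_Key   = 'Load Type'
AND    Configuration_Value = 'Full';
```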
Three fact tables are loaded: Fact_transaction_Execution, Fact_Factless_PUR and Fact_Factless_Dataset_fields.
The first step is to remove either today's data or all the data (depending on whether this is the first time you're loading the SSRSDB). Once the data is removed, we move the data from the ReportServer_Clone database to the SSRSDB.
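The delete step then looks roughly like this; the configuration key and the date column name are assumptions for the sake of the sketch:

```sql
-- Read the current load type from the configuration table
DECLARE @Load_Type VARCHAR(20);
SELECT @Load_Type = Configuration_Value
FROM   SSRSDB.Support.Configuration
WHERE  Configuration_Key = 'Load Type';   -- illustrative key name

-- Full load: clear everything; incremental: clear only today's slice
IF @Load_Type = 'Full'
    TRUNCATE TABLE dbo.Fact_transaction_Execution;
ELSE
    DELETE FROM dbo.Fact_transaction_Execution
    WHERE  Execution_Date >= CAST(GETDATE() AS DATE);
```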
(1) A query extracts and transforms the data, (2) look up the current Item_skey, (3) look up the current Format_skey, (4) look up the current User_skey, (5) look up the current Action_skey, (6) look up the current Source_skey, and finally load to the SSRSDB.
The same process happens for the other two fact tables.
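In the package these lookups are SSIS Lookup components, but in T-SQL terms the chain amounts to joining the staged executions to each dimension's current row to fetch the surrogate keys. All join columns, the Dim_Action/Dim_Source table names and the Time_Taken measure below are assumptions:

```sql
-- Resolve each business key to its current surrogate key, then load the fact
INSERT INTO dbo.Fact_transaction_Execution
       (Item_skey, Format_skey, User_skey, Action_skey, Source_skey, Time_Taken)
SELECT i.Item_skey,
       f.Format_skey,
       u.User_skey,
       a.Action_skey,
       s.Source_skey,
       e.Time_Taken
FROM   Staging.Execution AS e
JOIN   dbo.Dim_Catalog AS i ON i.Item_ID   = e.Item_ID   AND i.Is_Current = 1
JOIN   dbo.Dim_Format  AS f ON f.Format    = e.Format
JOIN   dbo.Dim_User    AS u ON u.User_ID   = e.User_ID   AND u.Is_Current = 1
JOIN   dbo.Dim_Action  AS a ON a.Action_ID = e.Action_ID
JOIN   dbo.Dim_Source  AS s ON s.Source_ID = e.Source_ID;
```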
Phase 4 – Loading and updating the metadata. Part of the power of SSRSDB is the extended metadata, which is managed through a simple Excel spreadsheet. When SSRSDB is created, a table is also created to hold the metadata: Support.Item_Metadata. The columns in this table mirror those of the Excel spreadsheet.
The metadata allows you to store additional data about each report, which is used to automate communicating changes to, and usage of, reports. The metadata stores the style (theme management blog coming soon), the date the report was created, who requested it, what type of report it is (main or drilldown), the report owner's email (again for automation), the inactive period (how many days a report can go without being used), what happens after that time, the final period and what happens then.
When you first use the SSRSDB this table will be blank. Rather than you having to populate it manually, there is a process that reads the file and writes out the existing contents of the ReportServer; any changes after this are applied as updates.
Phase 5 – The final phase is to read the ReportServer logs. A Foreach Loop container loops over each log and loads it into the SSRSDB.
In the attached download you will also find a set of jobs that you can create. Once these have been created and scheduled, you have a fully automated ReportServer data warehouse. I personally have this scheduled to run every 30 minutes.
Thanks for reading.