Power BI & ODI Node Devon

2015-06-26 10.18.47

Hi everyone,

Over the last 2 days I have been partaking in lots of data related fun! On Friday 26th June I attended the ODI Devon’s first conference “Beyond the smart city” #btsc15. It was a great event. As part of this Myself and the SQL South West team were asked to do some technical sessions on Saturday (27th). I love getting out there and meeting new people so jumped at the opportunity. The session was aimed at individuals who might not know the MS BI stack very well, or someone who knew what there was but possibly not how to use it – No deep dives, etc.

My session was about exploring “Open data”. To do this I used a variety of Microsoft tools (Excel, PowerPivot, Azure, Azure SQL Database, SSIS and Power BI). I wanted to use this blog post the walk you through how I went about preparing for this session and as a reference for those who attended the session. I had one goal, to demonstrate how anyone can quickly “Ask questions about their data”.

So what did we do?

  1. We looked at where we can get open data
  2. We analysed that data in Excel using pivot tables, answering basic questions
  3. We wanted more data and somewhere to store it so we created an Azure SQL database
  4. We loaded the data into Azure using SSIS
  5. We analysed the data further in PowerPivot and built a model, but it wasn’t good enough
  6. We imported out PowerPivot model in to Power BI and started asking questions of our data

Let’s look at that in a bit more detail.

To start off I looked at where to find open data. For my examples I used the open data provided by Exeter council. On Exeter council’s website there is a lot of interesting rich datasets. This included property listings owned by ECC and also their spend over the last 2 years. I used both of these datasets as source data. Much like most open data, you get what you pay for. There was a small issue with the data, it in separate files – there were 15 files for the last 15 months.

Sure you can use Excel, and copy and paste each file in to one long list, but what if you want that stored somewhere you can quickly get to, or somewhere you can share that data with colleagues? You guessed it, we need a database to store this data. This session was designed for anyone to attend and the replicate at home. I didn’t want attendees to have to download SQL Express, and a lot of what I am demonstrating is cloud based, so Azure was the obvious choice.

So we needed a database in Azure (for this demo I am using the newer portal V2). Head over to https://portal.azure.com/ and create an account. Once there complete the following steps to spin up a brand new SQL database.

  1. Open the market place and search for SQL Database
  2. Hit create
  3. Give your database a name and configure the settings
    • Server – Create a new server as required.
    • Choose you pricing model
    • Configure further as required
    • Hit create

Within 40 seconds you have a working database in the cloud. If you have trouble logging in to the database you may need to alter the firewall settings to allow the IP address you’re connecting from.

Now we have our functioning database we can begin to load it with data. To do this we first need to create a few tables to store the data.

First we create the spend data:

CREATE TABLE [dbo].[SpendData](
[Body Name] [varchar](50) NULL,
[Trans Date] [datetime2](7) NULL,
[Supplier Name] [varchar](50) NULL,
[Trans Ref] [int] NULL,
[Net Amount] [decimal](15, 2) NULL,
[Expense Description] [varchar](50) NULL,
[Expense Code] [int] NULL,
[Service Code] [varchar](50) NULL,
[BVACOP] [varchar](50) NULL,
[Category] [int] NULL
)

Then the building location data

CREATE TABLE [dbo].[BuildingLocations](
[Uprn] [varchar](50) NULL,
[AssetRef] [varchar](50) NULL,
[AssetName] [varchar](1000) NULL,
[Postcode] [varchar](50) NULL,
[Easting] [decimal](18, 0) NULL,
[Northing] [decimal](18, 0) NULL,
[InterestType] [varchar](50) NULL,
[AssetType] [varchar](150) NULL,
[AssetUse] [varchar](150) NULL
)

I also used a script from codeproject to dynamically create a date tally table (These are so important. If you don’t have a numbers table and a dates tally then go and make one now) http://www.codeproject.com/Articles/647950/Create-and-Populate-Date-Dimension-for-Data-Wareho. Now we have 3 tables, 2 empty and 1 with 20 years of dates. Let’s load the missing data.

2015-06-27_17-09-18

To do this I have used SQL Server Integration services – SSIS. SSIS is Microsoft’s ETL tool for moving and transforming data. The package is designed to be run several times to enable this the first step is to TRUNCATE the existing tables. After that we want to loop over all the files (I have simply pulled all the data from the ECC website) then a final transfer for the building data. The more data you have the more that will be imported – just keep adding the spend data to the folder. This SSIS package is available to download at the bottom of the page.

2015-06-27_17-12-01

Once the data was loaded we headed over to Excel again a loaded the data in to a PowerPivot model. I am using Excel 2013 so PowerPivot is already installed but not enabled. For those using 2010 go and grab the add-in from here. Once installed enable the add-in and open PowerPivot. On the Home ribbon we looked at how to import the data from Azure.

Click on From database
Select SQL Server
Fill in credentials
Select tables.

2015-06-27_17-20-36

Now we have our data we build our basic model by using the diagram view to connect the 2 tables.

2015-06-27_18-28-22

Now we can ask more complicated questions of our data such as which quarter is the most money spent, we can analyse by month, year and any other column in our date tally. But really this isn’t good enough! Our end user needs to know a lot about Excel to really get what they need. We are creating something for someone who really has no idea about BI tools, maybe you managing director. So to really begin to ask questions of our data we need Power BI, in particular the Power BI Q & A.

At the time of writing, Power BI is still in preview. There is no cost structure in place and you need a corporate account to use it (You can’t use a gmail or live account). Head over to https://powerbi.microsoft.com/ and register yourself.

2015-06-27_18-57-17

What we want to do is save our PowerPivot model and suck that up in to Power BI.

  1. Click on the big yellow “Get Data” button in the bottom left to begin
  2. From there we want to get data from files
  3. Select “Local File”
  4. Navigate to the PowerPivot model
  5. Hit open

You should see an “Importing data” window

2015-06-27_18-41-52

Once that is complete we can start literally asking questions of our data. Hit the pin icon when you’re happy to pin the item to your dashboard. You can start creating interactive reports that link from your dashboard and offer a more detailed view of your data.

2015-06-27_18-42-41

That is a very high level introduction to working with data in Power BI. I will do some more blogs on Power BI in the coming weeks, once I get my head around the Power BI API :S

Thanks for reading

(Download files here)

SQL Saturday #411 Cambridge

img_44961

WOW! What a week. In less than 7 days I have been accepted to speak at 3 SQL Saturday conferences! The schedule currently looks like this:

July – SQL Saturday Manchester
August – SQL Saturday Oslo
September – SQL Saturday Cambridge

SQL Saturday in the UK has become something of a regular occurrence. In 2015 alone have 4 (Exeter, Manchester, Cambridge and Southampton), but it all started in Cambridge with the first UK based SQL Saturday back in 2012. I missed that event but went to the following event in 2013. I had a blast! The only downside to the conference was that I didn’t spend enough time in Cambridge! I am hoping to change that this time and take in some of the sights. What I remember from last time were bikes! Bikes everywhere! I don’t have the best luck with bicycles – Ask me when you see me – I will show you the scars….

This will be the 3rd time that Mark and the team have organised SQL Saturday Cambridge and they have certainly gone big! 2 full days of training and 6 tracks on the Saturday! That is bigger than big! That is MASSIVE! In 2013 SQL Saturday Cambridge tried something different by making the even half SQL half SharePoint. I don’t use SharePoint, but like many I administer it from time-to-time. In 2013 I went to a SharePoint session and really enjoyed it. I will do the same in September.

Looking forward to seeing you there. Register now: http://www.sqlsaturday.com/411/EventHome.aspx

Thursday 10th September 2015

Penelope Coventry – Automating business processes with SharePoint, Office 365 and Azure

Benjamin Nevarez – SQL Server Query Tuning & Optimization

Niko Neugebauer – A Day with Columnstore Indexes

Friday 11th September 2015

Rasmus Reinholdt Nielsen – Building tomorrows ETL architecture with BIML and MDS today

Kennie Nybo Pontoppidan – SQL Anti-Patterns

Mark Broadbent & Regis Baccaro – Implementing Business Intelligence & SharePoint High Availability Solutions

Microsoft – Building a Data Driven Strategy

SQL Saturday #418 Manchester

2015-06-09_21-10-26

A few days ago I posted that I had been selected to speak at SQL Saturday in Oslo. You know what they say, “SQL Saturday’s are like buses. You wait for one and two come along at the same time”. I am really happy to announce that I will be speaking at SQL Saturday in Manchester UK. On the 24th and 25th of July Manchester will play host to a SQL Saturday!

Not only is this the first time there has been a SQL Saturday in Manchester, this is the first time I will have been to Manchester! I live in the Southwest of the UK in Exeter and I never have a great need to head further north than Bristol (83 miles or 133 KM – terrible I know) so I am really looking forward to this event!

**Edited 10/06/2015**

When I originally wrote this post there was not a great deal of information about the training days on offer at SQL Saturday Manchester. Well the team have pushed an update out and they are great (So great I had to come back and edit the post). Manchester has 6 training day session.

1. Building Big Data Solutions with Microsoft Azure HDInsight – Graeme Malcolm | Microsoft
2. Machine Learning without a PhD in statistics – Andrew Fryer | Microsoft
3. SQL Server Business Intelligence – Chris Testa-O’Neill
4. SQL Server 2014 In-memory technologies – Tony Rogerson
5. Practical R: A hands-on training day – Steph Locke
6. Data Management Lifecycle by Redgate

I am hoping to extend my visit in Manchester and see Mr Fryer’s session on Machine Learning. If I could split myself I would go and see Steph Locke’s session. Steph did a similar session in Exeter and it was a great success!

Make sure you check out the event/schedule and I will see you on July 25th in Manchester!

SQL Saturday Oslo #414

2015-06-06_10-55-39

At home my wife and I have a world map with all the places we would like to visit, slowly we are crossing these off (there is a lot of places we really want to visit). Oslo has been on there since last year’s SQL Saturday #317, there was a lot of buzz on Twitter up and on the day. I heard from those who attended that it was a great event! I knew I wanted to attend the next one as an attendee, however I have recently been attempting to improve my presenting skills and also to give back some of the time I have benefited from to the SQL community. Long story short I submitted and was selected  – YAY! I was really pleased to have been 1 of the staggering 151 sessions submitted – The SQL community is truly one-of-a-kind!

So, I am hoping you will join me and the other great speakers in Oslo, Norway on August 29th for my session SSRS Inception: Reporting on Reporting Services. You can read more about this project here.

Don’t forget that there is so much more on offer than just the Saturday. Head over to the SQL Saturday website for more information or follow the #SQLSatOslo hashtag for all the updates! One last thing. If you’re heading to Oslo why not extend your visit and see one of the great training days the Oslo team have put on. They both sound like fantastic sessions!

Friday, August 28th A full day of paid pre-conference workshops will be held on Friday, August 28th:

Margarita NaumovaA Practical Approach for Troubleshooting Performance Problems (Work Smarter not Harder) by Margarita Naumova
Level 300 – Advanced
Finding the source of a performance problem is one of the most difficult tasks for every DBA. Where to start, what to check, how to find what’s going on, and of course how to fix, those are usually the most common and important questions every DBA asks. And those questions usually come when the time is the most critical resource.
Eventbrite - A Practical Approach for Troubleshooting Performance Problems (Work Smarter not Harder) by Margarita Naumova

 

David Peter HansenUnderstanding SSIS Internals and Performance Tuning by David Peter Hansen
Level 300 – Advanced
You have worked with SQL Server Integration Services (SSIS) for a few years, but more often than not, you see performance problems with your packages. Some packages are running slow, while others are taking up more memory and CPU than before.

Eventbrite - Understanding SSIS Internals and Performance Tuning by David Peter Hansen

 

Head over to the #SQLSatOslo and register now.