Tuesday, May 24, 2016

Ruminating on Power BI

We were building our dashboard using Power BI and were looking at the various options available to refresh the data.

The following link would give a good overview of the various data refresh options -https://powerbi.microsoft.com/en-us/documentation/powerbi-refresh-data/#databases-in-the-cloud

It is also possible to pump in live streaming data to Power BI using it's REST APIs - https://powerbi.microsoft.com/en-us/documentation/powerbi-developer-overview-of-power-bi-rest-api/

But we were a bit concerned about the dataset size limit of 10 GB in Power BI Pro version. An excellent article by Reza Rad - http://www.radacad.com/step-beyond-the-10gb-limitation-of-power-bi

Essentially in Power BI, you have two options - either import the entire dataset into memory OR establish a live connection between Power BI and your data-source.

Power BI uses some nifty compression techniques for all data that is imported into it - Reza observed a compression from 800 MB file to 8 MB Power BI file. Hence for all practical purposes, a 10 GB limit should suffice for most use-cases.
In case you are working with large volumes of data (GB, TB, PB), then a live connection with the data-source is the only option.

Some snippets from Reza's article:
"Live connection won’t import data into the model in Power BI. Live connection brings the metadata and data structure into Power BI, and then you can visualize data based on that. With every visualization, a query will be sent to the data source and brings the response.

Limitations of Live Connection - 
1. With Live connection, there won’t be any Data tab in Power BI to create calculated measures, columns or tables. You have to create all calculations at the data source level.
2. Multiple Data Sources is not supported.
3. No Power Q&A
4. Power Query still is available with Live Connection. This gives you ability to join tables, flatten them if you require, apply data transformation and prepare the data as you want. Power Query can also set the data types in a way that be more familiar for the Power BI model to understand.
5. You need to do proper index and query optimization at data-source."