Power BI Security & Architecture

Are you an enterprise, CIO or IT decision maker?

Before investing your budget in a modern BI tool for your organization, we strongly advise you to evaluate your BI vendor's security and architecture. Whether the tool is Power BI, Tableau, Qlik or Looker, each of them provides a cloud BI solution for your needs.

These tools come in cloud and on-premises versions. The cloud version offers several well-known advantages, but it also makes data security a key concern. Several questions might be bothering you.

Is my data secure?

Where is my data stored?

What security options and best practices does the vendor implement?

How does the data move?

Is the data encrypted? What exactly is encrypted?

Does this sound like you?


If you are looking for answers to the questions above, or are evaluating Power BI as your go-to modern enterprise BI tool, I invite you to read the Power BI security whitepaper, which covers Power BI security and architecture in detail.

To summarize

Power BI is a SaaS platform by Microsoft hosted on Azure. It uses Azure services for its operation. There are Web Front End clusters and Back End clusters.

The WFE and Back End
Image source: Microsoft

Front End cluster

The Web Front End (WFE) cluster handles the initial connection and authentication to the Power BI service, and serves static files and content.

Frontend (WFE) cluster

Back End cluster

The Back End cluster comes into play once authentication is done. This cluster is responsible for data, storage, visualization, connections, refresh, and other user interactions.

Back-end cluster

The Back End cluster is the heart of the service. If you consider your data an asset, then the Back End cluster is a critical asset.

Pay particular attention to which items sit to the left of the dotted line above and which sit to the right. A request for data, dashboards or reports goes to the “Gateway Role” only, and the Gateway Role decides where to route it.

Snippet from the security paper:

The Gateway Role acts as a gateway between user requests and the Power BI service. Users do not interact directly with any roles other than the Gateway Role.

Important: It is imperative to note that only Azure API Management (APIM) and Gateway (GW) roles are accessible through the public Internet. They provide authentication, authorization, DDoS protection, Throttling, Load Balancing, Routing, and other capabilities.

The dotted line in the Back-End cluster image, above, clarifies the boundary between the only two roles that are accessible by users (left of the dotted line), and roles that are only accessible by the system. When an authenticated user connects to the Power BI Service, the connection and any request by the client is accepted and managed by the Gateway Role and Azure API Management, which then interacts on the user’s behalf with the rest of the Power BI Service. For example, when a client attempts to view a dashboard, the Gateway Role accepts that request then separately sends a request to the Presentation Role to retrieve the data needed by the browser to render the dashboard.
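The routing described in the excerpt above can be pictured with a small toy model. This is only an illustrative sketch, not how Power BI is implemented: the class names, the token table and the `handle` method are all hypothetical. It shows a single public entry point that authenticates the caller and then calls an internal role on the user's behalf.

```python
class PresentationRole:
    """Hypothetical internal role: in this toy model it is only called by the gateway."""

    def get_dashboard(self, user, dashboard_id):
        # Return the data a browser would need to render the dashboard.
        return {"dashboard": dashboard_id, "rendered_for": user}


class GatewayRole:
    """Hypothetical public entry point that authenticates and routes requests."""

    def __init__(self, presentation, valid_tokens):
        self._presentation = presentation
        self._valid_tokens = valid_tokens  # assumed mapping: token -> user name

    def handle(self, token, resource, resource_id):
        user = self._valid_tokens.get(token)
        if user is None:
            # Unauthenticated requests never reach the internal roles.
            raise PermissionError("request rejected at the gateway")
        if resource == "dashboard":
            # The gateway sends a separate request to the internal role.
            return self._presentation.get_dashboard(user, resource_id)
        raise ValueError(f"no route for resource type {resource!r}")


gateway = GatewayRole(PresentationRole(), {"token-123": "alice"})
result = gateway.handle("token-123", "dashboard", "sales")
```

The point of the sketch is the shape of the design: clients hold a reference to the gateway only, so every internal role stays behind one place where authentication and routing decisions are made.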

Back End cluster Gateway Role

Top questions asked by customers

Where is my data stored?

The data that you upload along with a Power BI report (PBIX) is stored in Azure Blob Storage. The metadata (data about dashboards, reports, refresh cycles and so on) is stored in Azure SQL Database.

The data is stored in the same region as the Power BI tenant.

Read more here: https://docs.microsoft.com/en-us/power-bi/whitepaper-powerbi-security#data-storage-and-movement

Is my data encrypted?

In the Power BI service, data is either at rest (data available to a Power BI user that is not currently being acted upon), or it is in process (for example: queries being run, data connections and models being acted upon, data and/or models being uploaded into the Power BI service, and other actions that users or the Power BI service may take on data that is actively being accessed or updated). Data that is in process is referred to as data in process. Data at rest in Power BI is encrypted. Data that is in transit, which means data being sent or received by the Power BI service, is also encrypted.

In short, data is encrypted both at rest and in transit.

Source: Power BI Whitepaper

Is Power BI Pro secure?

Power BI Pro is a shared environment. The front-end and back-end clusters, as well as Azure Blob Storage and Azure SQL Database, could be shared between customers.

Is Power BI Premium secure?

When you initiate a Power BI Premium subscription, behind the scenes the back-end clusters are deployed to dedicated VMs. These VMs are dedicated to you and should not be shared between customers.

What happens when I log in to app.powerbi.com?

See the relevant section of the whitepaper to learn what happens behind the scenes when you access app.powerbi.com.

All Power BI features in one page?

Check out this blog to see all Power BI features in one page!

Planning to migrate to Power BI?

Read this first: https://bigintsolutions.com/2020/04/21/migrate-to-power-bi/

What licensing options does Power BI support?

Power BI supports Power BI Pro and Power BI Premium licensing options. It also has a free version. If you need to know more about different licensing options, check out our Power BI Licensing guide.

I have more questions on security:

Read more here: https://docs.microsoft.com/en-us/power-bi/whitepaper-powerbi-security#power-bi-security-questions-and-answers

Conclusion

Power BI is a great modern BI tool. When enterprises evaluate Power BI, we walk them through its architecture and security implementation. This gives enterprise customers the confidence to take the next big step in modernizing their reporting and analytics.


Next Steps?

Don’t hesitate to contact us today if you are planning a Power BI enterprise deployment or want us to evaluate Power BI as your go-to modern enterprise BI tool.

Now you can share your data without worrying about data security

Namaste,

We all know that data is the new oil (and insights are the new king). With data being generated from innumerable sources (Facebook, Twitter, YouTube, Snapchat, Uber, web traffic, Google searches), data security should not be limited to “contractual terms”.

Here are a few facts about data:

“We create as much information in two days now as we did from the dawn of man through 2003.” – Eric Schmidt

90% of the world’s data was generated over the last two years.

Every second, 40,000 searches are performed on Google.

Every minute, 4.1 million YouTube videos are watched.

I must say, data never sleeps!

With data at the center of everything, securing private and confidential information is a must. We are not trying to solve the world’s data security problem. Instead, through this series of blog posts, we will show techniques to anonymize and secure our customers’ data while preserving analytic utility (re-read this line).

But wait a second. What are we trying to solve? Through data security techniques, we want to protect end users’ and end customers’ personal information (name, email, phone, national ID), as well as other confidential and sensitive information such as revenue, salary, internal data, patient health information, trip routes, and personal chat messages.

Removing or encrypting such attributes is not a solution, as it would destroy the data’s analytic utility. Sharing all this information without any control is not a solution either, since it would lead to data privacy issues. What should we do then? Ideally, we want to be in the middle of this curve: privacy risk should be at an acceptable level while analytic utility is preserved.

Privacy curve

Are there any methods that can help maintain an appropriate balance between privacy protection and analytic utility? This is what we will learn in this series of blog posts.

Here’s how we will structure the next set of posts:

  1. Types of identifiers (Direct identifiers and Quasi Identifiers)
  2. Methods and Techniques to protect these identifiers
    1. Suppression
    2. Generalization
    3. Randomization
    4. Pseudonymization
  3. Methods to protect:
    1. Cross-sectional data
    2. Longitudinal data
    3. Unstructured data
  4. Other data protection methods
    1. Mapping
    2. Rounding
    3. Top and Bottom coding
    4. Data synthesis
  5. Data sharing options
    1. Sub-sampling
    2. VPN – Protected infrastructure
  6. Conclusion
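To make a few of the techniques in this outline concrete before the detailed posts, here is a minimal sketch in Python. It is not our production approach: the function names, the sample record, the thresholds and the demo key are all hypothetical, and pseudonymization is shown with a keyed hash (HMAC-SHA256) as one common way to do it.

```python
import hashlib
import hmac


def suppress(record, fields):
    """Suppression: drop the named fields entirely."""
    return {k: v for k, v in record.items() if k not in fields}


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymize(value, secret_key):
    """Pseudonymization: keyed hash, so equal inputs map to equal tokens."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def top_bottom_code(value, low, high):
    """Top and bottom coding: clamp extreme values to chosen limits."""
    return max(low, min(high, value))


def round_to(value, base):
    """Rounding: snap a value to the nearest multiple of base."""
    return base * round(value / base)


# Hypothetical sample record and key, for illustration only.
record = {"name": "Asha Rao", "email": "asha@example.com", "age": 34, "salary": 83450}
key = b"demo-key-not-for-real-use"

anonymized = suppress(record, {"name"})
anonymized["email"] = pseudonymize(record["email"], key)
anonymized["age"] = generalize_age(record["age"])
anonymized["salary"] = round_to(top_bottom_code(record["salary"], 20000, 200000), 1000)
```

Each step trades a little precision for privacy: the age band and rounded salary still support grouping and averages, while the pseudonymized email can still be used to join records without exposing the address itself.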

The purpose of this post was to set out the premise of how we protect customers’ data and to suggest practical approaches to data anonymization and sharing.

How do you protect this information? What tools and techniques do you use? We would love to know.

Thanks,

R

References:

https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#f56f49460ba9

https://techcrunch.com/2010/08/04/schmidt-data/

https://www.marketingprofs.com/charts/2017/32531/the-incredible-amount-of-data-generated-online-every-minute-infographic

http://www.ehealthinformation.ca/media/e-learning-courses-data-privacy-anonymization/e-learning-course-on-anonymizing-data/