July 2018 - Business Intelligence | Power BI | Machine Learning & AI

Resolved – Request is not a valid SAML 2.0 protocol message – when embedding Power BI Reports with federated authentication

Phew! We finally resolved the error “Request is not a valid SAML 2.0 protocol message” when embedding Power BI reports with federated authentication. It took us some time, and we owe thanks to the wonderful Microsoft support team, who worked with us to debug and isolate the issue.

Our scenario: an enterprise customer with Power BI Premium capacity planned to embed Power BI reports in an internal application using the “App Owns Data” approach. There are reasons to embed for enterprises (also called organizational embedding), and reasons to choose “App Owns Data” over “User Owns Data”. More about this in another blog post.

Ok, then why this error? How to solve it?

Why this error:

When you authenticate with a master account, the request goes to a federated server (in this case the customer’s Identity Provider, or IdP). The IdP validates the credentials and sends back a SAML assertion and a TokenType. The Azure AD .NET libraries check the TokenType and assign a grant type accordingly. This grant type and the SAML assertion are then sent to Azure AD for confirmation.

In our particular case, the PingFederate identity server returned a TokenType that the Azure AD .NET SDK took to mean SAML 2.0, so it set the grant type to the SAML 2.0 one (urn:ietf:params:oauth:grant-type:saml2-bearer). But the assertion was not SAML 2.0; it was actually SAML 1.1.

Hence the error – Request is not a valid SAML 2.0 protocol message.
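To make the mismatch concrete, here is a minimal Python sketch that inspects the assertion itself rather than trusting the TokenType. It relies on the fact that SAML 1.1 and SAML 2.0 assertions live in different XML namespaces. The function names are hypothetical, and the SAML 1.1 grant type string is our assumption about what Azure AD accepts; this is not the logic of the Azure AD .NET libraries.

```python
import xml.etree.ElementTree as ET

# Assertion namespaces from the OASIS SAML specifications.
# SAML 1.1 kept the 1.0 namespace.
SAML11_NS = "urn:oasis:names:tc:SAML:1.0:assertion"
SAML20_NS = "urn:oasis:names:tc:SAML:2.0:assertion"

# The SAML 2.0 grant type is the one quoted in this post.
# The SAML 1.1 grant type is an assumption, not confirmed by the post.
GRANT_TYPES = {
    "1.1": "urn:ietf:params:oauth:grant-type:saml1_1-bearer",
    "2.0": "urn:ietf:params:oauth:grant-type:saml2-bearer",
}


def saml_assertion_version(xml_text):
    """Return "1.1" or "2.0" for the first SAML Assertion element found."""
    root = ET.fromstring(xml_text)
    for element in root.iter():  # iter() also visits the root element
        if element.tag == "{%s}Assertion" % SAML20_NS:
            return "2.0"
        if element.tag == "{%s}Assertion" % SAML11_NS:
            return "1.1"
    raise ValueError("no SAML assertion found in the response")


def grant_type_for(xml_text):
    """Pick the grant type that matches the assertion actually returned."""
    return GRANT_TYPES[saml_assertion_version(xml_text)]
```

Running a check like this against the IdP response would have shown us early that the assertion and the grant type disagreed.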

How to solve this error?

There are two ways to solve this error:

1. Create a cloud account on the customer’s tenant that is not federated (simple solution), for example: abc@tenantname.onmicrosoft.com
2. Create the SAML requests manually, send them to your IdP, correct the TokenType in your code, and send the resulting request to Azure AD. You will have to bypass the Azure AD libraries and construct your own requests (complex solution).
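We did not implement solution 2, so the following is only a rough Python sketch of what “construct your own requests” could look like for the final step. It builds the form body of a token request with an explicitly chosen grant type and a base64url-encoded assertion. The function name and the `client_id` and `resource` fields are assumptions for illustration; check the exact fields against the Azure AD documentation before relying on them.

```python
import base64
from urllib.parse import urlencode


def build_token_request_body(assertion_xml, grant_type, client_id, resource):
    """Build a form-encoded token request body (hypothetical helper).

    The caller chooses grant_type, so it can match the real SAML
    version of the assertion instead of a guessed one.
    """
    encoded_assertion = base64.urlsafe_b64encode(
        assertion_xml.encode("utf-8")
    ).decode("ascii")
    fields = {
        "grant_type": grant_type,
        "assertion": encoded_assertion,
        "client_id": client_id,  # assumed field name
        "resource": resource,    # assumed field name
    }
    return urlencode(fields)
```

The body would then be posted to the tenant’s token endpoint; obtaining the assertion from the IdP is the harder part and is not shown here.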

We went ahead with solution 1, used this cloud account as our master account and were able to successfully embed the reports in enterprise internal applications.

You will not face this issue if your IdP is ADFS.

Hope this helps,

Until then,

Ranbeer Makin


Now you can share your data without worrying about data security

Namaste,

We all know that data is the new oil (and insights are the new king). With data being generated from innumerable sources (Facebook, Twitter, YouTube, Snapchat, Uber, web traffic, Google searches), data security should not be limited to “contractual terms”.

Here are a few facts about data:

“We create as much information in two days now as we did from the dawn of man through 2003.” – Eric Schmidt

90% of the world’s data was generated over the last two years.

Every second, 40,000 searches are performed on Google.

Every minute, 4.1 million YouTube videos are watched.

I must say, data never sleeps!

With data at the center of everything, securing private and confidential information is a must. We are not trying to solve the world’s data security problem. Instead, through this series of blog posts, we will show techniques to anonymize and secure our customers’ data (while preserving analytic utility – re-read this line).

But wait a second. What are we trying to solve? Through data security techniques, we want to protect end users’ and end customers’ personal information (name, email, phone, national ID), and protect other confidential and sensitive information such as revenue, salary, internal data, patient health information, trip routes, personal chat messages, etc.

Removing or encrypting such attributes is not a solution as it would remove data’s analytic utility. Giving all this information without any control is also not a solution since it would lead to data privacy issues. What should we do then? Ideally, we would want to be in the middle of this curve. The privacy and risk should be at an acceptable level while preserving analytic utility.

[Figure: privacy curve]

Are there any methods that can help maintain an appropriate balance between privacy protection and analytic utility? This is what we will learn in this series of blog posts.

Here’s how we will structure the next set of posts:

Types of identifiers (Direct identifiers and Quasi Identifiers)
Methods and Techniques to protect these identifiers
1. Suppression
2. Generalization
3. Randomization
4. Pseudonymization
Methods to protect:
1. Cross-sectional data
2. Longitudinal data
3. Unstructured data
Other data protection methods
1. Mapping
2. Rounding
3. Top and Bottom coding
4. Data synthesis
Data sharing options
1. Sub-sampling
2. VPN – Protected infrastructure
Conclusion
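To give a flavor of what some of these techniques mean before the detailed posts, here is a small Python sketch that applies suppression, pseudonymization, generalization, top coding, and rounding to one record. The field names, the cap, the band width, and the key are made-up values for illustration; this is a sketch of the general ideas, not the exact methods we use for customers.

```python
import hashlib
import hmac

# Hypothetical key for illustration; keep a real key out of source control.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(value, key=SECRET_KEY):
    """Pseudonymization: replace an identifier with a stable keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def generalize_age(age, band=10):
    """Generalization: report an age band instead of the exact age."""
    low = (age // band) * band
    return "%d-%d" % (low, low + band - 1)


def top_code(value, cap):
    """Top coding: values above the cap are reported as the cap."""
    return min(value, cap)


def round_to(value, base):
    """Rounding: report the value to the nearest multiple of base."""
    return base * round(value / base)


def anonymize(record):
    """Return a reduced record; the name is suppressed (dropped) entirely."""
    return {
        "customer_id": pseudonymize(record["email"]),
        "age_band": generalize_age(record["age"]),
        "salary": round_to(top_code(record["salary"], 200000), 1000),
    }
```

The output keeps enough detail for analysis by age band and salary range, while the direct identifiers (name and email) no longer appear.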

The purpose of this post was to set the stage for how we protect customers’ data and to suggest practical approaches to data anonymization and sharing.

How do you protect this information? What tools and techniques do you use? We would love to know.

Thanks,

References:

https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#f56f49460ba9

Eric Schmidt: Every 2 Days We Create As Much Information As We Did Up To 2003

https://www.marketingprofs.com/charts/2017/32531/the-incredible-amount-of-data-generated-online-every-minute-infographic

http://www.ehealthinformation.ca/media/


Who else wants Heat Stream analysis with Power BI?

Namaste! It’s been a tiring month: working on customer projects, building a product prototype, getting work done with my team, phew! I’m wearing multiple hats. Recently we wrapped up two projects on showing heat streams with Power BI. The projects were challenging, and as you know, customers bring out the best in you. That happened with us as well…

Heat streams can be very useful for analyzing large data sets and spotting patterns, or “heats”, over a period of time.

Some use cases of heat streams could be:

Analyze call center calls by weekday and time of day, with the time of day on the X-axis, weekdays on the Y-axis, and the number of calls as the “heat”
Perform clickstream analysis for website clicks
Analyze patient re-admissions and re-admission types in a hospital over a period of years

Usually, in a heat stream visual, we put the time of the day or date or year on X-axis, a discrete or a continuous value on Y-axis, and fill the visual with a discrete or a continuous value with gradient colors.

The code that we developed used ggplot with geom_raster layers, along with various settings for formatting the axes. Combining this R code with Power BI gave us BI capabilities: the visuals were seamlessly sliced and diced based on the data we selected in Power BI. Below are screenshots of the visuals that we created using R and Power BI.
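Our plotting code was in R, and we are not reproducing it here. As a language-neutral illustration of the data preparation behind the call center use case, here is a short Python sketch that turns raw timestamps into a weekday-by-hour grid of counts, which is the shape of data a raster layer needs. The function name and sample data are made up for this example.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def heat_grid(timestamps):
    """Return a 7 x 24 grid of counts.

    Rows are weekdays (Monday first), columns are hours of the day,
    and each cell is the number of events in that weekday/hour slot.
    """
    counts = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    return [[counts[(day, hour)] for hour in range(24)] for day in range(7)]


# Made-up sample calls for illustration.
calls = [
    datetime(2018, 7, 2, 9, 15),  # a Monday
    datetime(2018, 7, 2, 9, 40),
    datetime(2018, 7, 3, 14, 5),  # a Tuesday
]
grid = heat_grid(calls)
```

Each cell of `grid` then maps to one colored tile, with the count driving the gradient.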

Our customers were wowed by the output they saw from the data. Remember, if data is the new oil, then insights are the new king. And we deliver those insights through interesting and stunning visuals.

Screenshot 1: [image]

Screenshot 2: [image] Heat stream visual 2 using Power BI and R

Note: You need a large amount of data to get this kind of output. We can further improve these visuals by making them interactive. This can be done using the plotly and htmlwidgets libraries combined with Power BI.

The biggest challenge you will face in plotting such visuals is handling a large number of data points on the X-axis. You may have to use breaks or cuts to limit the points.
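The idea behind breaks or cuts is simply to group many X-axis values into a smaller number of bins. Here is a minimal Python sketch, assuming the X-axis is the minute of the day and a bin width of 30 minutes; both the function names and the width are choices made for this example.

```python
def bin_minutes(minute_of_day, width=30):
    """Map a minute of the day (0-1439) to the start minute of its bin."""
    if not 0 <= minute_of_day < 1440:
        raise ValueError("minute_of_day must be between 0 and 1439")
    return (minute_of_day // width) * width


def bin_label(minute_of_day, width=30):
    """Return the bin start as an HH:MM label for the axis."""
    start = bin_minutes(minute_of_day, width)
    return "%02d:%02d" % divmod(start, 60)
```

With 30-minute bins, 1,440 possible minute values collapse into 48 points on the axis, which is far easier to read and to render.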

Have you plotted heat streams in Power BI/R? What were the most challenging aspects of your project?

We would love to know.

Thank you

Note: Next week we will start a series of blog posts on how we secure customers’ data with data anonymization and masking techniques. We use some incredible techniques that give our customers 100% confidence in data security.

Do subscribe to our blog so you don’t miss our proven data masking techniques and other interesting articles.

Note: There is a custom visual for plotting heat streams in Power BI, but it cannot produce heat streams like the ones shown above.