Blog

Resolved – Request is not a valid SAML 2.0 protocol message – when embedding Power BI Reports with federated authentication

Phew! We finally resolved the error "Request is not a valid SAML 2.0 protocol message" when embedding Power BI reports with federated authentication. It took us some time, but thanks go to the wonderful Microsoft support team, who worked with us to debug and isolate the issue.

Our scenario: an enterprise customer with Power BI Premium capacity planning to embed Power BI reports in an internal application using the "App Owns Data" approach. There are scenarios where you would embed for enterprises (also called organizational embedding), and scenarios where you would choose "App Owns Data" over "User Owns Data". More about this in another blog post.

OK, so why does this error occur, and how do we solve it?

Why this error:

When you authenticate using the master account, the request goes to a federation server (in this case, the customer's Identity Provider, or IdP). The IdP validates the credentials and sends back a SAML assertion and a TokenType. The Azure AD .NET libraries check the TokenType and assign a grant type, and this grant type plus the SAML assertion are then sent to Azure AD for validation.

In our particular case, the PingFederate identity server was returning a TokenType that the Azure AD .NET SDK assumed to be SAML 2.0, so it tagged the grant type as "2.0" (urn:ietf:params:oauth:grant-type:saml2-bearer). But the assertion was not SAML 2.0; it was actually SAML 1.1.

Hence the error – Request is not a valid SAML 2.0 protocol message.

How to solve this error?

There are two ways to solve this error:

  1. Create a cloud-only account on the customer's tenant that is not federated (the simple solution), for example: abc@tenantname.onmicrosoft.com
  2. Construct the SAML requests manually, send them to your IdP, correct the TokenType in code, and submit the request to Azure AD yourself. You will have to bypass the Azure AD libraries and build your own token requests. (the complex solution)
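For solution 2, the key point is that the grant type sent to Azure AD must match the SAML assertion version. As a rough illustration, here is a minimal Python sketch (our actual stack used the Azure AD .NET libraries; the client id, resource value, and assertion are placeholders, and the exact encoding your endpoint expects should be verified) of building the token request body yourself:

```python
import base64

# Map the SAML assertion version to the OAuth grant type Azure AD expects.
# This is the mismatch behind the error: a SAML 1.1 assertion sent with the
# saml2-bearer grant type is rejected as "not a valid SAML 2.0 protocol message".
GRANT_TYPES = {
    "1.1": "urn:ietf:params:oauth:grant-type:saml1_1-bearer",
    "2.0": "urn:ietf:params:oauth:grant-type:saml2-bearer",
}

def build_token_request(assertion_xml: str, saml_version: str,
                        client_id: str, resource: str) -> dict:
    """Build the form body for a POST to the Azure AD token endpoint."""
    return {
        "grant_type": GRANT_TYPES[saml_version],
        # The assertion is base64-encoded before being sent.
        "assertion": base64.b64encode(assertion_xml.encode("utf-8")).decode("ascii"),
        "client_id": client_id,
        "resource": resource,
        "scope": "openid",
    }

body = build_token_request("<saml:Assertion>...</saml:Assertion>", "1.1",
                           client_id="00000000-0000-0000-0000-000000000000",
                           resource="https://analysis.windows.net/powerbi/api")
print(body["grant_type"])
```

Because you choose the grant type yourself here, a SAML 1.1 assertion from PingFederate goes out with the saml1_1-bearer grant type instead of the one the SDK guessed.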

We went ahead with solution 1, used this cloud account as our master account, and were able to successfully embed the reports in the enterprise's internal applications.

You will not face this issue if your IdP is ADFS.

Hope this helps,

Until then,

Ranbeer Makin

Now you can share your data without worrying about data security

Namaste,

We all know that data is the new oil (and insights is the new king). With data being generated from innumerable sources (Facebook, Twitter, YouTube, Snapchat, Uber, web traffic, Google searches), data security should not be limited to "contractual terms".

Here are a few facts about data:

“We create as much information in two days now as we did from the dawn of man through 2003.” – Eric Schmidt

90% of the world’s data was generated over the last two years.

Every second, 40,000 searches are performed on Google.

Every minute, 4.1 million YouTube videos are watched.

I must say, data never sleeps!

With data being the center point of everything, it's a must to secure private and confidential information. We are not trying to solve the world's data security problem. Instead, through this series of blog posts, we will show techniques to anonymize and secure our customers' data (while preserving analytic utility – re-read that line).

But wait a second. What are we trying to solve? Through data security techniques, we want to protect end users' and end customers' personal information (name, email, phone, national ID), and to protect other confidential and sensitive information such as revenue, salary, internal data, patient health information, trip routes, and personal chat messages.

Removing or encrypting such attributes is not a solution, as it would destroy the data's analytic utility. Sharing all this information without any control is not a solution either, since it would lead to data privacy issues. What should we do, then? Ideally, we want to be in the middle of this curve: privacy and risk at an acceptable level while analytic utility is preserved.

[Image: the privacy vs. analytic utility curve]

Are there any methods that can help maintain an appropriate balance between privacy protection and data analytic utility? This is what we would learn in this series of blog posts.

Here’s how we would structure next set of posts:

  1. Types of identifiers (Direct identifiers and Quasi Identifiers)
  2. Methods and Techniques to protect these identifiers
    1. Suppression
    2. Generalization
    3. Randomization
    4. Pseudonymization
  3. Methods to protect:
    1. Cross-sectional data
    2. Longitudinal data
    3. Unstructured data
  4. Other data protection methods
    1. Mapping
    2. Rounding
    3. Top and Bottom coding
    4. Data synthesis
  5. Data sharing options
    1. Sub-sampling
    2. VPN – Protected infrastructure
  6. Conclusion
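As a small preview of the techniques in the outline above, here is a hedged Python sketch (the record, salt, and field choices are invented for illustration) of suppression, generalization, and pseudonymization applied to a single toy record:

```python
import hashlib

# A toy record with a direct identifier (name) and quasi-identifiers (age, zip).
record = {"name": "Asha Rao", "age": 34, "zip": "560103", "salary": 92000}

def suppress(rec, fields):
    """Suppression: drop direct identifiers entirely."""
    return {k: v for k, v in rec.items() if k not in fields}

def generalize(rec):
    """Generalization: replace precise values with coarser ranges."""
    out = dict(rec)
    decade = (rec["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"       # 34 -> "30-39"
    out["zip"] = rec["zip"][:3] + "**"          # keep only the zip prefix
    return out

def pseudonymize(rec, field, salt="demo-salt"):
    """Pseudonymization: replace an identifier with a stable salted token,
    so the same person maps to the same token across records."""
    out = dict(rec)
    out[field] = hashlib.sha256((salt + rec[field]).encode()).hexdigest()[:12]
    return out

anon = generalize(pseudonymize(record, "name"))
```

Note that `anon` still supports aggregate analysis (age bands, zip prefixes, salary) even though the person is no longer directly identifiable – that is the utility-preserving middle of the curve.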

The purpose of this post was to set the premise for how we protect customers' data and to suggest practical approaches to data anonymization and sharing.

How do you protect this information? What tools and techniques do you use? We would love to know.

Thanks,

R

References:

https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#f56f49460ba9

https://techcrunch.com/2010/08/04/schmidt-data/

https://www.marketingprofs.com/charts/2017/32531/the-incredible-amount-of-data-generated-online-every-minute-infographic

http://www.ehealthinformation.ca/media/e-learning-courses-data-privacy-anonymization/e-learning-course-on-anonymizing-data/


Who else wants Heat Stream analysis with Power BI?

Namaste! It’s been a tiring month – working on customer projects, building a product prototype, getting work done by my team, phew! – I’m donning multiple hats. Recently we wrapped up two projects on showing heat streams with Power BI. The projects were challenging, and you know customers will take out the best from you. And, it happened with us as well…

Heat streams can be very useful for analyzing large data sets and spotting patterns, or "heats", over a period of time.

Some use cases of heat streams could be:

  1. Analyze call center calls by weekday and time of day: time of day on the X-axis, weekday on the Y-axis, and the number of calls as the "heats"
  2. Perform clickstream analysis for website clicks
  3. Analyze patient re-admissions and re-admission types in a hospital over a period of years

Usually, in a heat stream visual, we put the time of day, date, or year on the X-axis, put a discrete or continuous value on the Y-axis, and fill the visual with a discrete or continuous value using gradient colors.

The code we developed used ggplot with geom_raster layers, along with various settings for formatting the axes. This R code, combined with Power BI, gave us BI capabilities: the visuals were seamlessly sliced and diced based on the data we selected in Power BI. I'm attaching screenshots of the visuals we created using R and Power BI.

Our customers were wowed by the output they saw from the data. Remember, if data is the new oil, then insights is the new king. And we deliver this through interesting and stunning visuals.

Screenshot 1:

HeatStreamVisual1

Screenshot 2:

HeatStreamVisual2
Heat stream Visual 2 using Power BI and R

Note: You need a large amount of data to produce this kind of output. We can further improve these visuals to be interactive; this can easily be done using the plotly and htmlwidgets libraries combined with Power BI.

The biggest challenge you will face in plotting such visuals is handling a large number of data points on the X-axis. You may have to use breaks or cuts to limit the points.
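To illustrate the breaks/cuts idea, here is a hedged Python sketch (our actual implementation was R with ggplot and geom_raster; the sample call log and bin size here are invented) that buckets timestamps into the (weekday, time-of-day) cells a heat stream visual plots:

```python
from collections import Counter
from datetime import datetime

# Toy call-center log: one timestamp per call. In practice this would
# come from your data source in Power BI.
calls = [
    datetime(2018, 6, 4, 9, 15),   # Monday morning
    datetime(2018, 6, 4, 9, 40),
    datetime(2018, 6, 5, 14, 5),   # Tuesday afternoon
    datetime(2018, 6, 5, 14, 55),
    datetime(2018, 6, 5, 15, 10),
]

def heat_matrix(timestamps, bin_hours=2):
    """Bucket timestamps into (weekday, time-of-day bin) cells.

    bin_hours plays the role of ggplot's breaks/cuts: it limits the
    number of points on the X-axis by grouping hours into wider bins.
    """
    cells = Counter()
    for ts in timestamps:
        bin_start = (ts.hour // bin_hours) * bin_hours
        cells[(ts.strftime("%A"), f"{bin_start:02d}:00")] += 1
    return cells

heats = heat_matrix(calls)
print(heats[("Monday", "08:00")])   # two calls land in the Monday 08:00-10:00 bin
```

Widening `bin_hours` is the quickest lever when the X-axis gets too dense; the plotting layer then only has to color one cell per bin.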

Have you plotted heat streams in Power BI/R? What were the most challenging aspects of your project?

We would love to know.

Thank you

R

Note: Next week we will be starting a series of blog posts on how we secure customers' data with data anonymization and masking techniques. There are some incredible techniques that we use which give our customers full confidence in data security.

Do subscribe to our blog posts to not miss our proven data masking techniques and other interesting articles.

Note: There is a custom visual for plotting heat streams in Power BI, but it cannot generate heat streams like the ones we have shown above.

Let me show you the secrets of setting up Power BI Premium for embedding scenarios

Hola!

One of our enterprise customers approached us about embedding their Power BI reports, dashboards, and Q&A in their application. They had purchased a Power BI Premium SKU and wanted to use Power BI embedding capabilities. In this blog post, we explain how we helped our customer set up a workspace with Premium capacity and use it for embedding reports.

A quick recap of what you need to embed your reports in Power BI:

  1. A master account (essentially a user) with a Power BI Pro license in the Azure AD tenant
  2. An application in Azure AD with permissions set up (more on this in the next blog)
  3. A workspace (or group) to which the reports used in embedding are published
  4. The user created in #1 set as admin of this newly created workspace

How do we assign premium capacity to this newly created workspace?

1. Go to the "Settings" icon on PowerBI.com and select "Admin portal"

AdminPortal


2. Inside the Admin portal, select "Premium settings"

PremiumSettings

3. On premium settings screen, select the capacity that you want to use

4. Click on “Assign Workspaces” in the capacity you have selected

AssignWorkspaces

5. You will be presented with a screen; add the user that you created initially (the master user, remember?)

AssignWorkspaces2

6. After this, go to the new workspace, edit it, and ensure that "Premium" is ON in the advanced settings. You need workspace assignment permissions to enable it.

PremiumOff

7. When turning it ON, select the appropriate Premium capacity to assign to this new workspace

PremiumOn

Hit Save, and you are done!

Now this workspace has Premium capacity turned ON. How do you verify it?

Go to the capacity in Premium settings and check whether this workspace is listed.

workspacelisted

You are now ready to embed your reports: get a token, write some JavaScript and back-end code, and you are done!
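For the token step, here is a minimal Python sketch (the workspace and report ids are placeholders, and this only builds the request rather than sending it) of the Power BI GenerateToken REST call your back end would make:

```python
import json

# Hypothetical ids -- substitute your own workspace (group) and report ids.
WORKSPACE_ID = "11111111-1111-1111-1111-111111111111"
REPORT_ID = "22222222-2222-2222-2222-222222222222"

def generate_token_request(workspace_id, report_id, aad_access_token):
    """Build the request for the Power BI GenerateToken REST call.

    The Azure AD access token comes from authenticating as the master
    account; the embed token returned by this call is what the front-end
    JavaScript uses to render the report.
    """
    return {
        "url": (f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}"
                f"/reports/{report_id}/GenerateToken"),
        "headers": {
            "Authorization": f"Bearer {aad_access_token}",
            "Content-Type": "application/json",
        },
        # "View" embeds the report read-only.
        "body": json.dumps({"accessLevel": "View"}),
    }

req = generate_token_request(WORKSPACE_ID, REPORT_ID, "<aad-access-token>")
print(req["url"])
```

Your back end would POST this request (with any HTTP client) and hand the returned embed token to the Power BI JavaScript SDK in the browser.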

Do you have questions? Let us know.

Contact us if you want to embed Power BI reports, dashboards, or Q&A. We have helped enterprises and small to medium-sized businesses develop and embed Power BI reports from varied sources of data, with data sourcing, modeling, and compelling visualizations and analytics.

Or, head to our premium showcase section to see some of our work live in action.


Reference: https://www.youtube.com/watch?v=0Cy1V6LYjng

The biggest challenge in applying Machine Learning is….

Machine Learning and AI applications are everywhere. Recently Andrew Ng launched drive.ai, a self-driving car venture. He aptly said, "The future is here." But I must say this future arrived pretty quickly!

After talking to our customers over a period of time, we found that there are many challenges to address before applying machine learning or AI algorithms. Many of our customers want to see results quickly; we take them through methodical steps to avoid surprises.

Here are a few challenges in applying machine learning to solve business problems.

1. Do we know what problems to solve? This is the first question you should ask, and the biggest one. Customers want to train deep learning models on the cloud, but when my team asks deeper questions about what they want to solve, they do not have answers – or, even if they do, the answers are not very clear.

This is where our team's expertise comes in. We ask our customers questions to help them understand what problems they want to solve. The questions range from their business objectives to the current problem, the results they expect, their vision, and so on.

2. What data do we have? Machine learning and AI algorithms rely on data. To predict the future, you need to know past behavior; to know past behavior, you need historical data. In most cases data is available, but the questions are: is the data relevant? Is it clean? If you want to predict a customer's next purchase, you need that customer's historical transactions and demographic details. Another question: is having data enough?

3. Do you have labeled data? To apply ML classification techniques, you need labeled data. For example, we were working with one of our customers to automatically generate marketing headlines using deep learning models; for this problem, we needed a lot of marketing articles with "good" and "bad" headlines so the ML engine knows what is good and what is bad.

Similarly, in the classic problem of tweet sentiment analysis, you need to label tweets as positive or negative before an ML engine can predict a new tweet's sentiment.
Who will tag the headlines as "good" or "bad"? Who will label a tweet's sentiment as positive or negative?
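To make the labeling point concrete, here is a toy Python sketch (the tweets and the keyword approach are invented for illustration; this is a naive baseline, not a production classifier) showing how human-assigned labels drive even the simplest sentiment model:

```python
# A tiny labeled dataset: each tweet carries a human-assigned sentiment.
# Producing these labels is exactly the manual effort discussed above.
labeled_tweets = [
    ("loved the new update it works great", "positive"),
    ("this release is awful and slow", "negative"),
    ("great support very happy", "positive"),
    ("terrible experience never again", "negative"),
]

def train_keyword_model(data):
    """Learn which words appear under each label, keeping only the
    words unique to one label as sentiment cues."""
    words = {"positive": set(), "negative": set()}
    for text, label in data:
        words[label].update(text.split())
    pos_only = words["positive"] - words["negative"]
    neg_only = words["negative"] - words["positive"]
    return pos_only, neg_only

def predict(text, pos_only, neg_only):
    """Score a new tweet by counting label-specific words."""
    tokens = set(text.split())
    score = len(tokens & pos_only) - len(tokens & neg_only)
    return "positive" if score >= 0 else "negative"

pos_only, neg_only = train_keyword_model(labeled_tweets)
print(predict("awful and slow app", pos_only, neg_only))   # negative
```

Without the human-provided labels there is nothing to learn from; the same dependency holds for the much larger labeled corpora that real classifiers and deep learning models need.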

4. Do you have trained people? This used to be the biggest challenge, but not anymore. There are many online courses where one can learn both the basics and advanced material. One has to push one's limits and keep learning. I train my team through these courses; some are free, while others cost as little as $10.

The challenges remain the same from customer to customer; they just take different shapes and sizes. We take our customers through methodical steps to solve their business problems: we make hypotheses, test them iteratively, present findings and outcomes, and proceed to the next milestone.

It’s always good to take baby steps in Machine Learning and AI Problems.

Please get in touch with us for your data analytics and data science needs.