Skip to content

Microsoft PowerBI#

User needs: Data visualization

User profiles: Business Users, Data Analysts, Data Scientists, Data Engineers

User assumed knowledge: Excel visualization (for data visualization), data modeling (for combining datasets)

Power BI is a Microsoft product that allows you to create several different visualizations of your datasets. These data sets can come from a variety of different sources, online or offline, in the Azure cloud or not.

Power BI is provided to the users in three versions: Power BI Desktop, Power BI Pro and Power BI Premium. The Power BI Desktop is an application that is installed in the Azure Windows Virtual Desktop, and allows you to connect to data sources and create visualizations in the form of dashboards or reports.

For more information regarding the integrations, please visit: https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-data-sources

This section contains the following topics:

How connections are saved in Power BI#

Tools: Power BI Desktop

In the below sections, a description will be provided on how to set up connections to different data sources. These connections, including your connection credentials are saved in your Power BI environment. When due to some reason, you need to update the connection (e.g. you changed your password), follow these steps:

Step 1: In the upper left corner, press “File”

alt-text

Step 2: Click on “Options and settings”, and then on “Data source settings”

alt-text

Step 3: A popup will appear, where you can select which Data Sources you want to Edit or Delete.

alt-text

Logging in to Power BI Pro on the Power BI Desktop application#

Tools: Power BI Desktop, Power BI Pro

The first time you access the Power BI Desktop a pop-up will show, asking you if you want to Sign in. If you have a Power BI Pro license, you can sign in with your email address on which you have a Power BI license. Please note that this email address may be different from the email address you will use to connect to the data of the cloud resources below.

alt-text

Connecting to a Microsoft SQL Server Database#

Tools: Power BI Desktop

When you have a Microsoft SQL Database in your DSL, you can connect to the database via the Power BI Desktop application from your Azure Windows Virtual Desktop.

Step 1: Open the Power BI desktop application from your Azure Windows Virtual Desktop

alt-text

Step 2: When Power BI Desktop opened, click on Get Data in the Home tab, and click on More…

alt-text

Step 3: Click on Azure SQL database and click on Connect

alt-text

Step 4: A popup will show up, requesting you to provide your Server address and the database, which you want to connect to and click on OK. Provide the Server endpoint address of the Microsoft SQL Server, which was provided to you.

For me, this is: testpproject-dev-db-sql-server.database.windows.net

alt-text

Step 5: Next, Power BI will ask you to authenticate with the database.

Here, there are two options: - If you have an SQL database that is joined with AAD, select Microsoft and click on Sign in. - If your SQL database is not joined with AAD, click on Database and provide your User name and Password. - In this case, the user name will be of the format \@\

Note: SQL databases are do not automatically sync users, which means that you cannot use your EC Data Platform credentials by default to authenticate with the database. The admin of the database should first allow your AAD user account to authenticate to the database, or create a database user for you. Please consult with the person with the administrative database credentials to make a user account in the database. See the section of Microsoft SQL Server for more information.

Step 6: In the next window, you can choose the table(s) from the database that you want to load into Power Bi for visualization. Data Scientists can also choose to transform the data before loading it into Power BI.

alt-text

Step 7: When you see the data sources (tables and columns) in the right column of Power BI, you have successfully added the data source. You can now start selecting the visualization and the columns that you like.

alt-text

On the right, there are also some additional options for data modelling, which can be used by data scientists.

How to use Power BI, is out of the scope of this document. However, you can find a lot of information on the official Power BI documentation: https://docs.microsoft.com/en-us/power-bi/guided-learning/

Connecting to a MySQL Database#

Tools: Power BI Desktop

When you have a MySQL Database in your DSL, you can connect to the database via the Power BI Desktop application from your Azure Windows Virtual Desktop.

Step 1: Open the Power BI desktop application from your Azure Windows Virtual Desktop

alt-text

Step 2: When Power BI Desktop opened, click on Get Data in the Home tab, and click on More…

alt-text

Step 3: Click on MySQL Server and click on Connect

alt-text

Step 4: A popup will show up, requesting you to provide your Server address and the database, which you want to connect to and click on OK.

Please note that the Server address is the IP address of the MySQL database, which was provided to you.

alt-text

Step 5: Next, Power BI will ask you to authenticate with the database. Select “Database” as the option to authenticate and fill in the following information: - User name: \@\ - Password: \

Note: MySQL databases are not synced with Azure Active Directory, which means that you cannot use your EC Data Platform credentials to authenticate with the database. If you are a user, ask the person with the administrative database credentials to make a user account in the database. Please see section MySQL for more information.

alt-text

Step 6: In the next window, you can choose the table(s) from the database that you want to load into Power Bi for visualization. Data Scientists can also choose to transform the data before loading it into Power BI.

alt-text

Step 7: When you see the data sources (tables and columns) in the right column of Power BI, you have successfully added the data source. You can now start selecting the visualization and the columns that you like.

alt-text

On the right, there are also some additional options for data modelling, which can be used by data scientists.

How to use Power BI, is out of the scope of this document. However, you can find a lot of information on the official Power BI documentation: https://docs.microsoft.com/en-us/power-bi/guided-learning/

Connecting to a PostgreSQL#

Tools: Power BI Desktop

When you have a Azure Database for PostgreSQL in your DSL, you can connect to the database via the Power BI Desktop application from your Azure Windows Virtual Desktop.

Step 1: Open the Power BI desktop application from your Azure Windows Virtual Desktop

alt-text

Step 2: When Power BI Desktop opened, click on Get Data in the Home tab, and click on More…

alt-text

Step 3: Click on Azure Database for PostgreSQL and click on Connect

alt-text

Step 4: A popup will show up, requesting you to provide your Server address and the database, which you want to connect to and click on OK.

Provide the Server endpoint address of the Azure Database for PostgreSQL, which was provided to you.

For me, this is: testppproject-dev-db-postgres.postgres.database.azure.com

Additionally, provide the database you want to connect to.

alt-text

Step 5: Next, Power BI will ask you to authenticate with the database. Select “Database” as the option to authenticate and fill in the following information: - User name: \@\ - Password: \

Note: PostgreSQL databases are not synced with Azure Active Directory, which means that you cannot use your EC Data Platform credentials to authenticate with the database. If you are a user, ask the person with the administrative database credentials to make a user account in the database. Please see section PostgreSQL for more information.

alt-text

Step 6: In the next window, you can choose the table(s) from the database that you want to load into Power BI for visualization. Data Scientists can also choose to transform the data before loading it into Power BI.

alt-text

Step 7: When you see the data sources (tables and columns) in the right column of Power BI, you have successfully added the data source. You can now start selecting the visualization and the columns that you like.

alt-text

On the right, there are also some additional options for data modelling, which can be used by data scientists.

How to use Power BI, is out of the scope of this document. However, you can find a lot of information on the official Power BI documentation: https://docs.microsoft.com/en-us/power-bi/guided-learning/

Connecting to an Azure Data Lake Gen2#

Tools: Power BI Desktop

When you have an Azure Data Lake Gen2 in your DSL, you can connect to the data storage via the Power BI Desktop application from your Azure Windows Virtual Desktop.

Step 1: Open the Power BI desktop application from your Azure Windows Virtual Desktop

alt-text

Step 2: When Power BI Desktop opened, click on Get Data in the Home tab, and click on More…

alt-text

Step 3: Click on Azure Data Lake Storage Gen2 and click on Connect.

alt-text

Step 4: The next window requests you for a URL. Here, you can provide the URL of the Azure Data Lake directory, or a file in the Azure Data Lake container or directory. If you choose a directory, you will be able to choose any of the files that are stored in this container or directory, or any sub-directory.

Example: - Azure Data Lake directory: https://uatprojectuatdatalake.dfs.core.windows.net/HDInsight/User-read-write - Azure Data Lake file: https://uatprojectuatdatalake.dfs.core.windows.net/HDInsight/User-read-write/cordis-h2020projectPublications.csv

alt-text

Click on “Sign in” and authenticate with your EC Data Platform credentials.

alt-text

After authentication, the Wizard will indicate that you are currently signed in. You can then proceed to Connect.

alt-text

Step 5: A preview of the different files, that you can access, is visualized.

alt-text

Click on “Load” to load the meta data of these files into Power BI.

Step 6: You will see that a query is added to the “Fields” plane in Power BI

alt-text

Step 7: In the “Fields” part, right-click on the Query that is provided and click on “Edit query”:

alt-text

Step 8: A new screen, the “Power Query Editor” will open up. In here, you can see al the files that are included in the directory that you synced with Power BI. If you want to select multiple files from this list, right-click on the “Binary” content and on “Add as New Query”. If you’re only interested in one table, left-click on the “Binary”.

If you right-clicked and created a new Query, you first have to double-click the file first:

alt-text

Step 9: Power BI will automatically create column types for your data set, based on the first 200 rows. If you additionally want to change the automatic steps that Power BI took, you can adjust these and other settings (the name, the filters, the data types,…) from this Power Query Editor.

alt-text

Step 10: Finally, click on Close and Apply to save the connection to your Azure Data Lake Gen2 file. You can now create visualizations and select the fields you want to include.

alt-text

The further use of Power BI is out of the scope of this document. However, you can find a lot of information on the official Power BI documentation: https://docs.microsoft.com/en-us/power-bi/guided-learning/

Connecting to HDInsight Hive tables#

HDInsight should always use the underlying data lake to store the outputs of transformations via the HDInsight cluster. We would therefore recommend the use of the Data Lake Gen2 connection to Power BI to visualize the outputs.

If you are using the Hive tables, this section will explain how you can sync these tables with your Power BI environment. For more information around Hive tables, please visit: https://mapr.com/products/apache-hive/. In our HDInsight environment, there is one example Hive table available, called the “hivesampletable”. In this section, we will sync this table with our Power BI environment on Azure Windows Virtual Desktop.

Step 1: Download and install the ODBC HDinsight connector: Microsoft® Hive ODBC Driver (The 64-bit version is normally pre-installed on your Azure Windows Virtual Desktop).

Step 2: In the Windows search pane, enter ODBC and click on the ODBC Data Sources (and the bit version you choose when installing).

alt-text

Step 3: In the User DSN, click on Add…

alt-text

Step 4: If the installation from step 1 was successful, you should see that you can choose Microsoft Hive ODBC Driver to create a new data source:

alt-text

Step 5: In the next visualization, fill in the following information: - Data Source Name: Give a name to your data source - Host(s) Enter <HDInsightClusterName>.azurehdinsight.net. - Example: uat2project-uat-hdi-cluster.azurehdinsight.net. - Port: Use 443. - Database: Use default. - Mechanism: Select Windows Azure HDInsight Service - User Name: Enter your EC Data Platform email address - Password: Enter your EC Data Platform password

alt-text

Next, click on Test.

Please note that if MFA is enabled for your account, you are not able to create ODBC connections. Contact your support team to remove the MFA if this is the case. The test result should say something similar to:

alt-text

Next, click on OK , then again on OK to add the data source, and finally OK to close the ODBC connections.

Step 6: Open the Power BI desktop application from your Azure Windows Virtual Desktop

alt-text

Step 7: When Power BI Desktop opened, click on Get Data in the Home tab, and click on More…

alt-text

Step 6: Search for ODBC and select the ODBC connector. Click Connect.

alt-text

Step 7: The name that you gave to the ODBC connection should appear. Click ok OK.

alt-text

Step 8: Next, a window will pop up requesting your User name and Password again. Provide your EC Data Platform email address and password, and click on Connect.

alt-text

Step 9: The Navigator window will pop-up. Select the Hive tables you want to sync to Power BI and click on Load.

alt-text

Step 10: You can now create visualizations and select the fields you want to include.

alt-text

The further use of Power BI is out of the scope of this document. However, you can find a lot of information on the official Power BI documentation: https://docs.microsoft.com/en-us/power-bi/guided-learning/

Publishing and sharing dashboards and reports#

Tools: Power BI Desktop, Power BI Pro

To publish and share reports or dashboards, you need to have a Power BI Pro license.

As an example, a small report was made, based on the “hivesampletable” from the HDInsight Hive table section.

alt-text

Step 1: In the upper-left corner, click on File.

alt-text

Step 2: Click on Publish and then on Publish to Power BI

alt-text

You might be asked to save your file locally, if this is not the case yet. Additionally, make sure you are logged in with a user that has a Power BI license, see “Logging in to Power BI Pro on the Power BI Desktop application”.

Step 3: Select the destination to which you want to publish the report. “My workspace” is the default folder. Click on Select.

alt-text

When all goes well, you should see a similar view, indicating that you successfully published your report or dashboard.

alt-text

Step 4: Access the Power BI Pro web interface via https://powerbi.microsoft.com/en-us/landing/signin/ and log in with the same credentials as with which you were logged in to the Power BI Desktop version when you published the report.

alt-text

Step 5: Inside the Power BI Pro interface, click on Reports and click on the report that you created before.

alt-text

You should be able to view your report now.

alt-text

Step 6: If you want to share your report with other people, click on the Share button on the upper-right corner.

In the Share tab you can specify the email addresses, and optionally a message, to share the report or dashboard with other people. Additionally, you can also just copy the Report link and share this directly with the relevant people.

Note: You can only share reports or dashboards with users that also have a Power BI Pro license

alt-text

Finally, in the Access tab you can see who has access and manage this access further. Additionally, you can create a report on the views that the report has received.

alt-text

Note on refreshing data: Power BI Pro makes it possible to publish visualizations to make them available to be viewed by yourself or others, from a web browser or mobile device. An important remark to make here is that, in order to refresh the data, the data needs to be available from an online connection. This means that visualizations of resources that are only available from the private data platform network, cannot be refreshed (e.g. the MySQL database). Additionally, connections that require an ODBC connection (e.g. the HDInsight tables) can also not be refreshed. As the Azure Data Lake Gen2 is currently publicly accessible by default, reports that use this data as a basis, can be refreshed.