In this article, we are going to learn about working with Python in Power BI. Power BI has been adding a lot of features to it recently and one of the most essential was adding Python functionality to it. Users are now able to run Python scripts directly from within Power BI. Python is a great programming language, especially when it comes to the field of data analysis. Most of the machine learning frameworks and data science libraries are written in Python and there is a huge community of open-source programmers that maintain and promote the use of Python along with these libraries.
In such a world, where more and more companies and users are focussing on open-source technologies, the addition of scripting in Python within Power BI takes it to another level where you can create data analysis scripts within the Power BI. You can also create datasets from scratch using Python in Power BI. Additionally, Python has a rich set of data visualization libraries like Seaborn and Matplotlib that allows a rich set of visuals that are not currently supported by Power BI. You can also use the Pandas and NumPy libraries to perform some data analysis or scientific calculations using Python in Power BI.
Pre-requisites to work with Python in Power BI
In order to work with Python in Power BI, we need to have the following pre-requisites.
- Python runtime installed – The execution runtime on which the Python scripts will be executed
- Install the required libraries – We need to install Pandas and Seaborn to work with
- Visual Studio Code (optional) – A code editor to efficiently write Python scripts. This is optional as you can also write Python scripts in the Power BI Script editor. However, personally, I recommend using a code editor as it supports syntax highlighting and other essential features
- Updating Power BI settings – Additionally, we need to update the settings in the Power BI to be able to work with Python in Power BI
Downloading and Installing Python
In order to work with Python in Power BI, we need to get the runtime of the latest version of Python installed. You can get the recommended installer from the official website. Once downloaded, you can get it installed on your machine by simply following the installation guide. After successful installation, you can check the version of Python by running the following command from the command prompt:
1 |
Python --version |
Figure 1 – Checking the Python version after installation
Installing the required libraries
In order to work with Python, we need to install the libraries. Power BI only supports Pandas data frames at the moment and hence we need to get it installed. You can use the following command to install pandas on your machine:
1 |
pip install pandas |
Figure 2 – Installing Pandas library in Python to work in Power BI
Since I have already installed the library earlier, it says “Requirement already satisfied“.
Similarly, you can also install Seaborn using the following command:
1 |
pip install seaborn |
Figure 3 – Installing Seaborn library in Python to work in Power BI
Downloading and Installing Visual Studio Code
Although this is an optional step, you can download Visual Studio Code from the official website. Simply download the executable and run the installer. Once done, you can open Visual Studio Code from the Start Menu.
Figure 4 – Installed Visual Studio Code to efficiently write Python scripts
Update settings in Power BI Desktop
The final step as a pre-requisite is to update the setting in the Power BI Desktop. This will enable scripting with Python in Power BI. Open Power BI Desktop and click on File. Select Options and settings and then Options. This will open up a new Options dialog box in Power BI.
Figure 5 – Setting up Python Scripting in Power BI
In the Options dialog box, click on Python scripting and then select the path for both home directories and the Python IDE. For the home directory, you need to provide the path where Python has been installed to. For the Python IDE, you need to provide the path where Visual Studio Code has been installed to. Click on OK once completed.
Creating and running Python Scripts
Now that we have completed the pre-requisites, we can now start writing Python scripts and use them in Power BI. Let us first create a simple script and ensure it is working before importing it into Power BI. For the purpose of this tutorial, I am going to create a simple CSV file with some data and import it as a data frame in Python. You can use the following script to load the data into a Pandas data frame:
https://gist.github.com/aveek22/f6364841c091a22934ab63b966bbf951
The sample dataset that I am using for this tutorial can be found here:
https://gist.github.com/aveek22/bc5adc256a032c271d10a6f679a3b532
Once you execute the Python script, the following output will be obtained on the console.
Figure 6 – Python Script Console Output for Pandas Data frame
As you can see in the above figure, the Pandas data frame has been printed on the console. This means the script works fine and we are good to import it in Power BI.
Open Power BI Desktop and click on Get Data. Select Other and then select Python Script from the list of data sources. Click on Connect.
Figure 7 – Get Data using Python Script in Power BI
A new dialog box for writing the Python script will open. You can copy the script from your code editor to the script box in Power BI. Click OK once done.
Figure 8 – Writing the Python script
Once you click OK, Power BI is going to initiate an instance of the Python and run the script provided to it. The script will read the CSV file from the location and then prepare the Pandas data frame which is then loaded into Power BI.
Figure 9 – Python Data Frame being imported as a table in Power BI
As soon as you load the data into Power BI, the data model is created, and the fields are loaded in the data section. You can start creating your visualizations with this data now.
Figure 10 – Creating Power BI Visuals with Python Data Frame
Using this method, you can create complex data frames in Python and then import them into Power BI. Additionally, you can also perform data manipulation using the Pandas library and then once the final dataset is created, it can then be imported into Power BI to create the visuals from it. This will make using Power BI easier for a lot of Data Scientists and Data Analysts who are already experts in manipulating data using Python.
The data that we have just imported using Python can be normally refreshed by hitting the Refresh button on the ribbon. This will execute the Python script once again and read the data from the CSV file and refresh the data model.
Conclusion
In this article, we have learned how to use Python in Power BI. Python is one of the most popular open-source programming languages that are being used heavily in the field of data science and machine learning. Using Python in Power BI, you can extend the normal capabilities of the tool and enhance the way Power BI works. You can perform external data analysis using frameworks like Pandas, run external Python scripts from Power BI, create and manipulate datasets, create visualizations using Python libraries, and a lot more. To learn more about using Python in Power BI, I would recommend studying the official documentation from Microsoft.
- Getting started with PostgreSQL on Docker - August 12, 2022
- Getting started with Spatial Data in PostgreSQL - January 13, 2022
- An overview of Power BI Incremental Refresh - December 6, 2021