Introduction
In this series, we will see how Azure ML Studio can be used to create machine learning models and how to consume them. During the data mining series, we identified the challenges of making predictions from data. On the Azure Machine Learning platform, machine learning workflows can be defined as easily scalable models in the cloud. Today we will look at how datasets can be uploaded.
MLOps
DevOps is a very familiar practice among IT practitioners nowadays: it brings the development and operations teams together to work toward the success of the project. Similarly, machine learning projects involve different teams, such as data scientists, data engineers, and developers, who build models and consume them. Azure Machine Learning can use the MLOps model, the machine learning equivalent of DevOps, to build high-quality, scalable machine learning models.
Further, the machine learning development life cycle involves multiple tasks, such as:
- Pre-processing data
- Preparing data
- Developing candidate ML models
- Evaluating candidate ML models
- Choosing an ML model
- Deploying the selected Machine learning model
- Consuming the ML model
Azure Machine Learning supports different users at the different tasks in the machine learning development life cycle.
Azure Machine Learning
To facilitate all the above tasks, you can use Azure Machine Learning Studio, which is a browser-based workbench for machine learning. You can create a free account to try out many features of the Azure Machine Learning platform, although it has some limitations. For example, the maximum storage in the free account is 10 GB, whereas there is no limit in the paid account. In addition, the free account executes on a single node, while the paid account runs on multiple nodes. On the other hand, the free account does not require an Azure subscription.
These limits change from time to time, so check the details at the following URL:
https://azure.microsoft.com/en-us/pricing/details/machine-learning-studio/
After an account is created, you can log in to https://studio.azureml.net/, which provides many videos and important documentation for a novice user.
If you are creating a machine learning resource from the Azure Portal, choose the machine learning resource in the AI + Machine Learning category as shown in the below image:
Let us provide the basic details for the machine learning resource as shown in the below image:
In the above screen, you need to provide the Azure subscription and resource group. The region determines the data center that will execute your machine learning projects. Unlike most other Azure services, Azure Machine Learning is not available in all data centers around the world.
Next is to create tags for the Azure Machine Learning service as shown in the below screenshot:
The above tags will be used for billing and costing purposes.
Let us use https://studio.azureml.net for demo purposes.
Azure ML Studio
Azure Machine Learning Studio has multiple options as shown in the below figure:
A Project is a collection of multiple assets such as datasets, experiments, etc. A project needs a name and a description as shown in the below screenshot:
Previously created data sets, experiments, etc. can be added to the project.
Experiments are the models that will be created by data scientists to predict and model the data. When a new Experiment is created, you can choose from the existing sample experiment templates or if you want to start from the beginning, you can use the Blank Experiment:
When the Blank Experiment is selected, the following screen can be seen in the Azure ML studio:
You can drag and drop the experiment items, which will be discussed in detail in upcoming articles in this series.
Web Services can be created so that the models can be consumed by different users in different applications. These web services can also be consumed from Microsoft Excel, which we will look at later in this article series.
Data Sets are the data that you will be working on. To let you try out the options in Azure Machine Learning Studio, there are a lot of real-world data samples as shown in the below screen:
We will be using some of these data samples in future articles to demonstrate different machine learning techniques. You can download the data set or you can add a dataset to a project.
If you have a data set, you can upload it using the New option under Datasets as shown in the below screenshot:
For the above dataset import, the iris sample data set from WEKA is used. Apart from ARFF files, CSV, TSV, text, and Zip files can be uploaded as datasets.
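Before uploading, it can be useful to check which attributes and rows an ARFF file contains. The following is only a minimal sketch using the Python standard library: the function name `parse_arff` and the two sample rows are hypothetical, and it assumes a simple dense ARFF file without quoted names, embedded commas, or sparse rows.

```python
def parse_arff(text):
    """Return (attribute_names, data_rows) from simple dense ARFF text."""
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower().startswith("@attribute"):
            attributes.append(line.split()[1])  # second token is the name
        elif line.lower().startswith("@data"):
            in_data = True
    return attributes, rows

# Hypothetical excerpt in the style of the WEKA iris file.
SAMPLE = """\
% iris excerpt
@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

attributes, rows = parse_arff(SAMPLE)
```

Here `attributes` holds the column names and `rows` holds the raw string fields of each data line, which is enough for a quick check of the file layout.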
You can now observe your data sets in the left-hand side list as shown in the below screen:
You can find the basic statistical properties of the data set by dragging and dropping the data set into an experiment:
You can view the statistics of a dataset by selecting the Visualize option as shown in the below screen:
After the Visualize option is selected, the statistical properties of the data can be viewed.
By clicking any column, you can view statistical details such as mean, median, minimum, maximum, standard deviation, unique values, and missing values as shown below:
You can view the data as either bar charts or boxplots by choosing the necessary option as shown in the below screenshot:
In the above diagram, data is distributed into 10 bins, which is the default value. The number of bins can be modified according to your needs. The frequency of values can be converted to a log scale as well.
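The column statistics and the binned view described above can be approximated offline as a sanity check. This is a rough sketch using only the Python standard library, not a description of how the Studio computes its figures. The function names and the sample column are hypothetical, and the sample standard deviation, equal-width bins, and a base-10 log scale are all assumptions.

```python
import math
import statistics

def column_summary(values):
    """Summarise one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
        # Sample standard deviation (n - 1 denominator); an assumption.
        "std_dev": statistics.stdev(present),
        "unique_values": len(set(present)),
        "missing_values": len(values) - len(present),
    }

def histogram(values, bins=10):
    """Count values in `bins` equal-width intervals between min and max."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin, not a new one.
        index = min(int((v - low) / width), bins - 1) if width else 0
        counts[index] += 1
    return counts

def log_scale(counts):
    """Base-10 log of each frequency; empty bins are reported as None."""
    return [math.log10(c) if c > 0 else None for c in counts]

# Hypothetical column: the integers 0..19 plus one missing entry.
column = list(range(20)) + [None]
summary = column_summary(column)
counts = histogram([v for v in column if v is not None])
log_counts = log_scale(counts)
```

Passing a different `bins` value to `histogram` mirrors changing the bin count in the Visualize view.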
Another important feature of Azure ML Studio is the ability to compare values so that you can get an idea of your data set. For example, to find the distribution of the petal width for the different classes of iris flowers, you can choose the class as the comparison attribute, and the following screen will be visible:
By looking at the above screen, data distribution can be easily identified.
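The same per-class comparison can be sketched outside the Studio by grouping one column by the class attribute. The example below is illustrative only: the helper `group_by_class` and the nine sample rows are hypothetical, and the minimum, median, and maximum stand in for the boxplot-style comparison.

```python
from collections import defaultdict
import statistics

def group_by_class(rows, value_key, class_key):
    """Collect the values of one column into a list per class label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[class_key]].append(row[value_key])
    return dict(groups)

# Hypothetical petal-width readings, three per iris class.
rows = [
    {"petalwidth": 0.2, "class": "Iris-setosa"},
    {"petalwidth": 0.4, "class": "Iris-setosa"},
    {"petalwidth": 0.3, "class": "Iris-setosa"},
    {"petalwidth": 1.3, "class": "Iris-versicolor"},
    {"petalwidth": 1.5, "class": "Iris-versicolor"},
    {"petalwidth": 1.4, "class": "Iris-versicolor"},
    {"petalwidth": 2.5, "class": "Iris-virginica"},
    {"petalwidth": 1.8, "class": "Iris-virginica"},
    {"petalwidth": 2.1, "class": "Iris-virginica"},
]

groups = group_by_class(rows, "petalwidth", "class")
# (minimum, median, maximum) per class, similar to what a boxplot conveys.
spread = {
    label: (min(values), statistics.median(values), max(values))
    for label, values in groups.items()
}
```

With these sample rows the three classes occupy clearly separated ranges, which is the kind of pattern the comparison view makes visible.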
In the Trained Models tab, you will find the models that were trained, and in the Settings tab, you will have the options to manage your projects:
You can view the details of the workspace from the Name tab, whereas the Users tab gives you the option of sharing experiments and projects with your peers to support MLOps as we discussed previously. Those users should have a Windows Live account, and when they are added, they will be notified via email.
Data Gateways give you the option of accessing an on-premises database. However, this feature is not available in the “Free” tier, and you need to upgrade to the “Standard” tier to access on-premises data.
Conclusion
Azure ML Studio provides MLOps capabilities for the different users in machine learning projects. Apart from building scalable ML models, users can upload their own data set and observe the statistical properties of the dataset.