Azure Machine Learning#
The Azure Machine Learning (Azure ML) service provides the cloud-based computing capabilities and the applications required to perform standard and advanced machine learning activities on data from other cloud data sources, such as data lakes and databases.
The Azure ML Studio provides a dedicated environment to model, train, and deploy ML models. Additionally, the service includes the (optional) provisioning of compute resources that are specialized for ML workloads (CUDA- and OpenCL-enabled GPUs). Finally, the service interoperates with popular open-source machine learning frameworks such as PyTorch, TensorFlow, and scikit-learn.
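As an illustration of that interoperability, a training script written against plain scikit-learn needs nothing Azure-specific and runs unchanged on an Azure ML compute target. A minimal sketch (the dataset here is synthetic and chosen only for the example):

```python
# train.py -- a framework-agnostic training script; nothing in it is
# Azure-specific, so it runs locally or on an Azure ML compute target.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic dataset: one feature, two well-separated classes.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)
predictions = model.predict([[0.5], [2.5]])
print(predictions)
```

The same file can be kept under version control and later pointed at by an Azure ML job definition, which is what makes the framework interoperability useful in practice.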
Interfaces#
The interface through which the Azure ML service is provided is called the Azure ML Studio. This environment is separate from the Azure Portal and can be accessed at https://ml.azure.com from your favorite web browser.
From within the Studio, there are multiple ways to create, train and deploy your ML models:
- Notebooks with Python and R support
- AutoML, a feature that automatically pre-processes your data and trains a range of popular machine learning models on it
- Designer, a feature to graphically create ML pipelines whose steps can be executed in parallel
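Besides these Studio interfaces, jobs can also be defined declaratively and submitted from outside the Studio. As a sketch, assuming the Azure ML v2 command-job YAML schema, a job that runs a training script on a compute cluster might look like the following (the script name, environment reference, and cluster name are placeholders):

```yaml
# job.yml -- hypothetical command-job definition (Azure ML v2 schema).
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: python train.py   # assumed entry-point script
code: ./src                # local folder uploaded with the job
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest  # curated environment (assumption)
compute: azureml:cpu-cluster   # name of your compute cluster (placeholder)
```

Such a file could then be submitted with `az ml job create --file job.yml` using the Azure CLI's ml extension.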
Hardware configuration#
For the underlying compute hardware of your Azure ML environment, you can choose from all compute instance types that Azure offers. For an up-to-date list of available instance types, please visit the Microsoft documentation.
A distinction should be made between compute instances, which are used to host and process the notebooks, and compute clusters, which are used to execute submitted training jobs. Compute instances can be started and stopped by the user whenever the Studio's notebooks need to be used. Compute clusters remain in a stopped state until a training job is submitted to them. After a job is submitted, the cluster automatically starts virtual machines to process it; once the job has finished, the cluster automatically scales back down to a stopped state.
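The scale-to-zero behaviour described above is configured on the cluster itself. As a sketch, assuming the Azure ML v2 compute YAML schema, a cluster definition might look like this (the instance size and the scaling limits are placeholder values):

```yaml
# compute.yml -- hypothetical compute-cluster definition (Azure ML v2 schema).
$schema: https://azuremlschemas.azureedge.net/latest/amlCompute.schema.json
name: cpu-cluster
type: amlcompute
size: Standard_DS3_v2            # any available Azure instance type (placeholder)
min_instances: 0                 # allows the cluster to scale down to a stopped state
max_instances: 4                 # upper bound for parallel training jobs
idle_time_before_scale_down: 120 # seconds of idle time before nodes are released
```

Setting `min_instances: 0` is what makes the cluster cost nothing while no jobs are queued, at the price of a cold-start delay when the next job arrives.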