Install the Azure Machine Learning SDK for Python
This package is used by azureml-train-automl-client and azureml-train-automl-runtime.
If you're looking to submit automated ML runs on a remote compute and don't need do any ML locally, we recommend using the thin client, , package that is part of the .
See the additional use-case guidance for more information on installation and working with the full SDK or its thin client, .
Similar to the Python standard, one version backwards and one version forward compatibility is supported, but only for the full package. For example, if a model is trained with SDK version 1.29.0, then you can inference with SDK versions between 1.28.0 and 1.30.0.
Thin client for remote compute:
Azure Machine Learning Python SDK notebooks
a community-driven repository of examples using mlflow for tracking can be found at https://github.com/Azure/azureml-examples
Welcome to the Azure Machine Learning Python SDK notebooks repository!
These notebooks are recommended for use in an Azure Machine Learning Compute Instance, where you can run them without any additional set up.
However, the notebooks can be run in any development environment with the correct packages installed.
Install the Python package:
Install additional packages as needed:
We recommend starting with one of the quickstarts.
This repository is a push-only mirror. Pull requests are ignored.
Code of Conduct
This project has adopted the Microsoft Open Source Code of Conduct. Please see the code of conduct for details.
AzureML provides two basic assets for working with data:
Provides an interface for numerous Azure Machine Learning storage accounts.
Each Azure ML workspace comes with a default datastore:
which can also be accessed directly from the Azure Portal (under the same resource group as your Azure ML Workspace).
Datastores are attached to workspaces and are used to store connection information to Azure storage services so you can refer to them by name and don't need to remember the connection information and secret used to connect to the storage services.
Use this class to perform management operations, including register, list, get, and remove datastores.
A dataset is a reference to data - either in a datastore or behind a public URL.
Datasets provide enhaced capabilities including data lineage (with the notion of versioned datasets).
Each workspace comes with a default datastore.
Connect to, or create, a datastore backed by one of the multiple data-storage options that Azure provides. For example:
- Azure Blob Container
- Azure Data Lake (Gen1 or Gen2)
- Azure File Share
- Azure MySQL
- Azure PostgreSQL
- Azure SQL
- Azure Databricks File System
See the SDK for a comprehensive list of datastore types and authentication options: Datastores (SDK).
Register a new datastore#
To register a store via an account key:
To register a store via a SAS token:
Connect to datastore#
The workspace object has access to its datastores via
Any datastore that is registered to workspace can thus be accessed by name.
Link datastore to Azure Storage Explorer#
The workspace object is a very powerful handle when it comes to managing assets the workspace has access to. For example, we can use the workspace to connect to a datastore in Azure Storage Explorer.
For a datastore that was created using an account key we can use:
For a datastore that was created using a SAS token we can use:
The account_name and account_key can then be used directly in Azure Storage Explorer to connect to the Datastore.
Move data to and from your AzureBlobDatastore object .
Upload to Blob Datastore#
The AzureBlobDatastore provides APIs for data upload:
Alternatively, if you are working with multiple files in different locations you can use
Download from Blob Datastore#
Download the data from the blob container to the local file system.
Via Storage Explorer#
Azure Storage Explorer is free tool to easily manage your Azure cloud storage resources from Windows, macOS, or Linux. Download it from here.
Azure Storage Explorer gives you a (graphical) file exporer, so you can literally drag-and-drop files into and out of your datastores.
See "Link datastore to Azure Storage Explorer" above for more details.
Read from Datastore#
Reference data in a in your code, for example to use in a remote setting.
First, connect to your basic assets: , and .
Create a , either as mount:
or as download:
To mount a datastore the workspace need to have read and write access to the underlying storage. For readonly datastore is the only option.
Consume DataReference in ScriptRunConfig#
Add this DataReference to a ScriptRunConfig as follows.
The command-line argument returns the environment variable . Finally, instructs the run to mount the data to the compute target and to assign the above environment variable appropriately.
Without specifying argument#
Specify a to reference your data without the need for command-line arguments.
From local data#
You could create and register a dataset directly from a folder on your local machine. Note that must point to a folder, not file.
⚠️ Method : This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
From a datastore#
The code snippet below shows how to create a given a relative path on . Note that the path could either point to a folder (e.g. ) or a single file (e.g. ).
From outputs using #
Upload to datastore#
To upload a local directory :
This will upload the entire directory from local to the default datastore associated to your workspace .
Create dataset from files in datastore#
To create a dataset from a directory on a datastore at :
To reference data from a dataset in a ScriptRunConfig you can either mount or download the dataset using:
- : mount dataset to a remote run
- : download the dataset to a remote run
Path on compute Both and accept an (optional) parameter . This defines the path on the compute target where the data is made available.
- If , the data will be downloaded into a temporary directory.
- If starts with a it will be treated as an absolute path. (If you have specified an absolute path, please make sure that the job has permission to write to that directory.)
- Otherwise it will be treated as relative to the working directory
Reference this data in a remote run, for example in mount-mode:
and consumed in :
For more details: ScriptRunConfig
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
|Filename, size||File type||Python version||Upload date||Hashes|
|Filename, size azureml-0.2.7-py2.py3-none-any.whl (23.9 kB)||File type Wheel||Python version 3.5||Upload date||Hashes View|
|Filename, size azureml-0.2.7.zip (24.9 kB)||File type Source||Python version None||Upload date||Hashes View|
Hashes for azureml-0.2.7-py2.py3-none-any.whl
Hashes for azureml-0.2.7.zip
What is the Azure Machine Learning SDK for Python?
Data scientists and AI developers use the Azure Machine Learning SDK for Python to build and run machine learning workflows with the Azure Machine Learning service. You can interact with the service in any Python environment, including Jupyter Notebooks, Visual Studio Code, or your favorite Python IDE.
Key areas of the SDK include:
- Explore, prepare and manage the lifecycle of your datasets used in machine learning experiments.
- Manage cloud resources for monitoring, logging, and organizing your machine learning experiments.
- Train models either locally or by using cloud resources, including GPU-accelerated model training.
- Use automated machine learning, which accepts configuration parameters and training data. It automatically iterates through algorithms and hyperparameter settings to find the best model for running predictions.
- Deploy web services to convert your trained models into RESTful services that can be consumed in any application.
For a step-by-step walkthrough of how to get started, try the tutorial.
The following sections are overviews of some of the most important classes in the SDK, and common design patterns for using them. To get the SDK, see the installation guide.
Stable vs experimental
The Azure Machine Learning SDK for Python provides both stable and experimental features in the same SDK.
|Stable features||Production ready|
These features are recommended for most use cases and production environments. They are updated less frequently then experimental features.
These features are newly developed capabilities & updates that may not be ready or fully tested for production usage. While the features are typically functional, they can include some breaking changes. Experimental features are used to iron out SDK breaking bugs, and will only receive updates for the duration of the testing period.
As the name indicates, the experimental features are for experimenting and is not considered bug free or stable. For this reason, we only recommend experimental features to advanced users who wish to try out early versions of capabilities and updates, and intend to participate in the reporting of bugs and glitches.
Experimental features are labelled by a note section in the SDK reference.
The class is a foundational resource in the cloud that you use to experiment, train, and deploy machine learning models. It ties your Azure subscription and resource group to an easily consumed object.
View all parameters of the create Workspace method to reuse existing instances (Storage, Key Vault, App-Insights, and Azure Container Registry-ACR) as well as modify additional settings such as private endpoint configuration and compute target.
Import the class and create a new workspace by using the following code. Set to if you have a previously existing Azure resource group that you want to use for the workspace. Some functions might prompt for Azure authentication credentials.
Use the same workspace in multiple environments by first writing it to a configuration JSON file. This saves your subscription, resource, and workspace name data.
Load your workspace by reading the configuration file.
Alternatively, use the static method to load an existing workspace without using configuration files.
The variable represents a object in the following code examples.
The class is another foundational cloud resource that represents a collection of trials (individual model runs). The following code fetches an object from within by name, or it creates a new object if the name doesn't exist.
Run the following code to get a list of all objects contained in .
Use the function to retrieve a list of objects (trials) from . The following code retrieves the runs and prints each run ID.
There are two ways to execute an experiment trial. If you're interactively experimenting in a Jupyter notebook, use the function. If you're submitting an experiment from a standard Python environment, use the function. Both functions return a object. The variable represents an object in the following code examples.
A run represents a single trial of an experiment. is the object that you use to monitor the asynchronous execution of a trial, store the output of the trial, analyze results, and access generated artifacts. You use inside your experimentation code to log metrics and artifacts to the Run History service. Functionality includes:
- Storing and retrieving metrics and data.
- Using tags and the child hierarchy for easy lookup of past runs.
- Registering stored model files for deployment.
- Storing, modifying, and retrieving properties of a run.
Create a object by submitting an object with a run configuration object. Use the parameter to attach custom categories and labels to your runs. You can easily find and retrieve them later from .
Use the static function to get a list of all objects from . Specify the parameter to filter by your previously created tag.
Use the function to retrieve the detailed output for the run.
Output for this function is a dictionary that includes:
- Run ID
- Start and end time
- Compute target (local versus cloud)
- Dependencies and versions used in the run
- Training-specific data (differs depending on model type)
For more examples of how to configure and monitor runs, see the how-to.
The class is used for working with cloud representations of machine learning models. Methods help you transfer models between local development environments and the object in the cloud.
You can use model registration to store and version your models in the Azure cloud, in your workspace. Registered models are identified by name and version. Each time you register a model with the same name as an existing one, the registry increments the version. Azure Machine Learning supports any model that can be loaded through Python 3, not just Azure Machine Learning models.
The following example shows how to build a simple local classification model with , register the model in , and download the model from the cloud.
Create a simple classifier, , to predict customer churn based on their age. Then dump the model to a file in the same directory.
Use the function to register the model in your workspace. Specify the local model path and the model name. Registering the same name more than once will create a new version.
Now that the model is registered in your workspace, it's easy to manage, download, and organize your models. To retrieve a model (for example, in another environment) object from , use the class constructor and specify the model name and any optional parameters. Then, use the function to download the model, including the cloud folder structure.
Use the function to remove the model from .
After you have a registered model, deploying it as a web service is a straightforward process. First you create and register an image. This step configures the Python environment and its dependencies, along with a script to define the web service request and response formats. After you create an image, you build a deploy configuration that sets the CPU cores and memory parameters for the compute target. You then attach your image.
ComputeTarget, RunConfiguration, and ScriptRunConfig
The class is the abstract parent class for creating and managing compute targets. A compute target represents a variety of resources where you can train your machine learning models. A compute target can be either a local machine or a cloud resource, such as Azure Machine Learning Compute, Azure HDInsight, or a remote virtual machine.
Use compute targets to take advantage of powerful virtual machines for model training, and set up either persistent compute targets or temporary runtime-invoked targets. For a comprehensive guide on setting up and managing compute targets, see the how-to.
The following code shows a simple example of setting up an (child class of ) target. This target creates a runtime remote compute resource in your object. The resource scales automatically when a job is submitted. It's deleted automatically when the run finishes.
Reuse the simple churn model and build it into its own file, , in the current directory. At the end of the file, create a new directory called . This step creates a directory in the cloud (your workspace) to store your trained model that serialized.
Next you create the compute target by instantiating a object and setting the type and size. This example uses the smallest resource size (1 CPU core, 3.5 GB of memory). The variable contains a list of supported virtual machines and their sizes.
Create dependencies for the remote compute resource's Python environment by using the class. The file is using and , which need to be installed in the environment. You can also specify versions of dependencies. Use the object to set the environment in .
Now you're ready to submit the experiment. Use the class to attach the compute target configuration, and to specify the path/file to the training script . Submit the experiment by specifying the parameter of the function. Call on the resulting run to see asynchronous run output as the environment is initialized and the model is trained.
The following are limitations around specific characters when used in parameters:
- The , , , and characters are escaped by the back end, as they are considered reserved characters for separating bash commands.
- The , , , , , , , , and characters are escaped for local runs on Windows.
After the run finishes, the trained model file is available in your workspace.
Azure Machine Learning environments specify the Python packages, environment variables, and software settings around your training and scoring scripts. In addition to Python, you can also configure PySpark, Docker and R for environments. Internally, environments result in Docker images that are used to run the training and scoring processes on the compute target. The environments are managed and versioned entities within your Machine Learning workspace that enable reproducible, auditable, and portable machine learning workflows across a variety of compute targets and compute types.
You can use an object to:
- Develop your training script.
- Reuse the same environment on Azure Machine Learning Compute for model training at scale.
- Deploy your model with that same environment without being tied to a specific compute type.
The following code imports the class from the SDK and to instantiates an environment object.
Add packages to an environment by using Conda, pip, or private wheel files. Specify each package dependency by using the class to add it to the environment's .
The following example adds to the environment. It adds version 1.17.0 of . It also adds the package to the environment, . The example uses the method and the method, respectively.
To submit a training run, you need to combine your environment, compute target, and your training Python script into a run configuration. This configuration is a wrapper object that's used for submitting runs.
When you submit a training run, the building of a new environment can take several minutes. The duration depends on the size of the required dependencies. The environments are cached by the service. So as long as the environment definition remains unchanged, you incur the full setup time only once.
The following example shows where you would use as your wrapper object.
If you don't specify an environment in your run configuration before you submit the run, then a default environment is created for you.
See the Model deploy section to use environments to deploy a web service.
An Azure Machine Learning pipeline is an automated workflow of a complete machine learning task. Subtasks are encapsulated as a series of steps within the pipeline. An Azure Machine Learning pipeline can be as simple as one step that calls a Python script. Pipelines include functionality for:
- Data preparation including importing, validating and cleaning, munging and transformation, normalization, and staging
- Training configuration including parameterizing arguments, filepaths, and logging / reporting configurations
- Training and validating efficiently and repeatably, which might include specifying specific data subsets, different hardware compute resources, distributed processing, and progress monitoring
- Deployment, including versioning, scaling, provisioning, and access control
- Publishing a pipeline to a REST endpoint to rerun from any HTTP library
A is a basic, built-in step to run a Python Script on a compute target. It takes a script name and other optional parameters like arguments for the script, compute target, inputs and outputs. The following code is a simple example of a . For an example of a script, see the tutorial sub-section.
After at least one step has been created, steps can be linked together and published as a simple automated pipeline.
For a comprehensive example of building a pipeline workflow, follow the advanced tutorial.
Pattern for creating and using pipelines
An Azure Machine Learning pipeline is associated with an Azure Machine Learning workspace and a pipeline step is associated with a compute target available within that workspace. For more information, see this article about workspaces or this explanation of compute targets.
A common pattern for pipeline steps is:
- Specify workspace, compute, and storage
- Configure your input and output data using
- Dataset which makes available an existing Azure datastore
- PipelineDataset which encapsulates typed tabular data
- PipelineData which is used for intermediate file or directory data written by one step and intended to be consumed by another
- Define one or more pipeline steps
- Instantiate a pipeline using your workspace and steps
- Create an experiment to which you submit the pipeline
- Monitor the experiment results
This notebook is a good example of this pattern. job
For more information about Azure Machine Learning Pipelines, and in particular how they are different from other types of pipelines, see this article.
Use the class to configure parameters for automated machine learning training. Automated machine learning iterates over many combinations of machine learning algorithms and hyperparameter settings. It then finds the best-fit model based on your chosen accuracy metric. Configuration allows for specifying:
- Task type (classification, regression, forecasting)
- Number of algorithm iterations and maximum time per iteration
- Accuracy metric to optimize
- Algorithms to blacklist/whitelist
- Number of cross-validations
- Compute targets
- Training data
Use the extra in your installation to use automated machine learning.
For detailed guides and examples of setting up automated machine learning experiments, see the tutorial and how-to.
The following code illustrates building an automated machine learning configuration object for a classification model, and using it when you're submitting an experiment.
Use the object to submit an experiment.
After you submit the experiment, output shows the training accuracy for each iteration as it finishes. After the run is finished, an object (which extends the class) is returned. Get the best-fit model by using the function to return a object.
The class is for configuration settings that describe the environment needed to host the model and web service.
is the abstract parent class for creating and deploying web services for your models. For a detailed guide on preparing for model deployment and deploying web services, see this how-to.
You can use environments when you deploy your model as a web service. Environments enable a reproducible, connected workflow where you can deploy your model using the same libraries in both your training compute and your inference compute. Internally, environments are implemented as Docker images. You can use either images provided by Microsoft, or use your own custom Docker images. If you were previously using the class for your deployment, see the class for accomplishing a similar workflow with environments.
To deploy a web service, combine the environment, inference compute, scoring script, and registered model in your deployment object, .
The following example, assumes you already completed a training run using environment, , and want to deploy that model to Azure Container Instances.
This example creates an Azure Container Instances web service, which is best for small-scale testing and quick deployments. To deploy your model as a production-scale web service, use Azure Kubernetes Service (AKS). For more information, see AksCompute class.
The class is a foundational resource for exploring and managing data within Azure Machine Learning. You can explore your data with summary statistics, and save the Dataset to your AML workspace to get versioning and reproducibility capabilities. Datasets are easily consumed by models during training. For detailed usage examples, see the how-to guide.
- represents data in a tabular format created by parsing a file or list of files.
- references single or multiple files in datastores or from public URLs.
The following example shows how to create a TabularDataset pointing to a single path in a datastore.
The following example shows how to create a referencing multiple file URLs.
Try these next steps to learn how to use the Azure Machine Learning SDK for Python:
Follow the tutorial to learn how to build, train, and deploy a model in Python.
Look up classes and modules in the reference documentation on this site by using the table of contents on the left.
Azure Machine Learning Service — Run a Simple Experiment
Once connected to the Azure Machine Learning workspace, the next step is to define a model experiment script, this is a very general python script which is loaded and executed form the experiment’s ‘workspace’ compute context. This also requires that the data files’ root (folder) has to be at the same location where this script is loaded, as shown below —
Experiment Output Folder — Mostly, Run of an experiment generates some output files, e.g. a saved model, that are saved in the workspace’s outputs folder as shown below —os.makedirs("outputs", exist_ok=True)
# Complete the run
One can also use the Run object’s upload_file method in the experiment file code as shown below, this enables that any files written to the outputs folder in the compute context are automatically uploaded to the run’s outputs folder when the run completed, e.g. —run.upload_file(name='outputs/IRIS.csv', path_or_stream='./sample.csv') # upload from local
RunConfiguration and ScriptRunConfig
After the script for implementing the ML model is ready, the next step is to define object — which defines the Python environment in which the script will run, and object— which associates the run environment with the script. The below code snippet instantiates the Python environment by calling the method and encapsulate the same environment for the script’s execution:from azureml.core import Experiment, RunConfiguration, ScriptRunConfig# create a new RunConfig object
experiment_run_config = RunConfiguration()
experiment_run_config.environment.python.user_managed_dependencies = True# Create a script config
src = ScriptRunConfig(source_directory=experiment_folder,
The object also allows to additionally include the Python packages which are necessary for script execution.
Access the Azure Machine Learning Studio to navigate for the sample experiment run and verify the results in results on the dashboard. Here, the entire details of the experiment’s run history is displayed, as shown below — the details of run stats, history, results, metrics, logs, outputs, errors, diagnostics, etc.. are readily available from the dashboard:
In this part of the series, I tried to cover the most fundamental concept of Azure Machine Learning Service, i.e., to prepare and execute a machine learning experiment and generate a model binary. In the next article, I will cover a slightly more advanced way to set up and control the script’s execution environment, install or update the required dependencies & packages. So please stay tuned!
This is a series of blog posts encompassing a detailed overview of various Azure Machine Learning capabilities, the URLs for other posts are as follows:
Connect with me on LinkedIn to discuss further
 Notebook & Code — Azure Machine Learning — Introduction, Kaggle.
 Azure Machine Learning Service Official Documentation, Microsoft Azure.
- Evaporator coil manufacturer
- 60605 apartments
- 2015 forester manual
- Harley spoke covers
- Witch deity quiz
- Weekly task schedule
- Youtube huapango
- 60 ansi lumens
- Prefix 254
- Loaders caterpillar
Without hesitation or shame, I took off my. T-shirt. The guys were stunned by my agility and showering my breasts with compliments, climbed to paw her. Or rather, only two people did it, except the driver.