

# Autopilot model deployment and prediction
Model deployment and prediction

This Amazon SageMaker Autopilot guide includes steps for model deployment, setting up real-time inference, and running inference with batch jobs. 

After you train your Autopilot models, you can deploy them to get predictions in one of two ways:

1. Use [Deploy models for real-time inference](autopilot-deploy-models-realtime.md) to set up an endpoint and obtain predictions interactively. Real-time inference is ideal for inference workloads where you have real-time, interactive, low latency requirements.

1. Use [Run batch inference jobs](autopilot-deploy-models-batch.md) to make predictions in parallel on batches of observations on an entire dataset. Batch inference is a good option for large datasets or if you don't need an immediate response to a model prediction request.

**Note**  
To avoid incurring unnecessary charges: After the endpoints and resources that were created from model deployment are no longer needed, you can delete them. For information about pricing of instances by Region, see [Amazon SageMaker Pricing](https://aws.amazon.com/sagemaker/pricing/).

# Deploy models for real-time inference


Real-time inference is ideal for inference workloads where you have real-time, interactive, low latency requirements. This section shows how you can use real-time inferencing to obtain predictions interactively from your model.

To deploy the model that produced the best validation metric in an Autopilot experiment, you have several options. For example, when using Autopilot in SageMaker Studio Classic, you can deploy the model automatically or manually. You can also use SageMaker APIs to manually deploy an Autopilot model. 

The following tabs show three options for deploying your model. These instructions assume that you have already created a model in Autopilot. If you don't have a model, see [Create Regression or Classification Jobs for Tabular Data Using the AutoML API](autopilot-automate-model-development-create-experiment.md). To see examples for each option, open each tab.

## Deploy using the Autopilot User Interface (UI)


The Autopilot UI contains helpful dropdown menus, toggles, tooltips, and more to help you navigate through model deployment. You can deploy using either one of the following procedures: Automatic or Manual.
+ **Automatic Deployment**: To automatically deploy the best model from an Autopilot experiment to an endpoint

  1. [Create an experiment](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-create-experiment.html) in SageMaker Studio Classic. 

  1. Toggle the **Auto deploy** value to **Yes**.
**Note**  
**Automatic deployment will fail if either the default resource quota or your customer quota for endpoint instances in a Region is too limited.** In hyperparameter optimization (HPO) mode, you are required to have at least two ml.m5.2xlarge instances. In ensembling mode, you are required to have at least one ml.m5.12xlarge instance. If you encounter a failure related to quotas, you can [request a service limit increase](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html) for SageMaker AI endpoint instances.
+ **Manual Deployment**: To manually deploy the best model from an Autopilot experiment to an endpoint

  1. [Create an experiment](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-create-experiment.html) in SageMaker Studio Classic. 

  1. Toggle the **Auto deploy** value to **No**. 

  1. Select the model that you want to deploy under **Model name**.

  1. Select the orange **Deployment and advanced settings** button located on the right of the leaderboard. This opens a new tab.

  1. Configure the endpoint name, instance type, and other optional information.

  1.  Select the orange **Deploy model** to deploy to an endpoint.

  1. Check the progress of the endpoint creation process in the [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) by navigating to the Endpoints section. That section is located in the **Inference** dropdown menu in the navigation panel. 

  1. After the endpoint status changes from **Creating** to **InService**, as shown below, return to Studio Classic and invoke the endpoint.  
![\[SageMaker AI console: Endpoints page to create an endpoint or check endpoint status.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/autopilot/autopilot-check-progress.PNG)

## Deploy using SageMaker APIs


You can also obtain real-time inference by deploying your model using **API calls**. This section shows the five steps of this process using AWS Command Line Interface (AWS CLI) code snippets. 

For complete code examples for both AWS CLI commands and AWS SDK for Python (boto3), open the tabs directly following these steps.

1. **Obtain candidate definitions**

   Obtain the candidate container definitions from [InferenceContainers](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-InferenceContainers). These candidate definitions are used to create a SageMaker AI model. 

   The following example uses the [DescribeAutoMLJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJob.html) API to obtain candidate definitions for the best model candidate. See the following AWS CLI command as an example.

   ```
   aws sagemaker describe-auto-ml-job --auto-ml-job-name <job-name> --region <region>
   ```

1. **List candidates**

   The following example uses the [ListCandidatesForAutoMLJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ListCandidatesForAutoMLJob.html) API to list all candidates. See the following AWS CLI command as an example.

   ```
   aws sagemaker list-candidates-for-auto-ml-job --auto-ml-job-name <job-name> --region <region>
   ```

1. **Create a SageMaker AI model**

   Use the container definitions from the previous steps to create a SageMaker AI model by using the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API. See the following AWS CLI command as an example.

   ```
   aws sagemaker create-model --model-name '<your-custom-model-name>' \
                       --containers ['<container-definition1>, <container-definition2>, <container-definition3>]' \
                       --execution-role-arn '<execution-role-arn>' --region '<region>
   ```

1. **Create an endpoint configuration** 

   The following example uses the [CreateEndpointConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html) API to create an endpoint configuration. See the following AWS CLI command as an example.

   ```
   aws sagemaker create-endpoint-config --endpoint-config-name '<your-custom-endpoint-config-name>' \
                       --production-variants '<list-of-production-variants>' \
                       --region '<region>'
   ```

1. **Create the endpoint** 

   The following AWS CLI example uses the [CreateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) API to create the endpoint.

   ```
   aws sagemaker create-endpoint --endpoint-name '<your-custom-endpoint-name>' \
                       --endpoint-config-name '<endpoint-config-name-you-just-created>' \
                       --region '<region>'
   ```

   Check the progress of your endpoint deployment by using the [DescribeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeEndpoint.html) API. See the following AWS CLI command as an example.

   ```
   aws sagemaker describe-endpoint —endpoint-name '<endpoint-name>' —region <region>
   ```

   After the `EndpointStatus` changes to `InService`, the endpoint is ready to use for real-time inference.

1. **Invoke the endpoint** 

   The following command structure invokes the endpoint for real-time inferencing.

   ```
   aws sagemaker invoke-endpoint --endpoint-name '<endpoint-name>' \ 
                     --region '<region>' --body '<your-data>' [--content-type] '<content-type>' <outfile>
   ```

The following tabs contain complete code examples for deploying a model with AWS SDK for Python (boto3) or the AWS CLI.

------
#### [ AWS SDK for Python (boto3) ]

1. **Obtain the candidate definitions** by using the following code example.

   ```
   import sagemaker 
   import boto3
   
   session = sagemaker.session.Session()
   
   sagemaker_client = boto3.client('sagemaker', region_name='us-west-2')
   job_name = 'test-auto-ml-job'
   
   describe_response = sm_client.describe_auto_ml_job(AutoMLJobName=job_name)
   # extract the best candidate definition from DescribeAutoMLJob response
   best_candidate = describe_response['BestCandidate']
   # extract the InferenceContainers definition from the caandidate definition
   inference_containers = best_candidate['InferenceContainers']
   ```

1. **Create the model** by using the following the code example.

   ```
   # Create Model
   model_name = 'test-model' 
   sagemaker_role = 'arn:aws:iam:444455556666:role/sagemaker-execution-role'
   create_model_response = sagemaker_client.create_model(
      ModelName = model_name,
      ExecutionRoleArn = sagemaker_role,
      Containers = inference_containers 
   )
   ```

1. **Create the endpoint configuration** by using the following the code example.

   ```
   endpoint_config_name = 'test-endpoint-config'
                                                           
   instance_type = 'ml.m5.2xlarge' 
   # for all supported instance types, see 
   # https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html#sagemaker-Type-ProductionVariant-InstanceType    # Create endpoint config
   
   endpoint_config_response = sagemaker_client.create_endpoint_config(
      EndpointConfigName=endpoint_config_name, 
      ProductionVariants=[
          {
              "VariantName": "variant1",
              "ModelName": model_name, 
              "InstanceType": instance_type,
              "InitialInstanceCount": 1
          }
      ]
   )
   
   print(f"Created EndpointConfig: {endpoint_config_response['EndpointConfigArn']}")
   ```

1. **Create the endpoint** and deploy the model with the following code example.

   ```
   # create endpoint and deploy the model
   endpoint_name = 'test-endpoint'
   create_endpoint_response = sagemaker_client.create_endpoint(
                                               EndpointName=endpoint_name, 
                                               EndpointConfigName=endpoint_config_name)
   print(create_endpoint_response)
   ```

   **Check the status of creating the endpoint** by using the following the code example.

   ```
   # describe endpoint creation status
   status = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
   ```

1. **Invoke the endpoint** for real-time inferencing by using the following command structure.

   ```
   # once endpoint status is InService, you can invoke the endpoint for inferencing
   if status == "InService":
     sm_runtime = boto3.Session().client('sagemaker-runtime')
     inference_result = sm_runtime.invoke_endpoint(EndpointName='test-endpoint', ContentType='text/csv', Body='1,2,3,4,class')
   ```

------
#### [ AWS Command Line Interface (AWS CLI) ]

1. **Obtain the candidate definitions** by using the following code example.

   ```
   aws sagemaker describe-auto-ml-job --auto-ml-job-name 'test-automl-job' --region us-west-2
   ```

1. **Create the model** by using the following code example.

   ```
   aws sagemaker create-model --model-name 'test-sagemaker-model'
   --containers '[{
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3", amzn-s3-demo-bucket1
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/output/model.tar.gz",
       "Environment": {
           "AUTOML_SPARSE_ENCODE_RECORDIO_PROTOBUF": "1",
           "AUTOML_TRANSFORM_MODE": "feature-transform",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "application/x-recordio-protobuf",
           "SAGEMAKER_PROGRAM": "sagemaker_serve",
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:1.3-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/output/model.tar.gz",
       "Environment": {
           "MAX_CONTENT_LENGTH": "20971520",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv",
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,probabilities" 
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3", aws-region
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/output/model.tar.gz", 
       "Environment": { 
           "AUTOML_TRANSFORM_MODE": "inverse-label-transform", 
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv", 
           "SAGEMAKER_INFERENCE_INPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,labels,probabilities", 
           "SAGEMAKER_PROGRAM": "sagemaker_serve", 
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
       } 
   }]' \
   --execution-role-arn 'arn:aws:iam::1234567890:role/sagemaker-execution-role' \ 
   --region 'us-west-2'
   ```

   For additional details, see [creating a model](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/create-model.html).

   The `create model` command will return a response in the following format.

   ```
   {
       "ModelArn": "arn:aws:sagemaker:us-west-2:1234567890:model/test-sagemaker-model"
   }
   ```

1. **Create an endpoint configuration** by using the following code example.

   ```
   aws sagemaker create-endpoint-config --endpoint-config-name 'test-endpoint-config' \
   --production-variants '[{"VariantName": "variant1", 
                           "ModelName": "test-sagemaker-model",
                           "InitialInstanceCount": 1,
                           "InstanceType": "ml.m5.2xlarge"
                          }]' \
   --region us-west-2
   ```

   The `create endpoint` configuration command will return a response in the following format.

   ```
   {
       "EndpointConfigArn": "arn:aws:sagemaker:us-west-2:1234567890:endpoint-config/test-endpoint-config"
   }
   ```

1. **Create an endpoint** by using the following code example.

   ```
   aws sagemaker create-endpoint --endpoint-name 'test-endpoint' \    
   --endpoint-config-name 'test-endpoint-config' \                 
   --region us-west-2
   ```

   The `create endpoint` command will return a response in the following format.

   ```
   {
       "EndpointArn": "arn:aws:sagemaker:us-west-2:1234567890:endpoint/test-endpoint"
   }
   ```

   Check the progress of the endpoint deployment by using the following [describe-endpoint](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/describe-endpoint.html) CLI code example.

   ```
   aws sagemaker describe-endpoint --endpoint-name 'test-endpoint' --region us-west-2
   ```

   The previous progress check will return a response in the following format.

   ```
   {
       "EndpointName": "test-endpoint",
       "EndpointArn": "arn:aws:sagemaker:us-west-2:1234567890:endpoint/test-endpoint",
       "EndpointConfigName": "test-endpoint-config",
       "EndpointStatus": "Creating",
       "CreationTime": 1660251167.595,
       "LastModifiedTime": 1660251167.595
   }
   ```

   After the `EndpointStatus` changes to `InService`, the endpoint is ready for use in real-time inference.

1. **Invoke the endpoint** for real-time inferencing by using the following command structure.

   ```
   aws sagemaker-runtime invoke-endpoint --endpoint-name 'test-endpoint' \
   --region 'us-west-2' \
   --body '1,51,3.5,1.4,0.2' \
   --content-type 'text/csv' \
   '/tmp/inference_output'
   ```

   For more options, see [invoking an endpoint](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker-runtime/invoke-endpoint.html).

------

## Deploy models from different accounts


You can deploy an Autopilot model from a different account than the original account that a model was generated in. To implement cross-account model deployment, this section shows how to do the following:   Grant permission to assume the role to the account you want to deploy from (the generating account).    Make a call to `DescribeAutoMLJob` from the deploying account to obtain model information.    Grant access rights to the model artifacts from the generating account.    

1. **Grant permission to the deploying account** 

   To assume the role in the generating account, you must grant permission to the deploying account. This allows the deploying account to describe Autopilot jobs in the generating account.

   The following example uses a generating account with a trusted `sagemaker-role` entity. The example shows how to give a deploying account with the ID 111122223333 permission to assume the role of the generating account.

   ```
   "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "Service": [
                       "sagemaker.amazonaws.com"
                   ],
                   "AWS": [ "111122223333"]
               },
               "Action": "sts:AssumeRole"
           }
   ```

   The new account with the ID 111122223333 can now assume the role for the generating account. 

   Next, call the `DescribeAutoMLJob` API from the deploying account to obtain a description of the job created by the generating account. 

   The following code example describes the model from the deploying account.

   ```
   import sagemaker 
   import boto3
   session = sagemaker.session.Session()
   
   sts_client = boto3.client('sts')
   sts_client.assume_role
   
   role = 'arn:aws:iam::111122223333:role/sagemaker-role'
   role_session_name = "role-session-name"
   _assumed_role = sts_client.assume_role(RoleArn=role, RoleSessionName=role_session_name)
   
   credentials = _assumed_role["Credentials"]
   access_key = credentials["AccessKeyId"]
   secret_key = credentials["SecretAccessKey"]
   session_token = credentials["SessionToken"]
   
   session = boto3.session.Session()
           
   sm_client = session.client('sagemaker', region_name='us-west-2', 
                              aws_access_key_id=access_key,
                               aws_secret_access_key=secret_key,
                               aws_session_token=session_token)
   
   # now you can call describe automl job created in account A 
   
   job_name = "test-job"
   response= sm_client.describe_auto_ml_job(AutoMLJobName=job_name)
   ```

1. **Grant access to the deploying account** to the model artifacts in the generating account.

   The deploying account only needs access to the model artifacts in the generating account to deploy it. These are located in the [S3OutputPath](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLOutputDataConfig.html#sagemaker-Type-AutoMLOutputDataConfig-S3OutputPath) that was specified in the original `CreateAutoMLJob` API call during model generation.

   To give the deploying account access to the model artifacts, choose one of the following options:

   1. [Give access](https://aws.amazon.com/premiumsupport/knowledge-center/cross-account-access-s3/) to the `ModelDataUrl` from the generating account to the deploying account.

      Next, you need to give the deploying account permission to assume the role. follow the [real-time inferencing steps](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-deploy-models.html#autopilot-deploy-models-realtime) to deploy. 

   1. [Copy model artifacts](https://aws.amazon.com/premiumsupport/knowledge-center/copy-s3-objects-account/) from the generating account's original [S3OutputPath](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLOutputDataConfig.html#sagemaker-Type-AutoMLOutputDataConfig-S3OutputPath) to the generating account.

      To grant access to the model artifacts, you must define a `best_candidate` model and reassign model containers to the new account. 

      The following example shows how to define a `best_candidate` model and reassign the `ModelDataUrl`.

      ```
      best_candidate = automl.describe_auto_ml_job()['BestCandidate']
      
      # reassigning ModelDataUrl for best_candidate containers below
      new_model_locations = ['new-container-1-ModelDataUrl', 'new-container-2-ModelDataUrl', 'new-container-3-ModelDataUrl']
      new_model_locations_index = 0
      for container in best_candidate['InferenceContainers']:
          container['ModelDataUrl'] = new_model_locations[new_model_locations_index++]
      ```

      After this assignment of containers, follow the steps in [Deploy using SageMaker APIs](#autopilot-deploy-models-api) to deploy.

To build a payload in real-time inferencing, see the notebook example to [ define a test payload](https://aws.amazon.com/getting-started/hands-on/machine-learning-tutorial-automatically-create-models). To create the payload from a CSV file and invoke an endpoint, see the **Predict with your model** section in [Create a machine learning model automatically](https://aws.amazon.com/getting-started/hands-on/create-machine-learning-model-automatically-sagemaker-autopilot/#autopilot-cr-room).

# Run batch inference jobs


Batch inferencing, also known as offline inferencing, generates model predictions on a batch of observations. Batch inference is a good option for large datasets or if you don't need an immediate response to a model prediction request. By contrast, online inference ([real-time inferencing](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-deploy-models.html#autopilot-deploy-models-realtime)) generates predictions in real time. You can make batch inferences from an Autopilot model using the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/), the Autopilot user interface (UI), the [AWS SDK for Python (boto3)](https://aws.amazon.com/sdk-for-python/), or the AWS Command Line Interface ([AWS CLI](https://docs.aws.amazon.com/cli/)).

The following tabs show three options for deploying your model: Using APIs, Autopilot UI, or using APIs to deploy from different accounts. These instructions assume that you have already created a model in Autopilot. If you don't have a model, see [Create Regression or Classification Jobs for Tabular Data Using the AutoML API](autopilot-automate-model-development-create-experiment.md). To see examples for each option, open each tab.

## Deploy a model using Autopilot UI


The Autopilot UI contains helpful dropdown menus, toggles, tooltips, and more to help you navigate through model deployment.

The following steps show how to deploy a model from an Autopilot experiment for batch predictions. 

1. Sign in at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) and select **Studio** from the navigation pane.

1. On the left navigation pane, choose **Studio**.

1. Under **Get started**, select the Domain that you want to launch the Studio application in. If your user profile only belongs to one Domain, you do not see the option for selecting a Domain.

1. Select the user profile that you want to launch the Studio Classic application for. If there is no user profile in the domain, choose **Create user profile**. For more information, see [Add user profiles](https://docs.aws.amazon.com/sagemaker/latest/dg/domain-user-profile-add.html).

1. Choose **Launch Studio**. If the user profile belongs to a shared space, choose **Open Spaces**. 

1. When the SageMaker Studio Classic console opens, choose the **Launch SageMaker Studio** button.

1. Select **AutoML** from the left navigation pane.

1. Under **Name**, select the Autopilot experiment corresponding to the model that you want to deploy. This opens a new **AUTOPILOT JOB** tab.

1. In the **Model name** section, select the model that you want to deploy.

1. Choose **Deploy model**. This opens a new tab.

1. Choose **Make batch predictions** at the top of the page.

1. For **Batch transform job configuration**, input the **Instance type**, **Instance count** and other optional information.

1. In the **Input data configuration** section, open the dropdown menu. 

   1. For **S3 data type**, choose **ManifestFile** or **S3Prefix**.

   1. For **Split type**, choose **Line**, **RecordIO**, **TFRecord** or **None**.

   1. For **Compression**, choose **Gzip** or **None**. 

1. For **S3 location**, enter the Amazon S3 bucket location of the input data and other optional information.

1. Under **Output data configuration**, enter the S3 bucket for the output data, and choose how to [assemble the output](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TransformOutput.html#sagemaker-Type-TransformOutput-AssembleWith) of your job. 

   1. For **Additional configuration (optional)**, you can enter a MIME type and an **S3 Encryption key**.

1. For **Input/output filtering and data joins (optional)**, you enter a JSONpath expression to filter your input data, join the input source data with your output data, and enter a JSONpath expression to filter your output data. 

   1. For examples for each type of filter, see the [DataProcessing API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataProcessing.html#sagemaker-Type-DataProcessing-InputFilter).

1. To perform batch predictions on your input dataset, select **Create batch transform job**. A new **Batch Transform Jobs** tab appears.

1. In the **Batch Transform Jobs** tab: Locate the name of your job in **Status** section. Then check the progress of the job. 

## Deploy using SageMaker APIs


To use the SageMaker APIs for batch inferencing, there are three steps:

1. **Obtain candidate definitions** 

   Candidate definitions from [InferenceContainers](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLCandidate.html#sagemaker-Type-AutoMLCandidate-InferenceContainers) are used to create a SageMaker AI model. 

   The following example shows how to use the [DescribeAutoMLJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeAutoMLJob.html) API to obtain candidate definitions for the best model candidate. See the following AWS CLI command as an example.

   ```
   aws sagemaker describe-auto-ml-job --auto-ml-job-name <job-name> --region <region>
   ```

   Use the [ListCandidatesForAutoMLJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ListCandidatesForAutoMLJob.html) API to list all candidates. See the following AWS CLI command as an example.

   ```
   aws sagemaker list-candidates-for-auto-ml-job --auto-ml-job-name <job-name> --region <region>
   ```

1. **Create a SageMaker AI model**

   To create a SageMaker AI model using the [CreateModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModel.html) API, use the container definitions from the previous steps. See the following AWS CLI command as an example.

   ```
   aws sagemaker create-model --model-name '<your-custom-model-name>' \
                       --containers ['<container-definition1>, <container-definition2>, <container-definition3>]' \
                       --execution-role-arn '<execution-role-arn>' --region '<region>
   ```

1. **Create a SageMaker AI transform job** 

   The following example creates a SageMaker AI transform job with the [CreateTransformJob](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-transform-job.html) API. See the following AWS CLI command as an example.

   ```
   aws sagemaker create-transform-job --transform-job-name '<your-custom-transform-job-name>' --model-name '<your-custom-model-name-from-last-step>'\
   --transform-input '{
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix", 
                   "S3Uri": "<your-input-data>" 
               }
           },
           "ContentType": "text/csv",
           "SplitType": "Line"
       }'\
   --transform-output '{
           "S3OutputPath": "<your-output-path>",
           "AssembleWith": "Line" 
       }'\
   --transform-resources '{
           "InstanceType": "<instance-type>", 
           "InstanceCount": 1
       }' --region '<region>'
   ```

Check the progress of your transform job using the [DescribeTransformJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTransformJob.html) API. See the following AWS CLI command as an example.

```
aws sagemaker describe-transform-job --transform-job-name '<your-custom-transform-job-name>' --region <region>
```

After the job is finished, the predicted result will be available in `<your-output-path>`. 

The output file name has the following format: `<input_data_file_name>.out`. As an example, if your input file is `text_x.csv`, the output name will be `text_x.csv.out`.

The following tabs show code examples for SageMaker Python SDK, AWS SDK for Python (boto3), and the AWS CLI.

------
#### [ SageMaker Python SDK ]

The following example uses the **[SageMaker Python SDK ](https://sagemaker.readthedocs.io/en/stable/overview.html)** to make predictions in batches.

```
from sagemaker import AutoML

sagemaker_session= sagemaker.session.Session()

job_name = 'test-auto-ml-job' # your autopilot job name
automl = AutoML.attach(auto_ml_job_name=job_name)
output_path = 's3://test-auto-ml-job/output'
input_data = 's3://test-auto-ml-job/test_X.csv'

# call DescribeAutoMLJob API to get the best candidate definition
best_candidate = automl.describe_auto_ml_job()['BestCandidate']
best_candidate_name = best_candidate['CandidateName']

# create model
model = automl.create_model(name=best_candidate_name, 
               candidate=best_candidate)

# create transformer
transformer = model.transformer(instance_count=1, 
    instance_type='ml.m5.2xlarge',
    assemble_with='Line',
    output_path=output_path)

# do batch transform
transformer.transform(data=input_data,
                      split_type='Line',
                       content_type='text/csv',
                       wait=True)
```

------
#### [ AWS SDK for Python (boto3) ]

 The following example uses **AWS SDK for Python (boto3)** to make predictions in batches.

```
import sagemaker 
import boto3

session = sagemaker.session.Session()

sm_client = boto3.client('sagemaker', region_name='us-west-2')
role = 'arn:aws:iam::1234567890:role/sagemaker-execution-role'
output_path = 's3://test-auto-ml-job/output'
input_data = 's3://test-auto-ml-job/test_X.csv'

best_candidate = sm_client.describe_auto_ml_job(AutoMLJobName=job_name)['BestCandidate']
best_candidate_containers = best_candidate['InferenceContainers']
best_candidate_name = best_candidate['CandidateName']

# create model
reponse = sm_client.create_model(
    ModelName = best_candidate_name,
    ExecutionRoleArn = role,
    Containers = best_candidate_containers 
)

# Lauch Transform Job
response = sm_client.create_transform_job(
    TransformJobName=f'{best_candidate_name}-transform-job',
    ModelName=model_name,
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': input_data
            }
        },
        'ContentType': "text/csv",
        'SplitType': 'Line'
    },
    TransformOutput={
        'S3OutputPath': output_path,
        'AssembleWith': 'Line',
    },
    TransformResources={
        'InstanceType': 'ml.m5.2xlarge',
        'InstanceCount': 1,
    },
)
```

The batch inference job returns a response in the following format.

```
{'TransformJobArn': 'arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-transform-job',
 'ResponseMetadata': {'RequestId': '659f97fc-28c4-440b-b957-a49733f7c2f2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '659f97fc-28c4-440b-b957-a49733f7c2f2',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '96',
   'date': 'Thu, 11 Aug 2022 22:23:49 GMT'},
  'RetryAttempts': 0}}
```

------
#### [ AWS Command Line Interface (AWS CLI) ]

1. **Obtain the candidate definitions** by using the following the code example.

   ```
   aws sagemaker describe-auto-ml-job --auto-ml-job-name 'test-automl-job' --region us-west-2
   ```

1. **Create the model** by using the following the code example.

   ```
   aws sagemaker create-model --model-name 'test-sagemaker-model'
   --containers '[{
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz",
       "Environment": {
           "AUTOML_SPARSE_ENCODE_RECORDIO_PROTOBUF": "1",
           "AUTOML_TRANSFORM_MODE": "feature-transform",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "application/x-recordio-protobuf",
           "SAGEMAKER_PROGRAM": "sagemaker_serve",
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:1.3-1-cpu-py3",
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/tuning/flicdf10v2-dpp0-xgb/test-job1E9-244-7490a1c0/output/model.tar.gz",
       "Environment": {
           "MAX_CONTENT_LENGTH": "20971520",
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv",
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,probabilities" 
       }
   }, {
       "Image": "348316444620.dkr.ecr.us-west-2.amazonaws.com/sagemaker-sklearn-automl:2.5-1-cpu-py3", 
       "ModelDataUrl": "s3://amzn-s3-demo-bucket/out/test-job1/data-processor-models/test-job1-dpp0-1-e569ff7ad77f4e55a7e549a/output/model.tar.gz", 
       "Environment": { 
           "AUTOML_TRANSFORM_MODE": "inverse-label-transform", 
           "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv", 
           "SAGEMAKER_INFERENCE_INPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label", 
           "SAGEMAKER_INFERENCE_SUPPORTED": "predicted_label,probability,labels,probabilities", 
           "SAGEMAKER_PROGRAM": "sagemaker_serve", 
           "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code" 
       } 
   }]' \
   --execution-role-arn 'arn:aws:iam::1234567890:role/sagemaker-execution-role' \
   --region 'us-west-2'
   ```

1. **Create the transform job** by using the following the code example.

   ```
   aws sagemaker create-transform-job --transform-job-name 'test-tranform-job'\
    --model-name 'test-sagemaker-model'\
   --transform-input '{
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "SplitType": "Line"
       }'\
   --transform-output '{
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line"
       }'\
   --transform-resources '{
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       }'\
   --region 'us-west-2'
   ```

1. **Check the progress of the transform job** by using the following the code example. 

   ```
   aws sagemaker describe-transform-job --transform-job-name  'test-tranform-job' --region us-west-2
   ```

   The following is the response from the transform job.

   ```
   {
       "TransformJobName": "test-tranform-job",
       "TransformJobArn": "arn:aws:sagemaker:us-west-2:1234567890:transform-job/test-tranform-job",
       "TransformJobStatus": "InProgress",
       "ModelName": "test-model",
       "TransformInput": {
           "DataSource": {
               "S3DataSource": {
                   "S3DataType": "S3Prefix",
                   "S3Uri": "s3://amzn-s3-demo-bucket/data.csv"
               }
           },
           "ContentType": "text/csv",
           "CompressionType": "None",
           "SplitType": "Line"
       },
       "TransformOutput": {
           "S3OutputPath": "s3://amzn-s3-demo-bucket/output/",
           "AssembleWith": "Line",
           "KmsKeyId": ""
       },
       "TransformResources": {
           "InstanceType": "ml.m5.2xlarge",
           "InstanceCount": 1
       },
       "CreationTime": 1662495635.679,
       "TransformStartTime": 1662495847.496,
       "DataProcessing": {
           "InputFilter": "$",
           "OutputFilter": "$",
           "JoinSource": "None"
       }
   }
   ```

   After the `TransformJobStatus` changes to `Completed`, you can check the inference result in the `S3OutputPath`.

------

## Deploy models from different accounts


To create a batch inferencing job in a different account than the one that the model was generated in, follow the instructions in [Deploy models from different accounts](autopilot-deploy-models-realtime.md#autopilot-deploy-models-realtime-across-accounts). Then you can create models and transform jobs by following the [Deploy using SageMaker APIs](#autopilot-deploy-models-batch-steps).