

# Data and model quality monitoring with Amazon SageMaker Model Monitor
<a name="model-monitor"></a>

Amazon SageMaker Model Monitor monitors the quality of Amazon SageMaker AI machine learning models in production. With Model Monitor, you can set up:
+ Continuous monitoring with a real-time endpoint.
+ Continuous monitoring with a batch transform job that runs regularly.
+ On-schedule monitoring for asynchronous batch transform jobs.

With Model Monitor, you can set alerts that notify you when there are deviations in model quality. Early and proactive detection of these deviations lets you take corrective actions, such as retraining models, auditing upstream systems, or fixing data quality issues, without having to monitor models manually or build additional tooling. You can use Model Monitor's prebuilt monitoring capabilities, which require no coding, or write your own code for custom analysis.

Model Monitor provides the following types of monitoring:
+ [Data quality](model-monitor-data-quality.md) - Monitor drift in data quality.
+ [Model quality](model-monitor-model-quality.md) - Monitor drift in model quality metrics, such as accuracy.
+ [Bias drift for models in production](clarify-model-monitor-bias-drift.md) - Monitor bias in your model's predictions.
+ [Feature attribution drift for models in production](clarify-model-monitor-feature-attribution-drift.md) - Monitor drift in feature attribution.

**Topics**
+ [Monitoring a Model in Production](how-it-works-model-monitor.md)
+ [How Amazon SageMaker Model Monitor works](#model-monitor-how-it-works)
+ [Data capture](model-monitor-data-capture.md)
+ [Data quality](model-monitor-data-quality.md)
+ [Model quality](model-monitor-model-quality.md)
+ [Bias drift for models in production](clarify-model-monitor-bias-drift.md)
+ [Feature attribution drift for models in production](clarify-model-monitor-feature-attribution-drift.md)
+ [Schedule monitoring jobs](model-monitor-scheduling.md)
+ [Amazon SageMaker Model Monitor prebuilt container](model-monitor-pre-built-container.md)
+ [Interpret results](model-monitor-interpreting-results.md)
+ [Visualize results for real-time endpoints in Amazon SageMaker Studio](model-monitor-interpreting-visualize-results.md)
+ [Advanced topics](model-monitor-advanced-topics.md)
+ [Model Monitor FAQs](model-monitor-faqs.md)

# Monitoring a Model in Production
<a name="how-it-works-model-monitor"></a>

After you deploy a model into your production environment, use Amazon SageMaker Model Monitor to continuously monitor the quality of your machine learning models in real time. Model Monitor enables you to set up automated alerts that trigger when there are deviations in model quality, such as data drift and anomalies. Amazon CloudWatch Logs collects monitoring log files and notifies you when the quality of your model crosses thresholds that you preset, and the log files are stored in an Amazon S3 bucket that you specify. Early and proactive detection of model deviations enables you to take prompt action to maintain and improve the quality of your deployed model. 

For more information about SageMaker Model Monitoring products, see [Data and model quality monitoring with Amazon SageMaker Model Monitor](model-monitor.md).

To start your machine learning journey with SageMaker AI, sign up for an AWS account and follow the steps in [Set Up SageMaker AI](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-set-up.html). 

## How Amazon SageMaker Model Monitor works
<a name="model-monitor-how-it-works"></a>

Amazon SageMaker Model Monitor automatically monitors machine learning (ML) models in production and notifies you when quality issues happen. Model Monitor uses rules to detect drift in your models and alerts you when it happens. The following figure shows how this process works in the case that your model is deployed to a real-time endpoint.

![\[The model monitoring process with Amazon SageMaker Model Monitor.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mmv2-architecture.png)


You can also use Model Monitor to monitor a batch transform job instead of a real-time endpoint. In this case, instead of receiving requests to an endpoint and tracking the predictions, Model Monitor monitors inference inputs and outputs. The following figure diagrams the process of monitoring a batch transform job.

![\[The model monitoring process with Amazon SageMaker Model Monitor.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mmv2-architecture-batch.png)


To enable model monitoring, take the following steps. These steps follow the path of the data through the various data collection, monitoring, and analysis processes.
+ For a real-time endpoint, enable the endpoint to capture data from incoming requests to a trained ML model and the resulting model predictions.
+ For a batch transform job, enable data capture of the batch transform inputs and outputs.
+ Create a baseline from the dataset that was used to train the model. The baseline computes metrics and suggests constraints for the metrics. Real-time or batch predictions from your model are compared to the constraints. They are reported as violations if they are outside the constrained values.
+ Create a monitoring schedule specifying what data to collect, how often to collect it, how to analyze it, and which reports to produce. 
+ Inspect the reports, which compare the latest data with the baseline. Watch for any violations reported, metrics, and notifications from Amazon CloudWatch.
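To make the baseline-and-violations idea in the steps above concrete, the following is a toy, framework-free sketch. It is not Model Monitor's actual algorithm (Model Monitor's baselining computes a much richer set of per-feature statistics and constraints); the function names and the three-sigma rule here are illustrative assumptions only.

```python
# Toy illustration of baselining: compute statistics from training data,
# derive a simple constraint per feature, and flag production values that
# fall outside it. Names and the 3-sigma rule are illustrative only.
from statistics import mean, stdev

def suggest_constraints(feature_values, k=3):
    """Suggest a [mean - k*std, mean + k*std] range for one numeric feature."""
    m, s = mean(feature_values), stdev(feature_values)
    return {"lower": m - k * s, "upper": m + k * s}

def find_violations(constraints, production_values):
    """Return the production values that violate the suggested constraints."""
    return [v for v in production_values
            if v < constraints["lower"] or v > constraints["upper"]]

training_feature = [10.2, 9.8, 10.5, 10.0, 9.9, 10.1, 10.3, 9.7]
constraints = suggest_constraints(training_feature)
violations = find_violations(constraints, [10.0, 9.9, 25.0])  # 25.0 has drifted
# violations == [25.0]
```

In Model Monitor, the baseline job produces the statistics and constraints files, and each scheduled monitoring job plays the role of `find_violations`, emitting a violations report that you can inspect or alert on.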

**Notes**  
+ Model Monitor computes model metrics and statistics on tabular data only. For example, you can still monitor an image classification model that takes images as input and outputs a label, but Model Monitor calculates metrics and statistics for the output only, not the input.
+ Model Monitor currently supports only endpoints that host a single model; it does not support monitoring multi-model endpoints. For information about using multi-model endpoints, see [Multi-model endpoints](multi-model-endpoints.md).
+ Model Monitor supports monitoring inference pipelines. However, data is captured and analyzed for the entire pipeline, not for individual containers in the pipeline.
+ To prevent impact to inference requests, Data Capture stops capturing requests at high levels of disk usage. We recommend that you keep your disk utilization below 75% to ensure that data capture continues capturing requests.
+ If you launch SageMaker Studio in a custom Amazon VPC, you must create VPC endpoints so that Model Monitor can communicate with Amazon S3 and CloudWatch. For information about VPC endpoints, see [VPC endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/concepts.html) in the *Amazon Virtual Private Cloud User Guide*. For information about launching SageMaker Studio in a custom VPC, see [Connect Studio notebooks in a VPC to external resources](studio-notebooks-and-internet-access.md).

### Model Monitor sample notebooks
<a name="model-monitor-sample-notebooks"></a>

For a sample notebook that takes you through the end-to-end workflow using Model Monitor with your real-time endpoint, see [Introduction to Amazon SageMaker Model Monitor](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/introduction/SageMaker-ModelMonitoring.html).

For a sample notebook that visualizes the statistics.json file for a selected execution in a monitoring schedule, see [Model Monitor Visualization](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/visualization/SageMaker-Model-Monitor-Visualize.html). 

For instructions about how to create and access Jupyter notebook instances that you can use to run the example in SageMaker AI, see [Amazon SageMaker notebook instances](nbi.md). After you have created a notebook instance and opened it, choose the **SageMaker AI Examples** tab to see a list of all the SageMaker AI samples. To open a notebook, choose the notebook's **Use** tab and choose **Create copy**.

# Data capture
<a name="model-monitor-data-capture"></a>

To log the inputs to your endpoint and the inference outputs from your deployed model to Amazon S3, you can enable a feature called *Data Capture*. *Data Capture* is commonly used to record information that can be used for training, debugging, and monitoring. Amazon SageMaker Model Monitor automatically parses this captured data and compares metrics from this data with a baseline that you create for the model. For more information about Model Monitor see [Data and model quality monitoring with Amazon SageMaker Model Monitor](model-monitor.md).

You can implement *Data Capture* for both real-time and batch model-monitor modes using the AWS SDK for Python (Boto) or the SageMaker Python SDK. For a real-time endpoint, you will specify your *Data Capture* configuration when you create your endpoint. Due to the persistent nature of your real-time endpoint, you can configure additional options to turn data capturing on or off at certain times, or change the sampling frequency. You can also choose to encrypt your inference data.

For a batch transform job, you can enable *Data Capture* if you want to run on-schedule model monitoring or continuous model-monitoring for regular, periodic batch transform jobs. You will specify your *Data Capture* configuration when you create your batch transform job. Within this configuration, you have the option to turn on encryption or generate the inference ID with your output, which helps you match your captured data to Ground Truth data.

# Capture data from real-time endpoint
<a name="model-monitor-data-capture-endpoint"></a>

**Note**  
To prevent impact to inference requests, Data Capture stops capturing requests at high levels of disk usage. We recommend that you keep your disk utilization below 75% to ensure that data capture continues capturing requests.

To capture data for your real-time endpoint, you must deploy a model using SageMaker AI hosting services. This requires that you create a SageMaker AI model, define an endpoint configuration, and create an HTTPS endpoint.

The steps required to turn on data capture are similar whether you use the AWS SDK for Python (Boto) or the SageMaker Python SDK. If you use the AWS SDK, define the [DataCaptureConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html) dictionary, along with its required fields, within the [CreateEndpointConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html) method to turn on data capture. If you use the SageMaker Python SDK, import the [DataCaptureConfig](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.data_capture_config.DataCaptureConfig) class and initialize an instance of it. Then, pass this object to the `data_capture_config` parameter of the `sagemaker.model.Model.deploy()` method.

To use the following code snippets, replace the *italicized placeholder text* in the example code with your own information.

## How to enable data capture
<a name="model-monitor-data-capture-defing.title"></a>

Specify a data capture configuration. You can capture the request payload, the response payload, or both with this configuration. The following code snippets demonstrate how to enable data capture using the AWS SDK for Python (Boto) and the SageMaker Python SDK.

**Note**  
You do not need to use Model Monitor to capture request or response payloads.

------
#### [ AWS SDK for Python (Boto) ]

Configure the data you want to capture with the [DataCaptureConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html) dictionary when you create an endpoint using the `CreateEndpointConfig` method. Set `EnableCapture` to the boolean value True. In addition, provide the following mandatory parameters:
+ `EndpointConfigName`: the name of your endpoint configuration. You will use this name when you make a `CreateEndpoint` request.
+ `ProductionVariants`: a list of models you want to host at this endpoint. Define a dictionary data type for each model.
+ `DataCaptureConfig`: dictionary data type where you specify an integer value that corresponds to the initial percentage of data to sample (`InitialSamplingPercentage`), the Amazon S3 URI where you want captured data to be stored, and a capture options (`CaptureOptions`) list. Specify either `Input` or `Output` for `CaptureMode` within the `CaptureOptions` list. 

You can optionally specify how SageMaker AI should encode captured data by passing key-value pair arguments to the `CaptureContentTypeHeader` dictionary.

```
# Create an endpoint config name.
endpoint_config_name = '<endpoint-config-name>'

# The name of the production variant.
variant_name = '<name-of-production-variant>'                   
  
# The name of the model that you want to host. 
# This is the name that you specified when creating the model.
model_name = '<The_name_of_your_model>'

instance_type = '<instance-type>'
#instance_type='ml.m5.xlarge' # Example    

# Number of instances to launch initially.
initial_instance_count = <integer>

# Sampling percentage. Choose an integer value between 0 and 100
initial_sampling_percentage = <integer>                                                                                                                                                                                                                        

# The S3 URI containing the captured data
s3_capture_upload_path = 's3://<bucket-name>/<data_capture_s3_key>'

# Specify either Input, Output, or both
capture_modes = [ "Input",  "Output" ] 
#capture_mode = [ "Input"] # Example - If you want to capture input only
                            
endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name, 
    # List of ProductionVariant objects, one for each model that you want to host at this endpoint.
    ProductionVariants=[
        {
            "VariantName": variant_name, 
            "ModelName": model_name, 
            "InstanceType": instance_type, # Specify the compute instance type.
            "InitialInstanceCount": initial_instance_count # Number of instances to launch initially.
        }
    ],
    DataCaptureConfig= {
        'EnableCapture': True, # Whether data should be captured or not.
        'InitialSamplingPercentage' : initial_sampling_percentage,
        'DestinationS3Uri': s3_capture_upload_path,
        'CaptureOptions': [{"CaptureMode" : capture_mode} for capture_mode in capture_modes] # Example - Use list comprehension to capture both Input and Output
    }
)
```

For more information about other endpoint configuration options, see the [CreateEndpointConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html) API in the [Amazon SageMaker AI Service API Reference Guide](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Operations_Amazon_SageMaker_Service.html).

------
#### [ SageMaker Python SDK ]

Import the `DataCaptureConfig` class from the [sagemaker.model_monitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html) module. Enable data capture by setting the `enable_capture` parameter to the Boolean value `True`.

Optionally provide arguments for the following parameters:
+ `sampling_percentage`: an integer value that corresponds to the percentage of data to sample. If you do not provide a sampling percentage, SageMaker AI samples a default of 20% of your data.
+ `destination_s3_uri`: the Amazon S3 URI that SageMaker AI uses to store captured data. If you do not provide one, SageMaker AI stores captured data in `"s3://<default-session-bucket>/model-monitor/data-capture"`.

```
from sagemaker.model_monitor import DataCaptureConfig

# Set to True to enable data capture
enable_capture = True

# Optional - Sampling percentage. Choose an integer value between 0 and 100
sampling_percentage = <int> 
# sampling_percentage = 30 # Example 30%

# Optional - The S3 URI of stored captured-data location
s3_capture_upload_path = 's3://<bucket-name>/<data_capture_s3_key>'

# Specify either Input, Output or both. 
capture_modes = ['REQUEST','RESPONSE'] # In this example, we specify both
# capture_mode = ['REQUEST'] # Example - If you want to only capture input.

# Configuration object passed in when deploying Models to SM endpoints
data_capture_config = DataCaptureConfig(
    enable_capture = enable_capture, 
    sampling_percentage = sampling_percentage, # Optional
    destination_s3_uri = s3_capture_upload_path, # Optional
    capture_options = capture_modes, # The list defined above; captures both REQUEST and RESPONSE
)
```

------

## Deploy your model
<a name="model-monitor-data-capture-deploy"></a>

Deploy your model and create an HTTPS endpoint with `DataCapture` enabled.

------
#### [ AWS SDK for Python (Boto3) ]

Provide the endpoint configuration to SageMaker AI. The service launches the ML compute instances and deploys the model or models as specified in the configuration.

Once you have your model and endpoint configuration, use the [CreateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) API to create your endpoint. The endpoint name must be unique within an AWS Region in your AWS account. 

The following creates an endpoint using the endpoint configuration specified in the request. Amazon SageMaker AI uses the endpoint to provision resources and deploy models.

```
# The name of the endpoint. The name must be unique within an AWS Region in your AWS account.
endpoint_name = '<endpoint-name>' 

# The name of the endpoint configuration associated with this endpoint.
endpoint_config_name='<endpoint-config-name>'

create_endpoint_response = sagemaker_client.create_endpoint(
                                            EndpointName=endpoint_name, 
                                            EndpointConfigName=endpoint_config_name)
```

For more information, see the [CreateEndpoint](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) API.

------
#### [ SageMaker Python SDK ]

Define a name for your endpoint. This step is optional. If you do not provide one, SageMaker AI will create a unique name for you:

```
from datetime import datetime

endpoint_name = f"DEMO-{datetime.utcnow():%Y-%m-%d-%H%M}"
print("EndpointName =", endpoint_name)
```

Deploy your model to a real-time, HTTPS endpoint with the Model object’s built-in `deploy()` method. Provide the name of the Amazon EC2 instance type to deploy this model to in the `instance_type` field along with the initial number of instances to run the endpoint on for the `initial_instance_count` field:

```
initial_instance_count=<integer>
# initial_instance_count=1 # Example

instance_type='<instance-type>'
# instance_type='ml.m4.xlarge' # Example

# Uncomment if you did not define this variable in the previous step
#data_capture_config = <name-of-data-capture-configuration>

model.deploy(
    initial_instance_count=initial_instance_count,
    instance_type=instance_type,
    endpoint_name=endpoint_name,
    data_capture_config=data_capture_config
)
```

------

## View Captured Data
<a name="model-monitor-data-capture-view"></a>

Create a predictor object from the SageMaker Python SDK [Predictor](https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html) class. You will use this object to invoke your endpoint in a later step. Provide the name of your endpoint (defined earlier as `endpoint_name`), along with serializer and deserializer objects. For information about serializer types, see the [Serializers](https://sagemaker.readthedocs.io/en/stable/api/inference/serializers.html) class in the [SageMaker AI Python SDK](https://sagemaker.readthedocs.io/en/stable/index.html).

```
from sagemaker.predictor import Predictor
from sagemaker.serializers import <Serializer>
from sagemaker.deserializers import <Deserializers>

predictor = Predictor(endpoint_name=endpoint_name,
                      serializer = <Serializer_Class>,
                      deserializer = <Deserializer_Class>)

# Example
#from sagemaker.predictor import Predictor
#from sagemaker.serializers import CSVSerializer
#from sagemaker.deserializers import JSONDeserializer

#predictor = Predictor(endpoint_name=endpoint_name,
#                      serializer=CSVSerializer(),
#                      deserializer=JSONDeserializer())
```

In the following code example, we invoke the endpoint with sample validation data stored locally in a CSV file named `validation.csv`, and write each probability, prediction, and label to a file named `validation_with_predictions.csv`. Our sample validation set contains a label for each input.

The `with` statement opens the validation CSV file, splits each row on the first comma character (`,`), and stores the two returned values in the `label` and `input_cols` variables. For each row, the input (`input_cols`) is passed to the predictor object's built-in `Predictor.predict()` method.

Suppose the model returns a probability. Probabilities range between 0 and 1.0. If the probability returned by the model is greater than 80% (0.8), we assign the prediction a label of 1. Otherwise, we assign the prediction a label of 0.

```
from time import sleep

validate_dataset = "validation_with_predictions.csv"

# Cut off threshold of 80%
cutoff = 0.8

limit = 200  # Need at least 200 samples to compute standard deviations
i = 0
with open(f"test_data/{validate_dataset}", "w") as validation_file:
    validation_file.write("probability,prediction,label\n")  # CSV header
    with open("test_data/validation.csv", "r") as f:
        for row in f:
            (label, input_cols) = row.split(",", 1)
            probability = float(predictor.predict(input_cols))
            prediction = "1" if probability > cutoff else "0"
            validation_file.write(f"{probability},{prediction},{label}\n")
            i += 1
            if i > limit:
                break
            print(".", end="", flush=True)
            sleep(0.5)
print()
print("Done!")
```

Because you enabled data capture in the previous steps, the request and response payloads, along with some additional metadata, are saved in the Amazon S3 location that you specified in `DataCaptureConfig`. Delivery of captured data to Amazon S3 can take a few minutes.

View captured data by listing the data capture files stored in Amazon S3. The format of the Amazon S3 path is: `s3://{destination-s3-uri}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl`.
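As a small illustration of the `yyyy/mm/dd/hh` layout, the following sketch builds the expected S3 prefix for capture files from an invocation timestamp. The bucket, endpoint, and variant names are hypothetical.

```python
# Illustrative only: build the S3 prefix under which capture files for a
# given hour are stored, following the yyyy/mm/dd/hh layout described above.
from datetime import datetime, timezone

def capture_prefix(destination_s3_uri, endpoint_name, variant_name, when):
    return (f"{destination_s3_uri.rstrip('/')}/{endpoint_name}/{variant_name}/"
            f"{when:%Y/%m/%d/%H}/")

prefix = capture_prefix(
    "s3://my-bucket/data-capture",          # hypothetical DestinationS3Uri
    "my-endpoint",                          # hypothetical endpoint name
    "AllTraffic",                           # hypothetical variant name
    datetime(2022, 2, 14, 17, 25, tzinfo=timezone.utc),
)
# prefix == "s3://my-bucket/data-capture/my-endpoint/AllTraffic/2022/02/14/17/"
```

You could pass a prefix like this to an S3 list operation to enumerate the capture files for a given hour.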

Expect to see different files from different time periods, organized based on the hour when the invocation occurred. Run the following to print out the contents of a single capture file:

```
print("\n".join(capture_file[-3:-1]))
```

This will return a SageMaker AI-specific JSON Lines formatted file. The following is a sample response from a real-time endpoint that we invoked using `text/csv` data:

```
{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT",
"data":"69,0,153.7,109,194.0,105,256.1,114,14.1,6,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0\n",
"encoding":"CSV"},"endpointOutput":{"observedContentType":"text/csv; charset=utf-8","mode":"OUTPUT","data":"0.0254181120544672","encoding":"CSV"}},
"eventMetadata":{"eventId":"aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee","inferenceTime":"2022-02-14T17:25:49Z"},"eventVersion":"0"}
{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT",
"data":"94,23,197.1,125,214.5,136,282.2,103,9.5,5,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,1,0,1\n",
"encoding":"CSV"},"endpointOutput":{"observedContentType":"text/csv; charset=utf-8","mode":"OUTPUT","data":"0.07675473392009735","encoding":"CSV"}},
"eventMetadata":{"eventId":"aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee","inferenceTime":"2022-02-14T17:25:49Z"},"eventVersion":"0"}
```

In the preceding example, the `capture_file` object is a list. Index the first element of the list to view a single inference request.

```
import json

# The capture_file object is a list. Index the first element to view a single inference request
print(json.dumps(json.loads(capture_file[0]), indent=2))
```

This will return a response similar to the following. The values returned will differ based on your endpoint configuration, SageMaker AI model, and captured data:

```
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "50,0,188.9,94,203.9,104,151.8,124,11.6,8,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,0,1,0\n",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "text/csv; charset=character-encoding",
      "mode": "OUTPUT",
      "data": "0.023190177977085114",
      "encoding": "CSV"
    }
  },
  "eventMetadata": {
    "eventId": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "inferenceTime": "2022-02-14T17:25:06Z"
  },
  "eventVersion": "0"
}
```

# Capture data from batch transform job
<a name="model-monitor-data-capture-batch"></a>

 The steps required to turn on data capture for your batch transform job are similar whether you use the AWS SDK for Python (Boto) or the SageMaker Python SDK. If you use the AWS SDK, define the [DataCaptureConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html) dictionary, along with required fields, within the `CreateTransformJob` method to turn on data capture. If you use the SageMaker AI Python SDK, import the `BatchDataCaptureConfig` class and initialize an instance from this class. Then, pass this object to the `batch_data_capture_config` parameter of your transform job instance. 

 To use the following code snippets, replace the *italicized placeholder text* in the example code with your own information. 

## How to enable data capture
<a name="data-capture-batch-enable"></a>

 Specify a data capture configuration when you launch a transform job. Whether you use the AWS SDK for Python (Boto3) or the SageMaker Python SDK, you must provide the `DestinationS3Uri` argument, which is the directory where you want the transform job to log the captured data. Optionally, you can also set the following parameters: 
+  `KmsKeyId`: The AWS KMS key used to encrypt the captured data. 
+  `GenerateInferenceId`: A Boolean flag that, when capturing the data, indicates if you want the transform job to append the inference ID and time to your output. This is useful for model quality monitoring, where you need to ingest the Ground Truth data. The inference ID and time help to match the captured data with your Ground Truth data. 

------
#### [ AWS SDK for Python (Boto3) ]

 Configure the data you want to capture with the [DataCaptureConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html) dictionary when you create a transform job using the `CreateTransformJob` method. 

```
input_data_s3_uri = "s3://input_S3_uri"
output_data_s3_uri = "s3://output_S3_uri"
data_capture_destination = "s3://captured_data_S3_uri"

model_name = "model_name"

sm_client.create_transform_job(
    TransformJobName="transform_job_name",
    MaxConcurrentTransforms=2,
    ModelName=model_name,
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": input_data_s3_uri,
            }
        },
        "ContentType": "text/csv",
        "CompressionType": "None",
        "SplitType": "Line",
    },
    TransformOutput={
        "S3OutputPath": output_data_s3_uri,
        "Accept": "text/csv",
        "AssembleWith": "Line",
    },
    TransformResources={
        "InstanceType": "ml.m4.xlarge",
        "InstanceCount": 1,
    },
    DataCaptureConfig={
       "DestinationS3Uri": data_capture_destination,
       "KmsKeyId": "kms_key",
       "GenerateInferenceId": True,
    }
    )
```

------
#### [ SageMaker Python SDK ]

 Import the `BatchDataCaptureConfig` class, as shown in the following example; see the [model monitor API documentation](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html) for related classes. 

```
from sagemaker.transformer import Transformer
from sagemaker.inputs import BatchDataCaptureConfig

# Optional - The S3 URI of where to store captured data in S3
data_capture_destination = "s3://captured_data_S3_uri"

model_name = "model_name"

transformer = Transformer(model_name=model_name, ...)
transform_arg = transformer.transform(
    batch_data_capture_config=BatchDataCaptureConfig(
        destination_s3_uri=data_capture_destination,
        kms_key_id="kms_key",
        generate_inference_id=True,
    ),
    ...
)
```

------

## How to view data captured
<a name="data-capture-batch-view"></a>

 Once the transform job completes, the captured data is logged under the `DestinationS3Uri` that you provided with the data capture configuration. There are two subdirectories under `DestinationS3Uri`: `/input` and `/output`. If `DestinationS3Uri` is `s3://my-data-capture`, then the transform job creates the following directories: 
+  `s3://my-data-capture/input`: The captured input data for the transform job. 
+  `s3://my-data-capture/output`: The captured output data for the transform job. 

 To avoid data duplication, the captured data under the preceding two directories is stored as manifests. Each manifest is a JSONL file that contains the Amazon S3 locations of the source objects. A manifest file might look like the following example: 

```
# under "/input" directory
[
    {"prefix":"s3://input_S3_uri/"},
    "dummy_0.csv",
    "dummy_1.csv",
    "dummy_2.csv",
    ...
]

# under "/output" directory
[
    {"prefix":"s3://output_S3_uri/"},
    "dummy_0.csv.out",
    "dummy_1.csv.out",
    "dummy_2.csv.out",
    ...
]
```

 The transform job organizes and labels these manifests with a *yyyy/mm/dd/hh* S3 prefix to indicate when they were captured. This helps Model Monitor determine the appropriate portion of data to analyze. For example, if you start your transform job at 13:00 UTC on 2022-08-26, then the captured data is labeled with a `2022/08/26/13/` prefix string. 
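Because the captured data is a manifest rather than a copy of the objects, consuming it means joining the `prefix` entry with each relative key. The following is a minimal sketch of that expansion, using the hypothetical file names from the example above.

```python
# Illustrative helper: expand a captured-data manifest (as shown above) into
# full S3 URIs by joining the "prefix" entry with each relative key.
def resolve_manifest(manifest):
    prefix = manifest[0]["prefix"]
    return [prefix + key for key in manifest[1:]]

# Hypothetical /input manifest, matching the example format above.
input_manifest = [
    {"prefix": "s3://input_S3_uri/"},
    "dummy_0.csv",
    "dummy_1.csv",
]

uris = resolve_manifest(input_manifest)
# uris == ["s3://input_S3_uri/dummy_0.csv", "s3://input_S3_uri/dummy_1.csv"]
```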

## InferenceId Generation
<a name="data-capture-batch-inferenceid"></a>

 When you configure `DataCaptureConfig` for a transform job, you can turn on the Boolean flag `GenerateInferenceId`. This flag is particularly useful when you need to run model quality and model bias monitoring jobs, which require user-ingested Ground Truth data. Model Monitor relies on an inference ID to match the captured data with the Ground Truth data. For additional details about Ground Truth ingestion, see [Ingest Ground Truth labels and merge them with predictions](model-monitor-model-quality-merge.md). When `GenerateInferenceId` is on, the transform output appends an inference ID (a random UUID) and the transform job start time in UTC to each record. You need these two values to run model quality and model bias monitoring. When you construct the Ground Truth data, you need to provide the same inference ID to match the output data. Currently, this feature supports transform outputs in CSV, JSON, and JSONL formats. 

 If your transform output is in CSV format, the output file looks like the following example: 

```
0, 1f1d57b1-2e6f-488c-8c30-db4e6d757861,2022-08-30T00:49:15Z
1, 22445434-0c67-45e9-bb4d-bd1bf26561e6,2022-08-30T00:49:15Z
...
```

 The last two columns are the inference ID and the transform job start time. Do not modify them. The remaining columns are your transform job outputs. 

 If your transform output is in JSON or JSONL format, the output file looks like the following example: 

```
{"output": 0, "SageMakerInferenceId": "1f1d57b1-2e6f-488c-8c30-db4e6d757861", "SageMakerInferenceTime": "2022-08-30T00:49:15Z"}
{"output": 1, "SageMakerInferenceId": "22445434-0c67-45e9-bb4d-bd1bf26561e6", "SageMakerInferenceTime": "2022-08-30T00:49:15Z"}
...
```

 The two appended fields, `SageMakerInferenceId` and `SageMakerInferenceTime`, are reserved. Do not modify these fields if you need to run model quality or model bias monitoring, because merge jobs need them. 
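For example, when you later construct Ground Truth data, you might separate the reserved fields from the model output in each captured JSONL record. A sketch using the values from the example above:

```python
import json

RESERVED = ("SageMakerInferenceId", "SageMakerInferenceTime")

def split_record(jsonl_line: str):
    """Split a captured JSONL output record into model output and reserved fields."""
    record = json.loads(jsonl_line)
    reserved = {k: record.pop(k) for k in RESERVED}
    return record, reserved

line = ('{"output": 0, "SageMakerInferenceId": "1f1d57b1-2e6f-488c-8c30-db4e6d757861", '
        '"SageMakerInferenceTime": "2022-08-30T00:49:15Z"}')
output, reserved = split_record(line)
print(output)  # {'output': 0}
print(reserved["SageMakerInferenceId"])
```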

# Data quality
<a name="model-monitor-data-quality"></a>

Data quality monitoring automatically monitors machine learning (ML) models in production and notifies you when data quality issues arise. ML models in production have to make predictions on real-life data that is not carefully curated like most training datasets. If the statistical nature of the data that your model receives while in production drifts away from the nature of the baseline data it was trained on, the model begins to lose accuracy in its predictions. Amazon SageMaker Model Monitor uses rules to detect data drift and alerts you when it happens. To monitor data quality, follow these steps:
+ Enable data capture. This captures inference input and output from a real-time inference endpoint or batch transform job and stores the data in Amazon S3. For more information, see [Data capture](model-monitor-data-capture.md).
+ Create a baseline. In this step, you run a baseline job that analyzes an input dataset that you provide. The baseline computes baseline schema constraints and statistics for each feature using [Deequ](https://github.com/awslabs/deequ), an open source library built on Apache Spark, which is used to measure data quality in large datasets. For more information, see [Create a Baseline](model-monitor-create-baseline.md).
+ Define and schedule data quality monitoring jobs. For specific information and code samples of data quality monitoring jobs, see [Schedule data quality monitoring jobs](model-monitor-schedule-data-monitor.md). For general information about monitoring jobs, see [Schedule monitoring jobs](model-monitor-scheduling.md).
  + Optionally use preprocessing and postprocessing scripts to transform the data coming out of your data quality analysis. For more information, see [Preprocessing and Postprocessing](model-monitor-pre-and-post-processing.md).
+ View data quality metrics. For more information, see [Schema for Statistics (statistics.json file)](model-monitor-interpreting-statistics.md).
+ Integrate data quality monitoring with Amazon CloudWatch. For more information, see [CloudWatch Metrics](model-monitor-interpreting-cloudwatch.md).
+ Interpret the results of a monitoring job. For more information, see [Interpret results](model-monitor-interpreting-results.md).
+ Use SageMaker Studio to enable data quality monitoring and visualize results if you are using a real-time endpoint. For more information, see [Visualize results for real-time endpoints in Amazon SageMaker Studio](model-monitor-interpreting-visualize-results.md).

**Note**  
Model Monitor computes model metrics and statistics on tabular data only. For example, you can still monitor an image classification model that takes images as input and outputs a label, but Model Monitor calculates metrics and statistics for the tabular output only, not for the image input.

**Topics**
+ [Create a Baseline](model-monitor-create-baseline.md)
+ [Schedule data quality monitoring jobs](model-monitor-schedule-data-monitor.md)
+ [Schema for Statistics (statistics.json file)](model-monitor-interpreting-statistics.md)
+ [CloudWatch Metrics](model-monitor-interpreting-cloudwatch.md)
+ [Schema for Violations (constraint\_violations.json file)](model-monitor-interpreting-violations.md)

# Create a Baseline
<a name="model-monitor-create-baseline"></a>

The baseline calculations of statistics and constraints are needed as a standard against which data drift and other data quality issues can be detected. Model Monitor provides a built-in container, *sagemaker-model-monitor-analyzer*, that can automatically suggest constraints for CSV and flat JSON input. This container also provides a range of model monitoring capabilities, including constraint validation against a baseline and emitting Amazon CloudWatch metrics. The container is based on Spark version 3.3.0 and is built with [Deequ](https://github.com/awslabs/deequ) version 2.0.2. All column names in your baseline dataset must be compliant with Spark: use only lowercase characters, and `_` as the only special character.

The training dataset that you used to train the model is usually a good baseline dataset. The schema of the training dataset and the schema of the inference dataset should match exactly (the same number and order of features). Note that the prediction/output columns are assumed to be the first columns in the training dataset. From the training dataset, you can ask SageMaker AI to suggest a set of baseline constraints and generate descriptive statistics to explore the data. For this example, upload the training dataset that was used to train the pretrained model. If you already stored the training dataset in Amazon S3, you can point to it directly.

**To create a baseline from a training dataset** 

When you have your training data ready and stored in Amazon S3, start a baseline processing job with `DefaultModelMonitor.suggest_baseline(..)` using the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable). This uses an [Amazon SageMaker Model Monitor prebuilt container](model-monitor-pre-built-container.md) that generates baseline statistics, suggests baseline constraints for the dataset, and writes them to the `output_s3_uri` location that you specify.

```
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

my_default_monitor.suggest_baseline(
    baseline_dataset=baseline_data_uri+'/training-dataset-with-header.csv',
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    wait=True
)
```

**Note**  
If you provide the feature/column names in the training dataset as the first row and set the `header=True` option as shown in the previous code sample, SageMaker AI uses the feature name in the constraints and statistics file.

The baseline statistics for the dataset are contained in the statistics.json file and the suggested baseline constraints are contained in the constraints.json file in the location you specify with `output_s3_uri`.

Output Files for Tabular Dataset Statistics and Constraints


| File Name | Description | 
| --- | --- | 
| statistics.json |  This file is expected to have columnar statistics for each feature in the dataset that is analyzed. For more information about the schema for this file, see [Schema for Statistics (statistics.json file)](model-monitor-byoc-statistics.md).  | 
| constraints.json |  This file is expected to have the constraints on the features observed. For more information about the schema for this file, see [Schema for Constraints (constraints.json file)](model-monitor-byoc-constraints.md).  | 
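If you want to inspect these files programmatically, you can parse them with standard JSON tooling after downloading them from the `output_s3_uri` location. A minimal sketch that reads a hypothetical statistics fragment (the feature names and values are made up for illustration):

```python
import json

# A minimal statistics.json fragment following the documented schema;
# in practice, download the real file from the output_s3_uri location first.
statistics = json.loads("""
{
  "version": 0,
  "dataset": {"item_count": 100},
  "features": [
    {"name": "age",
     "inferred_type": "Integral",
     "numerical_statistics": {
       "common": {"num_present": 100, "num_missing": 0},
       "mean": 39.5, "sum": 3950, "std_dev": 11.2, "min": 18, "max": 90}}
  ]
}
""")

# Collect the mean of every numerical feature.
means = {f["name"]: f["numerical_statistics"]["mean"]
         for f in statistics["features"] if "numerical_statistics" in f}
print(means)  # {'age': 39.5}
```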

The [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable) provides convenience functions for generating the baseline statistics and constraints. However, if you want to call the processing job directly for this purpose instead, you need to set the `Environment` map as shown in the following example:

```
"Environment": {
    "dataset_format": "{\"csv\”: { \”header\”: true}",
    "dataset_source": "/opt/ml/processing/sm_input",
    "output_path": "/opt/ml/processing/sm_output",
    "publish_cloudwatch_metrics": "Disabled",
}
```
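Because the `dataset_format` value is itself a JSON-encoded string, building it with `json.dumps` avoids manual escaping mistakes. A sketch of assembling the same map in Python:

```python
import json

environment = {
    # dataset_format is a JSON-encoded string, not a nested object.
    "dataset_format": json.dumps({"csv": {"header": True}}),
    "dataset_source": "/opt/ml/processing/sm_input",
    "output_path": "/opt/ml/processing/sm_output",
    "publish_cloudwatch_metrics": "Disabled",
}
print(environment["dataset_format"])  # {"csv": {"header": true}}
```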

# Schedule data quality monitoring jobs
<a name="model-monitor-schedule-data-monitor"></a>

After you create your baseline, you can call the `create_monitoring_schedule()` method of your `DefaultModelMonitor` class instance to schedule an hourly data quality monitor. The following sections show you how to create a data quality monitor for a model deployed to a real-time endpoint as well as for a batch transform job.

**Important**  
You can specify either a batch transform input or an endpoint input, but not both, when you create your monitoring schedule.

## Data quality monitoring for models deployed to real-time endpoints
<a name="model-monitor-data-quality-rt"></a>

To schedule a data quality monitor for a real-time endpoint, pass your `EndpointInput` instance to the `endpoint_input` argument of your `DefaultModelMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator
                
data_quality_model_monitor = DefaultModelMonitor(
   role=sagemaker.get_execution_role(),
   ...
)

schedule = data_quality_model_monitor.create_monitoring_schedule(
   monitor_schedule_name=schedule_name,
   post_analytics_processor_script=s3_code_postprocessor_uri,
   output_s3_uri=s3_report_path,
   schedule_cron_expression=CronExpressionGenerator.hourly(),
   statistics=data_quality_model_monitor.baseline_statistics(),
   constraints=data_quality_model_monitor.suggested_constraints(),
   enable_cloudwatch_metrics=True,
   endpoint_input=EndpointInput(
        endpoint_name=endpoint_name,
        destination="/opt/ml/processing/input/endpoint",
   )
)
```

## Data quality monitoring for batch transform jobs
<a name="model-monitor-data-quality-bt"></a>

To schedule a data quality monitor for a batch transform job, pass your `BatchTransformInput` instance to the `batch_transform_input` argument of your `DefaultModelMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator
                
data_quality_model_monitor = DefaultModelMonitor(
   role=sagemaker.get_execution_role(),
   ...
)

schedule = data_quality_model_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    batch_transform_input=BatchTransformInput(
        data_captured_destination_s3_uri=s3_capture_upload_path,
        destination="/opt/ml/processing/input",
        dataset_format=MonitoringDatasetFormat.csv(header=False),
    ),
    output_s3_uri=s3_report_path,
    statistics=statistics_path,
    constraints=constraints_path,
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

# Schema for Statistics (statistics.json file)
<a name="model-monitor-interpreting-statistics"></a>

The Amazon SageMaker Model Monitor prebuilt container computes per-column (feature) statistics. The statistics are calculated for the baseline dataset and for the current dataset that is being analyzed.

```
{
    "version": 0,
    # dataset level stats
    "dataset": {
        "item_count": number
    },
    # feature level stats
    "features": [
        {
            "name": "feature-name",
            "inferred_type": "Fractional" | "Integral",
            "numerical_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "mean": number,
                "sum": number,
                "std_dev": number,
                "min": number,
                "max": number,
                "distribution": {
                    "kll": {
                        "buckets": [
                            {
                                "lower_bound": number,
                                "upper_bound": number,
                                "count": number
                            }
                        ],
                        "sketch": {
                            "parameters": {
                                "c": number,
                                "k": number
                            },
                            "data": [
                                [
                                    num,
                                    num,
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ]
                            ]
                        }#sketch
                    }#KLL
                }#distribution
            }#num_stats
        },
        {
            "name": "feature-name",
            "inferred_type": "String",
            "string_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "distinct_count": number,
                "distribution": {
                    "categorical": {
                         "buckets": [
                                {
                                    "value": "string",
                                    "count": number
                                }
                          ]
                     }
                }
            },
            #provision for custom stats
        }
    ]
}
```

Note the following:
+ The prebuilt containers compute [KLL sketch](https://datasketches.apache.org/docs/KLL/KLLSketch.html), which is a compact quantiles sketch.
+ By default, we materialize the distribution in 10 buckets. This is not currently configurable.
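As an illustration only, an equal-width 10-bucket histogram resembles the materialized distribution (the real container uses an approximate KLL sketch, not this exact computation, and the values below are made up):

```python
def equal_width_buckets(values, num_buckets=10):
    """Group values into equal-width buckets like the materialized distribution."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets or 1  # guard against all-equal values
    buckets = [{"lower_bound": lo + i * width,
                "upper_bound": lo + (i + 1) * width,
                "count": 0} for i in range(num_buckets)]
    for v in values:
        i = min(int((v - lo) / width), num_buckets - 1)  # clamp max into last bucket
        buckets[i]["count"] += 1
    return buckets

buckets = equal_width_buckets(list(range(100)))
print(len(buckets))                         # 10
print(sum(b["count"] for b in buckets))     # 100
```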

# CloudWatch Metrics
<a name="model-monitor-interpreting-cloudwatch"></a>

You can use the built-in Amazon SageMaker Model Monitor container to emit CloudWatch metrics. When the `emit_metrics` option is `Enabled` in the baseline constraints file, SageMaker AI emits these metrics for each feature/column observed in the dataset in the following namespaces:
+ For real-time endpoints: the `/aws/sagemaker/Endpoints/data-metric` namespace with the `EndpointName` and `ScheduleName` dimensions.
+ For batch transform jobs: the `/aws/sagemaker/ModelMonitoring/data-metric` namespace with the `MonitoringSchedule` dimension.

For numerical fields, the built-in container emits the following CloudWatch metrics:
+ Metric: Max → query for `MetricName: feature_data_{feature_name}, Stat: Max`
+ Metric: Min → query for `MetricName: feature_data_{feature_name}, Stat: Min`
+ Metric: Sum → query for `MetricName: feature_data_{feature_name}, Stat: Sum`
+ Metric: SampleCount → query for `MetricName: feature_data_{feature_name}, Stat: SampleCount`
+ Metric: Average → query for `MetricName: feature_data_{feature_name}, Stat: Average`

For both numerical and string fields, the built-in container emits the following CloudWatch metrics:
+ Metric: Completeness → query for `MetricName: feature_non_null_{feature_name}, Stat: Sum`
+ Metric: Baseline Drift → query for `MetricName: feature_baseline_drift_{feature_name}, Stat: Sum`
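To retrieve one of these metrics programmatically, you could assemble the query parameters and pass them to the CloudWatch `GetMetricStatistics` API (for example, through a `boto3` CloudWatch client). A sketch that only builds the parameter dictionary, with hypothetical endpoint, schedule, and feature names:

```python
from datetime import datetime, timedelta, timezone

feature_name = "age"  # hypothetical feature/column name
end = datetime(2022, 8, 26, 14, 0, tzinfo=timezone.utc)

params = {
    # Namespace and dimensions for a real-time endpoint, per the list above.
    "Namespace": "/aws/sagemaker/Endpoints/data-metric",
    "MetricName": f"feature_data_{feature_name}",
    "Dimensions": [
        {"Name": "EndpointName", "Value": "my-endpoint"},       # hypothetical
        {"Name": "ScheduleName", "Value": "my-schedule"},       # hypothetical
    ],
    "StartTime": end - timedelta(hours=1),
    "EndTime": end,
    "Period": 3600,
    "Statistics": ["Average", "Max", "Min"],
}
print(params["MetricName"])  # feature_data_age
```

You would then call something like `cloudwatch.get_metric_statistics(**params)` with a configured client.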

# Schema for Violations (constraint\_violations.json file)
<a name="model-monitor-interpreting-violations"></a>

The violations file is generated as the output of a `MonitoringExecution`, which lists the results of evaluating the constraints (specified in the constraints.json file) against the current dataset that was analyzed. The Amazon SageMaker Model Monitor prebuilt container provides the following violation checks.

```
{
    "violations": [{
      "feature_name" : "string",
      "constraint_check_type" :
              "data_type_check",
            | "completeness_check",
            | "baseline_drift_check",
            | "missing_column_check",
            | "extra_column_check",
            | "categorical_values_check"
      "description" : "string"
    }]
}
```

Types of Violations Monitored 


| Violation Check Type | Description  | 
| --- | --- | 
| data\_type\_check |  If the data types in the current execution are not the same as in the baseline dataset, this violation is flagged. During the baseline step, the generated constraints suggest the inferred data type for each column. You can tune the `monitoring_config.datatype_check_threshold` parameter to adjust the threshold for when this is flagged as a violation.  | 
| completeness\_check |  If the completeness (% of non-null items) observed in the current execution violates the completeness threshold specified per feature, this violation is flagged. During the baseline step, the generated constraints suggest a completeness value.  | 
| baseline\_drift\_check |  If the calculated distribution distance between the current and the baseline datasets is more than the threshold specified in `monitoring_config.comparison_threshold`, this violation is flagged.  | 
| missing\_column\_check |  If the number of columns in the current dataset is less than the number in the baseline dataset, this violation is flagged.  | 
| extra\_column\_check |  If the number of columns in the current dataset is more than the number in the baseline dataset, this violation is flagged.  | 
| categorical\_values\_check |  If there are more unknown values in the current dataset than in the baseline dataset, this violation is flagged. This check is governed by the threshold in `monitoring_config.domain_content_threshold`.  | 
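As an illustration of consuming this file, a sketch that filters the violations by check type (the violation entries below are made up):

```python
import json

violations_doc = json.loads("""
{"violations": [
  {"feature_name": "age", "constraint_check_type": "baseline_drift_check",
   "description": "Baseline drift detected"},
  {"feature_name": "city", "constraint_check_type": "categorical_values_check",
   "description": "Unknown categorical values observed"}
]}
""")

def drift_violations(doc):
    """Return the features flagged by the baseline drift check."""
    return [v["feature_name"] for v in doc["violations"]
            if v["constraint_check_type"] == "baseline_drift_check"]

print(drift_violations(violations_doc))  # ['age']
```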

# Model quality
<a name="model-monitor-model-quality"></a>

Model quality monitoring jobs monitor the performance of a model by comparing the predictions that the model makes with the actual Ground Truth labels that the model attempts to predict. To do this, model quality monitoring merges data that is captured from real-time or batch inference with actual labels that you store in an Amazon S3 bucket, and then compares the predictions with the actual labels.

 To measure model quality, Model Monitor uses metrics that depend on the ML problem type. For example, if your model is for a regression problem, one of the metrics evaluated is mean squared error (MSE). For information about all of the metrics used for the different ML problem types, see [Model quality metrics and Amazon CloudWatch monitoring](model-monitor-model-quality-metrics.md). 

Model quality monitoring follows the same steps as data quality monitoring, but adds the additional step of merging the actual labels from Amazon S3 with the predictions captured from the real-time inference endpoint or batch transform job. To monitor model quality, follow these steps:
+ Enable data capture. This captures inference input and output from a real-time inference endpoint or batch transform job and stores the data in Amazon S3. For more information, see [Data capture](model-monitor-data-capture.md).
+ Create a baseline. In this step, you run a baseline job that compares predictions from the model with Ground Truth labels in a baseline dataset. The baseline job automatically creates baseline statistical rules and constraints that define thresholds against which the model performance is evaluated. For more information, see [Create a model quality baseline](model-monitor-model-quality-baseline.md).
+ Define and schedule model quality monitoring jobs. For specific information and code samples of model quality monitoring jobs, see [Schedule model quality monitoring jobs](model-monitor-model-quality-schedule.md). For general information about monitoring jobs, see [Schedule monitoring jobs](model-monitor-scheduling.md).
+ Ingest Ground Truth labels that model monitor merges with captured prediction data from a real-time inference endpoint or batch transform job. For more information, see [Ingest Ground Truth labels and merge them with predictions](model-monitor-model-quality-merge.md).
+ Integrate model quality monitoring with Amazon CloudWatch. For more information, see [Monitoring model quality metrics with CloudWatch](model-monitor-model-quality-metrics.md#model-monitor-model-quality-cw).
+ Interpret the results of a monitoring job. For more information, see [Interpret results](model-monitor-interpreting-results.md).
+ Use SageMaker Studio to enable model quality monitoring and visualize results. For more information, see [Visualize results for real-time endpoints in Amazon SageMaker Studio](model-monitor-interpreting-visualize-results.md).

**Topics**
+ [Create a model quality baseline](model-monitor-model-quality-baseline.md)
+ [Schedule model quality monitoring jobs](model-monitor-model-quality-schedule.md)
+ [Ingest Ground Truth labels and merge them with predictions](model-monitor-model-quality-merge.md)
+ [Model quality metrics and Amazon CloudWatch monitoring](model-monitor-model-quality-metrics.md)

# Create a model quality baseline
<a name="model-monitor-model-quality-baseline"></a>

Create a baseline job that compares your model predictions with Ground Truth labels in a baseline dataset that you have stored in Amazon S3. Typically, you use a training dataset as the baseline dataset. The baseline job calculates metrics for the model and suggests constraints to use to monitor model quality drift.

To create a baseline job, you need to have a dataset that contains predictions from your model along with labels that represent the Ground Truth for your data.

To create a baseline job, use the `ModelQualityMonitor` class provided by the SageMaker Python SDK and complete the following steps.

**To create a model quality baseline job**

1.  First, create an instance of the `ModelQualityMonitor` class. The following code snippet shows how to do this.

   ```
   from sagemaker import get_execution_role, session, Session
   from sagemaker.model_monitor import ModelQualityMonitor
                   
   role = get_execution_role()
   session = Session()
   
   model_quality_monitor = ModelQualityMonitor(
       role=role,
       instance_count=1,
       instance_type='ml.m5.xlarge',
       volume_size_in_gb=20,
       max_runtime_in_seconds=1800,
       sagemaker_session=session
   )
   ```

1. Now call the `suggest_baseline` method of the `ModelQualityMonitor` object to run a baseline job. The following code snippet assumes that you have a baseline dataset that contains both predictions and labels stored in Amazon S3.

   ```
   baseline_job_name = "MyBaseLineJob"
   job = model_quality_monitor.suggest_baseline(
       job_name=baseline_job_name,
       baseline_dataset=baseline_dataset_uri, # The S3 location of the validation dataset.
       dataset_format=DatasetFormat.csv(header=True),
       output_s3_uri = baseline_results_uri, # The S3 location to store the results.
       problem_type='BinaryClassification',
       inference_attribute= "prediction", # The column in the dataset that contains predictions.
       probability_attribute= "probability", # The column in the dataset that contains probabilities.
       ground_truth_attribute= "label" # The column in the dataset that contains ground truth labels.
   )
   job.wait(logs=False)
   ```

1. After the baseline job finishes, you can see the constraints that the job generated. First, get the results of the baseline job by calling the `latest_baselining_job` method of the `ModelQualityMonitor` object.

   ```
   baseline_job = model_quality_monitor.latest_baselining_job
   ```

1. The baseline job suggests constraints, which are thresholds for metrics that model monitor measures. If a metric goes beyond the suggested threshold, Model Monitor reports a violation. To view the constraints that the baseline job generated, call the `suggested_constraints` method of the baseline job. The following code snippet loads the constraints for a binary classification model into a Pandas dataframe.

   ```
   import pandas as pd
   pd.DataFrame(baseline_job.suggested_constraints().body_dict["binary_classification_constraints"]).T
   ```

   We recommend that you view the generated constraints and modify them as necessary before using them for monitoring. For example, if a constraint is too aggressive, you might get more alerts for violations than you want.

   If your constraints contain numbers expressed in scientific notation, you need to convert them to float. The following Python [preprocessing script](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-and-post-processing.html#model-monitor-pre-processing-script) example shows how to convert numbers in scientific notation to float. 

   ```
   import csv
   
   def fix_scientific_notation(col):
       try:
           return format(float(col), "f")
       except (ValueError, TypeError):
           return col
   
   def preprocess_handler(csv_line):
       reader = csv.reader([csv_line])
       csv_record = next(reader)
       # Skip the baseline header; change HEADER_NAME to the first column's name.
       if csv_record[0] == "HEADER_NAME":
           return []
       return {str(i).zfill(20): fix_scientific_notation(d) for i, d in enumerate(csv_record)}
   ```

   You can add your pre-processing script to a baseline or monitoring schedule as a `record_preprocessor_script`, as defined in the [Model Monitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html) documentation.

1. When you are satisfied with the constraints, pass them as the `constraints` parameter when you create a monitoring schedule. For more information, see [Schedule model quality monitoring jobs](model-monitor-model-quality-schedule.md).

The suggested baseline constraints are contained in the constraints.json file in the location you specify with `output_s3_uri`. For information about the schema for this file, see [Schema for Constraints (constraints.json file)](model-monitor-byoc-constraints.md).

# Schedule model quality monitoring jobs
<a name="model-monitor-model-quality-schedule"></a>

After you create your baseline, you can call the `create_monitoring_schedule()` method of your `ModelQualityMonitor` class instance to schedule an hourly model quality monitor. The following sections show you how to create a model quality monitor for a model deployed to a real-time endpoint as well as for a batch transform job.

**Important**  
You can specify either a batch transform input or an endpoint input, but not both, when you create your monitoring schedule.

Unlike data quality monitoring, you need to supply Ground Truth labels if you want to monitor model quality. However, Ground Truth labels could be delayed. To address this, specify offsets when you create your monitoring schedule. 

## Model monitor offsets
<a name="model-monitor-model-quality-schedule-offsets"></a>

Model quality jobs include `StartTimeOffset` and `EndTimeOffset`, which are fields of the `ModelQualityJobInput` parameter of the `create_model_quality_job_definition` method that work as follows:
+ `StartTimeOffset` - If specified, jobs subtract this time from the start time.
+ `EndTimeOffset` - If specified, jobs subtract this time from the end time.

The format of the offsets is, for example, `-PT7H`, where `7H` is 7 hours. You can use `-PT#H`, `-PT#M`, or `-P#D`, where H=hours, M=minutes, D=days, and # is the number. The offset must be in [ISO 8601 duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations).

For example, if your Ground Truth starts coming in after 1 day, but is not complete for a week, set `StartTimeOffset` to `-P8D` and `EndTimeOffset` to `-P1D`. Then, if you schedule a job to run at `2020-01-09T13:00`, it analyzes data from between `2020-01-01T13:00` and `2020-01-08T13:00`.
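The window arithmetic can be sketched with Python's `datetime` module. The simplified parser below handles only the negative day, hour, and minute duration forms discussed here:

```python
import re
from datetime import datetime, timedelta

def parse_offset(offset: str) -> timedelta:
    """Parse a negative ISO 8601 duration such as -P8D, -PT7H, or -PT30M."""
    m = re.fullmatch(r"-P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?)?", offset)
    days, hours, minutes = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)

scheduled = datetime(2020, 1, 9, 13, 0)
start = scheduled - parse_offset("-P8D")  # StartTimeOffset
end = scheduled - parse_offset("-P1D")    # EndTimeOffset
print(start, end)  # 2020-01-01 13:00:00 2020-01-08 13:00:00
```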

**Important**  
The schedule cadence should be such that one execution finishes before the next execution starts, which allows the Ground Truth merge job and monitoring job from the execution to complete. The maximum runtime of an execution is divided between the two jobs, so for an hourly model quality monitoring job, the value of `MaxRuntimeInSeconds` specified as part of `StoppingCondition` should be no more than 1800.

## Model quality monitoring for models deployed to real-time endpoints
<a name="model-monitor-data-quality-schedule-rt"></a>

To schedule a model quality monitor for a real-time endpoint, pass your `EndpointInput` instance to the `endpoint_input` argument of your `ModelQualityMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator
                    
model_quality_model_monitor = ModelQualityMonitor(
   role=sagemaker.get_execution_role(),
   ...
)

schedule = model_quality_model_monitor.create_monitoring_schedule(
   monitor_schedule_name=schedule_name,
   post_analytics_processor_script=s3_code_postprocessor_uri,
   output_s3_uri=s3_report_path,
   schedule_cron_expression=CronExpressionGenerator.hourly(),    
   statistics=model_quality_model_monitor.baseline_statistics(),
   constraints=model_quality_model_monitor.suggested_constraints(),
   enable_cloudwatch_metrics=True,
   endpoint_input=EndpointInput(
        endpoint_name=endpoint_name,
        destination="/opt/ml/processing/input/endpoint",
        start_time_offset="-PT2D",
        end_time_offset="-PT1D",
    )
)
```

## Model quality monitoring for batch transform jobs
<a name="model-monitor-data-quality-schedule-tt"></a>

To schedule a model quality monitor for a batch transform job, pass your `BatchTransformInput` instance to the `batch_transform_input` argument of your `ModelQualityMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator

model_quality_model_monitor = ModelQualityMonitor(
   role=sagemaker.get_execution_role(),
   ...
)

schedule = model_quality_model_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    batch_transform_input=BatchTransformInput(
        data_captured_destination_s3_uri=s3_capture_upload_path,
        destination="/opt/ml/processing/input",
        dataset_format=MonitoringDatasetFormat.csv(header=False),
        # The column index of the output representing the inference probability.
        probability_attribute="0",
        # The threshold used to classify the inference probability as class 0 or 1
        # in a binary classification problem.
        probability_threshold_attribute=0.5,
        # Look back 6 hours for transform job outputs.
        start_time_offset="-PT6H",
        end_time_offset="-PT0H"
    ),
    ground_truth_input=gt_s3_uri,
    output_s3_uri=s3_report_path,
    problem_type="BinaryClassification",
    constraints = constraints_path,
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

# Ingest Ground Truth labels and merge them with predictions
<a name="model-monitor-model-quality-merge"></a>

Model quality monitoring compares the predictions your model makes with ground truth labels to measure the quality of the model. For this to work, you periodically label data captured by your endpoint or batch transform job and upload it to Amazon S3.

To match Ground Truth labels with captured prediction data, there must be a unique identifier for each record in the dataset. The structure of each record of Ground Truth data is as follows:

```
{
  "groundTruthData": {
    "data": "1",
    "encoding": "CSV"
  },
  "eventMetadata": {
    "eventId": "aaaa-bbbb-cccc"
  },
  "eventVersion": "0"
}
```

In the `eventMetadata` structure, the value of `eventId` can be one of the following:
+ `eventId` – This ID is automatically generated when a user invokes the endpoint.
+ `inferenceId` – The caller supplies this ID when they invoke the endpoint.

If `inferenceId` is present in the captured data records, Model Monitor uses it to merge captured data with Ground Truth records. You are responsible for making sure that the `inferenceId` in the Ground Truth records matches the `inferenceId` in the captured records. If `inferenceId` is not present in the captured data, Model Monitor uses `eventId` from the captured data records to match them with a Ground Truth record.
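The matching logic can be sketched as follows. This is a hypothetical illustration of how records are paired, not Model Monitor's actual implementation:

```python
# Sketch: pair captured records with Ground Truth records. Prefer the
# caller-supplied inferenceId when present, else fall back to the
# automatically generated eventId.
def merge_records(captured, ground_truth):
    """Pair each captured record with its Ground Truth label."""
    # Index Ground Truth records by their eventId field, which may hold
    # either an eventId or an inferenceId.
    gt_by_id = {gt["eventMetadata"]["eventId"]: gt for gt in ground_truth}
    merged = []
    for rec in captured:
        meta = rec["eventMetadata"]
        key = meta.get("inferenceId") or meta["eventId"]
        if key in gt_by_id:
            merged.append((rec, gt_by_id[key]))
    return merged

captured = [
    {"eventMetadata": {"eventId": "aaaa-bbbb-cccc"}, "prediction": "1"},
    {"eventMetadata": {"eventId": "dddd-eeee", "inferenceId": "id-42"}, "prediction": "0"},
]
ground_truth = [
    {"groundTruthData": {"data": "1", "encoding": "CSV"},
     "eventMetadata": {"eventId": "aaaa-bbbb-cccc"}, "eventVersion": "0"},
    {"groundTruthData": {"data": "0", "encoding": "CSV"},
     "eventMetadata": {"eventId": "id-42"}, "eventVersion": "0"},
]
print(len(merge_records(captured, ground_truth)))  # both records matched
```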

You must upload Ground Truth data to an Amazon S3 bucket that has the same path format as captured data. 

**Data format requirements**  
When you save your data to Amazon S3, it must use the JSON Lines format (.jsonl) and be saved using the following naming structure. To learn more about JSON Lines requirements, see [Use input and output data](sms-data.md). 

```
s3://amzn-s3-demo-bucket1/prefix/yyyy/mm/dd/hh
```

The date in this path is the date when the Ground Truth label is collected, and does not have to match the date when the inference was generated.
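Building the hourly prefix can be sketched as follows. The bucket and prefix names are the placeholders from the example above:

```python
from datetime import datetime, timezone

def ground_truth_prefix(bucket, prefix, when):
    """Build the s3://bucket/prefix/yyyy/mm/dd/hh path for a Ground Truth upload."""
    return f"s3://{bucket}/{prefix}/{when.strftime('%Y/%m/%d/%H')}"

# The hour reflects when the label was collected, not when the
# inference was generated.
when = datetime(2024, 7, 1, 9, tzinfo=timezone.utc)
print(ground_truth_prefix("amzn-s3-demo-bucket1", "ground-truth", when))
# s3://amzn-s3-demo-bucket1/ground-truth/2024/07/01/09
```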

After you create and upload the Ground Truth labels, include the location of the labels as a parameter when you create the monitoring job. If you are using the AWS SDK for Python (Boto3), do this by specifying the location of the Ground Truth labels in the `S3Uri` field of the `GroundTruthS3Input` parameter in a call to the `create_model_quality_job_definition` method. If you are using the SageMaker Python SDK, specify the location of the Ground Truth labels as the `ground_truth_input` parameter in the call to the `create_monitoring_schedule` method of the `ModelQualityMonitor` object.

# Model quality metrics and Amazon CloudWatch monitoring
<a name="model-monitor-model-quality-metrics"></a>

Model quality monitoring jobs compute different metrics to evaluate the quality and performance of your machine learning models. The specific metrics calculated depend on the type of ML problem: regression, binary classification, or multiclass classification. Monitoring these metrics is crucial for detecting model drift over time. The following sections cover the key model quality metrics for each problem type, as well as how to set up automated monitoring and alerting using CloudWatch to continuously track your model's performance.

**Note**  
Standard deviations for metrics are provided only when at least 200 samples are available. Model Monitor computes the standard deviation by randomly sampling 80% of the data five times, computing the metric on each sample, and taking the standard deviation of those results.
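The resampling procedure in the note can be sketched as follows. This is a simplified illustration of the idea, not Model Monitor's actual code:

```python
import random
import statistics

def metric_standard_deviation(y_true, y_pred, metric, n_rounds=5, frac=0.8, seed=0):
    """Estimate a metric's standard deviation by re-computing it on
    random 80% subsamples of the data."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_rounds):
        idx = rng.sample(range(n), int(frac * n))  # random 80% subsample
        values.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    return statistics.stdev(values)

def accuracy(t, p):
    return sum(a == b for a, b in zip(t, p)) / len(t)

y_true = [0, 1] * 150      # 300 samples, above the 200-sample minimum
y_pred = [0, 1, 1, 1] * 75
print(metric_standard_deviation(y_true, y_pred, accuracy))
```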

## Regression metrics
<a name="model-monitor-model-quality-metrics-regression"></a>

The following shows an example of the metrics that model quality monitor computes for a regression problem.

```
"regression_metrics" : {
    "mae" : {
      "value" : 0.3711832061068702,
      "standard_deviation" : 0.0037566388129940394
    },
    "mse" : {
      "value" : 0.3711832061068702,
      "standard_deviation" : 0.0037566388129940524
    },
    "rmse" : {
      "value" : 0.609248066149471,
      "standard_deviation" : 0.003079253267651125
    },
    "r2" : {
      "value" : -1.3766111872212665,
      "standard_deviation" : 0.022653980022771227
    }
  }
```

## Binary classification metrics
<a name="model-monitor-model-quality-metrics-binary"></a>

The following shows an example of the metrics that model quality monitor computes for a binary classification problem.

```
"binary_classification_metrics" : {
    "confusion_matrix" : {
      "0" : {
        "0" : 1,
        "1" : 2
      },
      "1" : {
        "0" : 0,
        "1" : 1
      }
    },
    "recall" : {
      "value" : 1.0,
      "standard_deviation" : "NaN"
    },
    "precision" : {
      "value" : 0.3333333333333333,
      "standard_deviation" : "NaN"
    },
    "accuracy" : {
      "value" : 0.5,
      "standard_deviation" : "NaN"
    },
    "recall_best_constant_classifier" : {
      "value" : 1.0,
      "standard_deviation" : "NaN"
    },
    "precision_best_constant_classifier" : {
      "value" : 0.25,
      "standard_deviation" : "NaN"
    },
    "accuracy_best_constant_classifier" : {
      "value" : 0.25,
      "standard_deviation" : "NaN"
    },
    "true_positive_rate" : {
      "value" : 1.0,
      "standard_deviation" : "NaN"
    },
    "true_negative_rate" : {
      "value" : 0.33333333333333337,
      "standard_deviation" : "NaN"
    },
    "false_positive_rate" : {
      "value" : 0.6666666666666666,
      "standard_deviation" : "NaN"
    },
    "false_negative_rate" : {
      "value" : 0.0,
      "standard_deviation" : "NaN"
    },
    "receiver_operating_characteristic_curve" : {
      "false_positive_rates" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 1.0 ],
      "true_positive_rates" : [ 0.0, 0.25, 0.5, 0.75, 1.0, 1.0 ]
    },
    "precision_recall_curve" : {
      "precisions" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
      "recalls" : [ 0.0, 0.25, 0.5, 0.75, 1.0 ]
    },
    "auc" : {
      "value" : 1.0,
      "standard_deviation" : "NaN"
    },
    "f0_5" : {
      "value" : 0.3846153846153846,
      "standard_deviation" : "NaN"
    },
    "f1" : {
      "value" : 0.5,
      "standard_deviation" : "NaN"
    },
    "f2" : {
      "value" : 0.7142857142857143,
      "standard_deviation" : "NaN"
    },
    "f0_5_best_constant_classifier" : {
      "value" : 0.29411764705882354,
      "standard_deviation" : "NaN"
    },
    "f1_best_constant_classifier" : {
      "value" : 0.4,
      "standard_deviation" : "NaN"
    },
    "f2_best_constant_classifier" : {
      "value" : 0.625,
      "standard_deviation" : "NaN"
    }
  }
```

## Multiclass metrics
<a name="model-monitor-model-quality-metrics-multi"></a>

The following shows an example of the metrics that model quality monitor computes for a multiclass classification problem.

```
"multiclass_classification_metrics" : {
    "confusion_matrix" : {
      "0" : {
        "0" : 1180,
        "1" : 510
      },
      "1" : {
        "0" : 268,
        "1" : 138
      }
    },
    "accuracy" : {
      "value" : 0.6288167938931297,
      "standard_deviation" : 0.00375663881299405
    },
    "weighted_recall" : {
      "value" : 0.6288167938931297,
      "standard_deviation" : 0.003756638812994008
    },
    "weighted_precision" : {
      "value" : 0.6983172269629505,
      "standard_deviation" : 0.006195912915307507
    },
    "weighted_f0_5" : {
      "value" : 0.6803947317178771,
      "standard_deviation" : 0.005328406973561699
    },
    "weighted_f1" : {
      "value" : 0.6571162346664904,
      "standard_deviation" : 0.004385008075019733
    },
    "weighted_f2" : {
      "value" : 0.6384024354394601,
      "standard_deviation" : 0.003867109755267757
    },
    "accuracy_best_constant_classifier" : {
      "value" : 0.19370229007633588,
      "standard_deviation" : 0.0032049848450732355
    },
    "weighted_recall_best_constant_classifier" : {
      "value" : 0.19370229007633588,
      "standard_deviation" : 0.0032049848450732355
    },
    "weighted_precision_best_constant_classifier" : {
      "value" : 0.03752057718081697,
      "standard_deviation" : 0.001241536088657851
    },
    "weighted_f0_5_best_constant_classifier" : {
      "value" : 0.04473443104152011,
      "standard_deviation" : 0.0014460485504284792
    },
    "weighted_f1_best_constant_classifier" : {
      "value" : 0.06286421244683643,
      "standard_deviation" : 0.0019113576884608862
    },
    "weighted_f2_best_constant_classifier" : {
      "value" : 0.10570313141262414,
      "standard_deviation" : 0.002734216826748117
    }
  }
```

## Monitoring model quality metrics with CloudWatch
<a name="model-monitor-model-quality-cw"></a>

If you set the value of the `enable_cloudwatch_metrics` parameter to `True` when you create the monitoring schedule, model quality monitoring jobs send all metrics to CloudWatch.

Model quality metrics appear in the following namespace:
+ For real-time endpoints: `aws/sagemaker/Endpoints/model-metrics`
+ For batch transform jobs: `aws/sagemaker/ModelMonitoring/model-metrics`

For a list of the metrics that are emitted, see the previous sections on this page.

You can use CloudWatch metrics to create an alarm when a specific metric doesn't meet the threshold you specify. For instructions about how to create CloudWatch alarms, see [Create a CloudWatch alarm based on a static threshold](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ConsoleAlarms.html) in the *CloudWatch User Guide*.

# Bias drift for models in production
<a name="clarify-model-monitor-bias-drift"></a>

Amazon SageMaker Clarify bias monitoring helps data scientists and ML engineers monitor predictions for bias on a regular basis. As the model is monitored, customers can view exportable reports and graphs detailing bias in SageMaker Studio and configure alerts in Amazon CloudWatch to receive notifications if bias beyond a certain threshold is detected. Bias can be introduced or exacerbated in deployed ML models when the training data differs from the data that the model sees during deployment (that is, the live data). These kinds of changes in the live data distribution might be temporary (for example, due to some short-lived, real-world events) or permanent. In either case, it might be important to detect these changes. For example, the outputs of a model for predicting home prices can become biased if the mortgage rates used to train the model differ from current, real-world mortgage rates. With bias detection capabilities in Model Monitor, when SageMaker AI detects bias beyond a certain threshold, it automatically generates metrics that you can view in SageMaker Studio and through Amazon CloudWatch alerts. 

In general, measuring bias only during the train-and-deploy phase might not be sufficient. It is possible that after the model has been deployed, the distribution of the data that the deployed model sees (that is, the live data) is different from data distribution in the training dataset. This change might introduce bias in a model over time. The change in the live data distribution might be temporary (for example, due to some short-lived behavior like the holiday season) or permanent. In either case, it might be important to detect these changes and take steps to reduce the bias when appropriate.

To detect these changes, SageMaker Clarify provides functionality to monitor the bias metrics of a deployed model continuously and raise automated alerts if the metrics exceed a threshold. For example, consider the DPPL bias metric. Specify an allowed range of values A = (a_min, a_max), for instance an interval of (-0.1, 0.1), that DPPL should belong to during deployment. Any deviation from this range should raise a *bias detected* alert. With SageMaker Clarify, you can perform these checks at regular intervals.

For example, you can set the frequency of the checks to 2 days. This means that SageMaker Clarify computes the DPPL metric on data collected during a 2-day window. In this example, D_win is the data that the model processed during the last 2-day window. An alert is issued if the DPPL value b_win computed on D_win falls outside of the allowed range A. This approach to checking whether b_win is outside of A can be somewhat noisy. D_win might consist of very few samples and might not be representative of the live data distribution. The small sample size means that the bias value b_win computed over D_win might not be a very robust estimate. In fact, very high (or low) values of b_win may be observed purely due to chance. To ensure that the conclusions drawn from the observed data D_win are statistically significant, SageMaker Clarify makes use of confidence intervals. Specifically, it uses the normal bootstrap interval method to construct an interval C = (c_min, c_max) such that SageMaker Clarify is confident that the true bias value computed over the full live data is contained in C with high probability. If the confidence interval C overlaps with the allowed range A, SageMaker Clarify interprets it as "it is likely that the bias metric value of the live data distribution falls within the allowed range". If C and A are disjoint, SageMaker Clarify is confident that the bias metric does not lie in A and raises an alert.
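The normal bootstrap check can be sketched as follows. This is a simplified illustration of the idea, with toy data, not SageMaker Clarify's implementation:

```python
import random
import statistics

def normal_bootstrap_interval(samples, n_boot=200, z=1.96, seed=0):
    """Construct a confidence interval C = (c_min, c_max) around the
    observed bias estimate using the normal bootstrap method."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement and recompute the estimate.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.fmean(resample))
    center = statistics.fmean(samples)
    se = statistics.stdev(estimates)  # bootstrap standard error
    return center - z * se, center + z * se

def raises_alert(confidence, allowed):
    """Alert only when C and the allowed range A are disjoint."""
    (c_min, c_max), (a_min, a_max) = confidence, allowed
    return c_max < a_min or c_min > a_max

# Toy per-record bias contributions observed in the window D_win.
window = [0.28, 0.31, 0.25, 0.33, 0.29, 0.30, 0.27, 0.32]
C = normal_bootstrap_interval(window)
print(C, raises_alert(C, (-0.1, 0.1)))  # C sits well above A, so an alert is raised
```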

## Model Monitor Sample Notebook
<a name="clarify-model-monitor-sample-notebooks-bias-drift"></a>

Amazon SageMaker Clarify provides the following sample notebook that shows how to capture inference data for a real-time endpoint, create a baseline to monitor evolving bias against, and inspect the results: 
+ [Monitoring bias drift and feature attribution drift with Amazon SageMaker Clarify](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/fairness_and_explainability/SageMaker-Model-Monitor-Fairness-and-Explainability.html) – Use Amazon SageMaker Model Monitor to monitor bias drift and feature attribution drift over time.

This notebook has been verified to run in Amazon SageMaker Studio only. If you need instructions on how to open a notebook in Amazon SageMaker Studio, see [Create or Open an Amazon SageMaker Studio Classic Notebook](notebooks-create-open.md). If you're prompted to choose a kernel, choose **Python 3 (Data Science)**. The following topics highlight the last two steps and contain code examples from the sample notebook. 

**Topics**
+ [Model Monitor Sample Notebook](#clarify-model-monitor-sample-notebooks-bias-drift)
+ [Create a Bias Drift Baseline](clarify-model-monitor-bias-drift-baseline.md)
+ [Bias Drift Violations](clarify-model-monitor-bias-drift-violations.md)
+ [Parameters to Monitor Bias Drift](clarify-config-json-monitor-bias-parameters.md)
+ [Schedule Bias Drift Monitoring Jobs](clarify-model-monitor-bias-drift-schedule.md)
+ [Inspect Reports for Data Bias Drift](clarify-model-monitor-bias-drift-report.md)
+ [CloudWatch Metrics for Bias Drift Analysis](clarify-model-monitor-bias-drift-cw.md)

# Create a Bias Drift Baseline
<a name="clarify-model-monitor-bias-drift-baseline"></a>

After you have configured your application to capture real-time or batch transform inference data, the first task in monitoring for bias drift is to create a baseline. This involves configuring the data inputs, specifying which groups are sensitive, defining how the predictions are captured, and configuring the model and its post-training bias metrics. Then you need to start the baselining job.

Model bias monitor can detect bias drift of ML models on a regular basis. Similar to the other monitoring types, the standard procedure of creating a model bias monitor is first baselining and then establishing a monitoring schedule.

```
model_bias_monitor = ModelBiasMonitor(
    role=role,
    sagemaker_session=sagemaker_session,
    max_runtime_in_seconds=1800,
)
```

`DataConfig` stores information about the dataset to be analyzed (for example, the dataset file), its format (that is, CSV or JSON Lines), headers (if any) and label.

```
model_bias_baselining_job_result_uri = f"{baseline_results_uri}/model_bias"
model_bias_data_config = DataConfig(
    s3_data_input_path=validation_dataset,
    s3_output_path=model_bias_baselining_job_result_uri,
    label=label_header,
    headers=all_headers,
    dataset_type=dataset_type,
)
```

`BiasConfig` is the configuration of the sensitive groups in the dataset. Typically, bias is measured by computing a metric and comparing it across groups. The group of interest is called the *facet*. For post-training bias, you should also take the positive label into account.

```
model_bias_config = BiasConfig(
    label_values_or_threshold=[1],
    facet_name="Account Length",
    facet_values_or_threshold=[100],
)
```

`ModelPredictedLabelConfig` specifies how to extract a predicted label from the model output. In this example, the 0.8 cutoff has been chosen in anticipation that customers will churn frequently. For more complicated outputs, there are a few more options, such as `label`, which is the index, name, or JMESPath used to locate the predicted label in the endpoint response payload.

```
model_predicted_label_config = ModelPredictedLabelConfig(
    probability_threshold=0.8,
)
```

`ModelConfig` is the configuration of the model used for inference. To compute post-training bias metrics, the computation needs to get inferences from the model name provided. To accomplish this, the processing job uses the model to create an ephemeral endpoint (also known as a *shadow endpoint*). The processing job deletes the shadow endpoint after the computations are completed. This configuration is also used by the explainability monitor.

```
model_config = ModelConfig(
    model_name=model_name,
    instance_count=endpoint_instance_count,
    instance_type=endpoint_instance_type,
    content_type=dataset_type,
    accept_type=dataset_type,
)
```

Now you can start the baselining job.

```
model_bias_monitor.suggest_baseline(
    model_config=model_config,
    data_config=model_bias_data_config,
    bias_config=model_bias_config,
    model_predicted_label_config=model_predicted_label_config,
)
print(f"ModelBiasMonitor baselining job: {model_bias_monitor.latest_baselining_job_name}")
```

The scheduled monitor automatically picks up the baselining job name and waits for the job to finish before monitoring begins.

# Bias Drift Violations
<a name="clarify-model-monitor-bias-drift-violations"></a>

Bias drift jobs evaluate the baseline constraints provided by the [baseline configuration](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelBiasJobDefinition.html#sagemaker-CreateModelBiasJobDefinition-request-ModelBiasBaselineConfig) against the analysis results of the current `MonitoringExecution`. If violations are detected, the job lists them in the *constraint\_violations.json* file in the execution output location and marks the execution status accordingly. For more information, see [Interpret results](model-monitor-interpreting-results.md).

Here is the schema of the bias drift violations file.
+ `facet` – The name of the facet, provided by the monitoring job analysis configuration facet `name_or_index`. 
+ `facet_value` – The value of the facet, provided by the monitoring job analysis configuration facet `value_or_threshold`.
+ `metric_name` – The short name of the bias metric. For example, "CI" for class imbalance. See [Pre-training Bias Metrics](clarify-measure-data-bias.md) for the short names of each of the pre-training bias metrics and [Post-training Data and Model Bias Metrics](clarify-measure-post-training-bias.md) for the short names of each of the post-training bias metrics.
+ `constraint_check_type` – The type of violation monitored. Currently only `bias_drift_check` is supported.
+ `description` – A descriptive message to explain the violation.

```
{
    "version": "1.0",
    "violations": [{
        "facet": "string",
        "facet_value": "string",
        "metric_name": "string",
        "constraint_check_type": "string",
        "description": "string"
    }]
}
```

A bias metric is used to measure the level of equality in a distribution. A value close to zero indicates that the distribution is more balanced. If the value of a bias metric in the job analysis results file (analysis.json) is worse than its corresponding value in the baseline constraints file, a violation is logged. As an example, if the baseline constraint for the DPPL bias metric is `0.2`, and the analysis result is `0.1`, no violation is logged because `0.1` is closer to `0` than `0.2`. However, if the analysis result is `-0.3`, a violation is logged because it is farther from `0` than the baseline constraint of `0.2`.

```
{
    "version": "1.0",
    "violations": [{
        "facet": "Age",
        "facet_value": "40",
        "metric_name": "CI",
        "constraint_check_type": "bias_drift_check",
        "description": "Value 0.0751544567666083 does not meet the constraint requirement"
    }, {
        "facet": "Age",
        "facet_value": "40",
        "metric_name": "DPPL",
        "constraint_check_type": "bias_drift_check",
        "description": "Value -0.0791244970125596 does not meet the constraint requirement"
    }]
}
```

# Parameters to Monitor Bias Drift
<a name="clarify-config-json-monitor-bias-parameters"></a>

Amazon SageMaker Clarify bias monitoring reuses a subset of the parameters used in the analysis configuration of [Analysis Configuration Files](clarify-processing-job-configure-analysis.md). After describing the configuration parameters, this topic provides examples of JSON files. These files are used to configure CSV and JSON Lines datasets to monitor them for bias drift when machine learning models are in production.

The following parameters must be provided in a JSON file. The path to this JSON file must be provided in the `ConfigUri` parameter of the [ModelBiasAppSpecification](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelBiasAppSpecification.html) API.
+ `"version"` – (Optional) Schema version of the configuration file. If not provided, the latest supported version is used.
+ `"headers"` – (Optional) A list of column names in the dataset. If the `dataset_type` is `"application/jsonlines"` and `"label"` is specified, then the last header becomes the header of the label column. 
+ `"label"` – (Optional) Target attribute for the model to be used for *bias metrics*. Specified either as a column name, or an index (if dataset format is CSV), or as a JMESPath (if dataset format is JSON Lines).
+ `"label_values_or_threshold"` – (Optional) List of label values or threshold. Indicates positive outcome used for bias metrics.
+ `"facet"` – (Optional) A list of features that are sensitive attributes, referred to as facets. Facets are used for *bias metrics* in the form of pairs, and include the following:
  + `"name_or_index"` – Facet column name or index.
  + `"value_or_threshold"` – (Optional) List of values or threshold that the facet column can take. Indicates the sensitive group, that is, the group used to measure bias against. If not provided, bias metrics are computed with each unique value treated as its own group (rather than all values as one group). If the facet column is numeric, this threshold value is applied as the lower bound to select the sensitive group.
+ `"group_variable"` – (Optional) A column name or index to indicate the group variable to be used for the *bias metric* *Conditional Demographic Disparity.*

The other parameters should be provided in `EndpointInput` (for real-time endpoints) or `BatchTransformInput` (for batch transform jobs) of the [ModelBiasJobInput](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelBiasJobInput.html) API.
+ `FeaturesAttribute` – This parameter is required if endpoint input data format is `"application/jsonlines"`. It is the JMESPath used to locate the feature columns if the dataset format is JSON Lines.
+ `InferenceAttribute` – Index or JMESPath location in the model output for the target attribute to be monitored for bias using bias metrics. If it is not provided in the CSV `accept_type` case, then it is assumed that the model output is a single numeric value corresponding to a score or probability.
+ `ProbabilityAttribute` – Index or JMESPath location in the model output for probabilities. If the model output is JSON Lines with a list of labels and probabilities, for example, then the label that corresponds to the maximum probability is selected for bias computations.
+ `ProbabilityThresholdAttribute` – (Optional) A float value to indicate the threshold to select the binary label, in the case of binary classification. The default value is 0.5.

## Example JSON Configuration Files for CSV and JSON Lines Datasets
<a name="clarify-config-json-monitor-bias-parameters-examples"></a>

Here are examples of the JSON files used to configure CSV and JSON Lines datasets to monitor them for bias drift.

**Topics**
+ [CSV Datasets](#clarify-config-json-monitor-bias-parameters-example-csv)
+ [JSON Lines Datasets](#clarify-config-json-monitor-bias-parameters-example-jsonlines)

### CSV Datasets
<a name="clarify-config-json-monitor-bias-parameters-example-csv"></a>

Consider a dataset that has four feature columns and one label column, where the first feature and the label are binary, as in the following example.

```
0, 0.5814568701544718, 0.6651538910132964, 0.3138080342665499, 0
1, 0.6711642728531724, 0.7466687034026017, 0.1215477472819713, 1
0, 0.0453256543003371, 0.6377430803264152, 0.3558625219713576, 1
1, 0.4785191813363956, 0.0265841045263860, 0.0376935084990697, 1
```

Assume that the model output has two columns, where the first one is the predicted label and the second one is the probability, as in the following example.

```
1, 0.5385257417814224
```

Then the following JSON configuration file shows an example of how this CSV dataset can be configured.

```
{
    "headers": [
        "feature_0",
        "feature_1",
        "feature_2",
        "feature_3",
        "target"
    ],
    "label": "target",
    "label_values_or_threshold": [1],
    "facet": [{
        "name_or_index": "feature_1",
        "value_or_threshold": [1]
    }]
}
```

The predicted label is selected by the `"InferenceAttribute"` parameter. Zero-based numbering is used, so 0 indicates the first column of the model output.

```
"EndpointInput": {
    ...
    "InferenceAttribute": 0
    ...
}
```

Alternatively, you can use different parameters to convert probability values to binary predicted labels. Zero-based numbering is used: 1 indicates the second column; the `ProbabilityThresholdAttribute` value of 0.6 indicates that a probability greater than 0.6 predicts the binary label as 1.

```
"EndpointInput": {
    ...
    "ProbabilityAttribute": 1,
    "ProbabilityThresholdAttribute": 0.6
    ...
}
```

### JSON Lines Datasets
<a name="clarify-config-json-monitor-bias-parameters-example-jsonlines"></a>

Consider a dataset that has four feature columns and one label column, where the first feature and the label are binary, as in the following example.

```
{"features":[0, 0.5814568701544718, 0.6651538910132964, 0.3138080342665499], "label":0}
{"features":[1, 0.6711642728531724, 0.7466687034026017, 0.1215477472819713], "label":1}
{"features":[0, 0.0453256543003371, 0.6377430803264152, 0.3558625219713576], "label":1}
{"features":[1, 0.4785191813363956, 0.0265841045263860, 0.0376935084990697], "label":1}
```

Assume that the model output has two columns, where the first is a predicted label and the second is a probability.

```
{"predicted_label":1, "probability":0.5385257417814224}
```

The following JSON configuration file shows an example of how this JSON Lines dataset can be configured.

```
{
    "headers": [
        "feature_0",
        "feature_1",
        "feature_2",
        "feature_3",
        "target"
    ],
    "label": "label",
    "label_values_or_threshold": [1],
    "facet": [{
        "name_or_index": "feature_1",
        "value_or_threshold": [1]
    }]
}
```

Then, the `FeaturesAttribute` parameter value `"features"` in `EndpointInput` (for real-time endpoints) or `BatchTransformInput` (for batch transform jobs) is used to locate the features in the dataset, and the `InferenceAttribute` parameter value `"predicted_label"` selects the predicted label from the model output. 

```
"EndpointInput": {
    ...
    "FeaturesAttribute": "features",
    "InferenceAttribute": "predicted_label"
    ...
}
```

Alternatively, you can convert probability values to predicted binary labels using the `ProbabilityThresholdAttribute` parameter value. A value of 0.6, for example, indicates that a probability greater than 0.6 predicts the binary label as 1.

```
"EndpointInput": {
    ...
    "FeaturesAttribute": "features",
    "ProbabilityAttribute": "probability",
    "ProbabilityThresholdAttribute": 0.6
    ...
}
```

# Schedule Bias Drift Monitoring Jobs
<a name="clarify-model-monitor-bias-drift-schedule"></a>

After you create your baseline, you can call the `create_monitoring_schedule()` method of your `ModelBiasModelMonitor` class instance to schedule an hourly bias drift monitor. The following sections show you how to create a bias drift monitor for a model deployed to a real-time endpoint, as well as for a batch transform job.

**Important**  
You can specify either a batch transform input or an endpoint input, but not both, when you create your monitoring schedule.

Unlike data quality monitoring, model quality monitoring requires you to supply Ground Truth labels. However, Ground Truth labels can be delayed. To address this, specify offsets when you create your monitoring schedule. For details about how to create time offsets, see [Model monitor offsets](model-monitor-model-quality-schedule.md#model-monitor-model-quality-schedule-offsets). 

If you submitted a baselining job, the monitor automatically picks up the analysis configuration from the baselining job. If you skipped the baselining step, or if the captured dataset differs in nature from the training dataset, you must provide the analysis configuration.

## Bias drift monitoring for models deployed to real-time endpoint
<a name="model-monitor-bias-quality-rt"></a>

To schedule a bias drift monitor for a real-time endpoint, pass your `EndpointInput` instance to the `endpoint_input` argument of your `ModelBiasModelMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator
            
model_bias_monitor = ModelBiasModelMonitor(
    role=sagemaker.get_execution_role(),
    ...
)

model_bias_analysis_config = None
if not model_bias_monitor.latest_baselining_job:
    model_bias_analysis_config = BiasAnalysisConfig(
        model_bias_config,
        headers=all_headers,
        label=label_header,
    )

model_bias_monitor.create_monitoring_schedule(
    monitor_schedule_name=schedule_name,
    post_analytics_processor_script=s3_code_postprocessor_uri,
    output_s3_uri=s3_report_path,
    statistics=model_bias_monitor.baseline_statistics(),
    constraints=model_bias_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
    analysis_config=model_bias_analysis_config,
    endpoint_input=EndpointInput(
        endpoint_name=endpoint_name,
        destination="/opt/ml/processing/input/endpoint",
        start_time_offset="-PT1H",
        end_time_offset="-PT0H",
        probability_threshold_attribute=0.8,
    ),
)
```

## Bias drift monitoring for batch transform jobs
<a name="model-monitor-bias-quality-bt"></a>

To schedule a bias drift monitor for a batch transform job, pass your `BatchTransformInput` instance to the `batch_transform_input` argument of your `ModelBiasModelMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator
                
model_bias_monitor = ModelBiasModelMonitor(
    role=sagemaker.get_execution_role(),
    ...
)

model_bias_analysis_config = None
if not model_bias_monitor.latest_baselining_job:
    model_bias_analysis_config = BiasAnalysisConfig(
        model_bias_config,
        headers=all_headers,
        label=label_header,
    )
    
schedule = model_bias_monitor.create_monitoring_schedule(
   monitor_schedule_name=schedule_name,
   post_analytics_processor_script=s3_code_postprocessor_uri,
   output_s3_uri=s3_report_path,
   statistics=model_bias_monitor.baseline_statistics(),
   constraints=model_bias_monitor.suggested_constraints(),
   schedule_cron_expression=CronExpressionGenerator.hourly(),
   enable_cloudwatch_metrics=True,
   analysis_config=model_bias_analysis_config,
   batch_transform_input=BatchTransformInput(
        destination="opt/ml/processing/input",
        data_captured_destination_s3_uri=s3_capture_path,
        start_time_offset="-PT1H",
        end_time_offset="-PT0H",
        probability_threshold_attribute=0.8
   ),
)
```

# Inspect Reports for Data Bias Drift
<a name="clarify-model-monitor-bias-drift-report"></a>

If you can't inspect the monitoring results in the generated reports in SageMaker Studio, you can print them out as follows:

```
schedule_desc = model_bias_monitor.describe_schedule()
execution_summary = schedule_desc.get("LastMonitoringExecutionSummary")
if execution_summary and execution_summary["MonitoringExecutionStatus"] in ["Completed", "CompletedWithViolations"]:
    last_model_bias_monitor_execution = model_bias_monitor.list_executions()[-1]
    last_model_bias_monitor_execution_report_uri = last_model_bias_monitor_execution.output.destination
    print(f'Report URI: {last_model_bias_monitor_execution_report_uri}')
    last_model_bias_monitor_execution_report_files = sorted(S3Downloader.list(last_model_bias_monitor_execution_report_uri))
    print("Found Report Files:")
    print("\n ".join(last_model_bias_monitor_execution_report_files))
else:
    last_model_bias_monitor_execution = None
    print("====STOP==== \n No completed executions to inspect further. Please wait till an execution completes or investigate previously reported failures.")
```

 If there are violations compared to the baseline, they are listed here:

```
if last_model_bias_monitor_execution:
    model_bias_violations = last_model_bias_monitor_execution.constraint_violations()
    if model_bias_violations:
        print(model_bias_violations.body_dict)
```

If your model is deployed to a real-time endpoint, you can see visualizations in SageMaker AI Studio of the analysis results and CloudWatch metrics by choosing the **Endpoints** tab, and then double-clicking the endpoint.

# CloudWatch Metrics for Bias Drift Analysis
<a name="clarify-model-monitor-bias-drift-cw"></a>

This guide shows CloudWatch metrics and their properties that you can use for bias drift analysis in SageMaker Clarify. Bias drift monitoring jobs compute both [pre-training bias metrics](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-data-bias.html) and [post-training bias metrics](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-post-training-bias.html), and publish them to the following CloudWatch namespace:
+ For real-time endpoints: `aws/sagemaker/Endpoints/bias-metrics`
+ For batch transform jobs: `aws/sagemaker/ModelMonitoring/bias-metrics` 

The CloudWatch metric name appends the metric's short name to `bias_metric`.

For example, `bias_metric_CI` is the bias metric for class imbalance (CI).
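As an illustration of what a metric like `bias_metric_CI` reports, class imbalance can be computed locally; this is a sketch of the published formula, and the function name is illustrative, not part of any SDK:

```python
# Class imbalance (CI) measures how unevenly the facet groups are represented:
# CI = (n_a - n_d) / (n_a + n_d), where n_a and n_d are the number of members
# of the advantaged and disadvantaged groups, respectively.
def class_imbalance(n_advantaged, n_disadvantaged):
    return (n_advantaged - n_disadvantaged) / (n_advantaged + n_disadvantaged)

print(class_imbalance(900, 100))  # 0.8 -- strongly imbalanced toward the advantaged group
```

A value near 0 indicates balanced groups; values near +1 or -1 indicate severe imbalance.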

**Note**  
`+/- infinity` is published as the floating point number `+/- 2.348543e108`, and errors including null values are not published.

Each metric has the following properties:
+ `Endpoint`: The name of the monitored endpoint, if applicable.
+ `MonitoringSchedule`: The name of the schedule for the monitoring job. 
+ `BiasStage`: The stage of the bias drift monitoring job, either `Pre-training` or `Post-training`.
+ `Label`: The name of the target feature, provided by the monitoring job analysis configuration `label`.
+ `LabelValue`: The value of the target feature, provided by the monitoring job analysis configuration `label_values_or_threshold`.
+ `Facet`: The name of the facet, provided by the monitoring job analysis configuration facet `name_or_index`.
+ `FacetValue`: The value of the facet, provided by the monitoring job analysis configuration facet `value_or_threshold`.

To stop the monitoring jobs from publishing metrics, set `publish_cloudwatch_metrics` to `Disabled` in the `Environment` map of the [model bias job definition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelBiasJobDefinition.html).

# Feature attribution drift for models in production
<a name="clarify-model-monitor-feature-attribution-drift"></a>

A drift in the distribution of live data for models in production can result in a corresponding drift in the feature attribution values, just as it could cause a drift in bias when monitoring bias metrics. Amazon SageMaker Clarify feature attribution monitoring helps data scientists and ML engineers monitor predictions for feature attribution drift on a regular basis. As the model is monitored, customers can view exportable reports and graphs detailing feature attributions in SageMaker Studio and configure alerts in Amazon CloudWatch to receive notifications if the attribution values drift beyond a set threshold.

To illustrate this with a specific situation, consider a hypothetical scenario for college admissions. Assume that we observe the following (aggregated) feature attribution values in the training data and in the live data:

**College Admission Hypothetical Scenario**


| Feature | Attribution in training data | Attribution in live data | 
| --- | --- | --- | 
| SAT score | 0.70 | 0.10 | 
| GPA | 0.50 | 0.20 | 
| Class rank | 0.05 | 0.70 | 

The change from training data to live data appears significant. The feature ranking has completely reversed. Similar to the bias drift, the feature attribution drifts might be caused by a change in the live data distribution and warrant a closer look into the model behavior on the live data. Again, the first step in these scenarios is to raise an alarm that a drift has happened.

We can detect the drift by comparing how the ranking of the individual features changed from training data to live data. In addition to being sensitive to changes in ranking order, we also want to be sensitive to the raw attribution score of the features. For instance, given two features that fall in the ranking by the same number of positions going from training to live data, we want to be more sensitive to the feature that had a higher attribution score in the training data. With these properties in mind, we use the Normalized Discounted Cumulative Gain (NDCG) score for comparing the feature attributions rankings of training and live data.

Specifically, assume we have the following:
+ *F* = [*f1*, …, *fm*] is the list of features sorted with respect to their attribution scores in the training data, where *m* is the total number of features. For instance, in our case, *F* = [SAT score, GPA, Class rank].
+ *a(f)* is a function that returns the feature attribution score on the training data given a feature *f*. For example, *a*(SAT score) = 0.70.
+ *F′* = [*f′1*, …, *f′m*] is the list of features sorted with respect to their attribution scores in the live data. For example, *F′* = [Class rank, GPA, SAT score].

Then, we can compute the NDCG as:

        NDCG = DCG / iDCG

with 
+ DCG = ∑*i*=1..*m* *a*(*f′i*) / log2(*i* + 1)
+ iDCG = ∑*i*=1..*m* *a*(*fi*) / log2(*i* + 1)

The quantity DCG measures whether features with high attribution in the training data are also ranked higher in the feature attribution computed on the live data. The quantity iDCG measures the *ideal score* and it's just a normalizing factor to ensure that the final quantity resides in the range [0, 1], with 1 being the best possible value. A NDCG value of 1 means that the feature attribution ranking in the live data is the same as the one in the training data. In this particular example, because the ranking changed by quite a bit, the NDCG value is 0.69.

In SageMaker Clarify, if the NDCG value is below 0.90, we automatically raise an alert.
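The NDCG computation for the college admissions example can be sketched in a few lines of Python; this is a pure illustration of the formula, not SageMaker Clarify code:

```python
import math

# Aggregated feature attributions observed in the training data
train_attr = {"SAT score": 0.70, "GPA": 0.50, "Class rank": 0.05}
# Feature ranking observed in the live data (highest attribution first)
live_order = ["Class rank", "GPA", "SAT score"]

def ndcg(train_attr, live_order):
    # DCG: training attributions discounted by their position in the live ranking
    dcg = sum(train_attr[f] / math.log2(i + 1) for i, f in enumerate(live_order, start=1))
    # iDCG: the same sum over the ideal (training) ranking, used for normalization
    ideal = sorted(train_attr, key=train_attr.get, reverse=True)
    idcg = sum(train_attr[f] / math.log2(i + 1) for i, f in enumerate(ideal, start=1))
    return dcg / idcg

print(round(ndcg(train_attr, live_order), 2))  # 0.69, below the 0.90 alert threshold
```

Because the ranking fully reversed, the high-attribution training features receive heavy positional discounts in the live ranking, which pulls the score well below 1.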

## Model Monitor Example Notebook
<a name="clarify-model-monitor-sample-notebooks-feature-drift"></a>

SageMaker Clarify provides the following example notebook that shows how to capture inference data for a real-time endpoint, create a baseline to monitor evolving bias against, and inspect the results: 
+ [Monitoring bias drift and feature attribution drift with Amazon SageMaker Clarify](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/fairness_and_explainability/SageMaker-Model-Monitor-Fairness-and-Explainability.html) – Use Amazon SageMaker Model Monitor to monitor bias drift and feature attribution drift over time.

This notebook has been verified to run in SageMaker Studio only. If you need instructions on how to open a notebook in SageMaker Studio, see [Create or Open an Amazon SageMaker Studio Classic Notebook](notebooks-create-open.md). If you're prompted to choose a kernel, choose **Python 3 (Data Science)**. The following topics contain the highlights from the last two steps, and they contain code examples from the example notebook. 

**Topics**
+ [Model Monitor Example Notebook](#clarify-model-monitor-sample-notebooks-feature-drift)
+ [Create a SHAP Baseline for Models in Production](clarify-model-monitor-shap-baseline.md)
+ [Model Feature Attribution Drift Violations](clarify-model-monitor-model-attribution-drift-violations.md)
+ [Parameters to Monitor Attribution Drift](clarify-config-json-monitor-model-explainability-parameters.md)
+ [Schedule Feature Attribute Drift Monitoring Jobs](clarify-model-monitor-feature-attribute-drift-schedule.md)
+ [Inspect Reports for Feature Attribute Drift in Production Models](clarify-feature-attribute-drift-report.md)
+ [CloudWatch Metrics for Feature Drift Analysis](clarify-feature-attribute-drift-cw.md)

# Create a SHAP Baseline for Models in Production
<a name="clarify-model-monitor-shap-baseline"></a>

Explanations are typically contrastive, that is, they account for deviations from a baseline. For information on explainability baselines, see [SHAP Baselines for Explainability](clarify-feature-attribute-shap-baselines.md).

In addition to providing explanations for per-instance inferences, SageMaker Clarify also supports global explanation for ML models that helps you understand the behavior of a model as a whole in terms of its features. SageMaker Clarify generates a global explanation of an ML model by aggregating the Shapley values over multiple instances. SageMaker Clarify supports the following different ways of aggregation, which you can use to define baselines:
+ `mean_abs` – Mean of absolute SHAP values for all instances.
+ `median` – Median of SHAP values for all instances.
+ `mean_sq` – Mean of squared SHAP values for all instances.
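The three aggregations can be reproduced with NumPy on a hypothetical matrix of local SHAP values (rows are instances, columns are features); the matrix here is made up for illustration:

```python
import numpy as np

# Hypothetical local SHAP values: 3 instances x 2 features
shap_values = np.array([
    [ 0.3, -0.1],
    [-0.2,  0.4],
    [ 0.1, -0.3],
])

mean_abs = np.mean(np.abs(shap_values), axis=0)  # ignores sign; overall magnitude
median = np.median(shap_values, axis=0)          # robust to outlier instances
mean_sq = np.mean(shap_values ** 2, axis=0)      # emphasizes large attributions
```

The choice matters when attributions have mixed signs: `median` can cancel opposing contributions to near zero, while `mean_abs` and `mean_sq` treat them as equally important.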

After you have configured your application to capture real-time or batch transform inference data, the first task in monitoring for drift in feature attribution is to create a baseline to compare against. This involves configuring the data inputs, which groups are sensitive, how the predictions are captured, and the model and its post-training bias metrics. Then you need to start the baselining job. The model explainability monitor can explain the predictions of a deployed model that's producing inferences and detect feature attribution drift on a regular basis.

```
model_explainability_monitor = ModelExplainabilityMonitor(
    role=role,
    sagemaker_session=sagemaker_session,
    max_runtime_in_seconds=1800,
)
```

In this example, the explainability baselining job shares the test dataset with the bias baselining job, so it uses the same `DataConfig`, and the only difference is the job output URI.

```
model_explainability_baselining_job_result_uri = f"{baseline_results_uri}/model_explainability"
model_explainability_data_config = DataConfig(
    s3_data_input_path=validation_dataset,
    s3_output_path=model_explainability_baselining_job_result_uri,
    label=label_header,
    headers=all_headers,
    dataset_type=dataset_type,
)
```

Currently the SageMaker Clarify explainer offers a scalable and efficient implementation of SHAP, so the explainability config is `SHAPConfig`, which includes the following:
+ `baseline` – A list of rows (at least one) or S3 object URI to be used as the baseline dataset in the Kernel SHAP algorithm. The format should be the same as the dataset format. Each row should contain only the feature columns/values and omit the label column/values.
+ `num_samples` – Number of samples to be used in the Kernel SHAP algorithm. This number determines the size of the generated synthetic dataset to compute the SHAP values.
+ `agg_method` – Aggregation method for global SHAP values. The following are valid values:
  + `mean_abs` – Mean of absolute SHAP values for all instances.
  + `median` – Median of SHAP values for all instances.
  + `mean_sq` – Mean of squared SHAP values for all instances.
+ `use_logit` – Indicator of whether the logit function is to be applied to the model predictions. Default is `False`. If `use_logit` is `True`, the SHAP values will have log-odds units.
+ `save_local_shap_values` (bool) – Indicator of whether to save the local SHAP values in the output location. Default is `False`.

```
# Here use the mean value of test dataset as SHAP baseline
test_dataframe = pd.read_csv(test_dataset, header=None)
shap_baseline = [list(test_dataframe.mean())]

shap_config = SHAPConfig(
    baseline=shap_baseline,
    num_samples=100,
    agg_method="mean_abs",
    save_local_shap_values=False,
)
```

Start a baselining job. The same `model_config` is required because the explainability baselining job needs to create a shadow endpoint to get predictions for the generated synthetic dataset.

```
model_explainability_monitor.suggest_baseline(
    data_config=model_explainability_data_config,
    model_config=model_config,
    explainability_config=shap_config,
)
print(f"ModelExplainabilityMonitor baselining job: {model_explainability_monitor.latest_baselining_job_name}")
```

# Model Feature Attribution Drift Violations
<a name="clarify-model-monitor-model-attribution-drift-violations"></a>

Feature attribution drift jobs evaluate the baseline constraints provided by the [baseline configuration](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelExplainabilityJobDefinition.html#sagemaker-CreateModelExplainabilityJobDefinition-request-ModelExplainabilityBaselineConfig) against the analysis results of the current `MonitoringExecution`. If violations are detected, the job lists them in the *constraint\_violations.json* file in the execution output location and marks the execution status accordingly. For more information, see [Interpret results](model-monitor-interpreting-results.md).

Here is the schema of the feature attribution drift violations file.
+ `label` – The name of the label, provided by the job analysis configuration `label_headers`, or a placeholder such as `"label0"`.
+ `metric_name` – The name of the explainability analysis method. Currently only `shap` is supported.
+ `constraint_check_type` – The type of violation monitored. Currently only `feature_attribution_drift_check` is supported.
+ `description` – A descriptive message to explain the violation.

```
{
    "version": "1.0",
    "violations": [{
        "label": "string",
        "metric_name": "string",
        "constraint_check_type": "string",
        "description": "string"
    }]
}
```

For each label in the `explanations` section, the monitoring jobs calculate the [nDCG score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ndcg_score.html) of its global SHAP values in the baseline constraints file and in the job analysis results file (*analysis.json*). If the score is less than 0.9, then a violation is logged. The combined global SHAP value is evaluated, so there are no `"feature"` fields in the violation entry. The following output provides an example of several logged violations.

```
{
    "version": "1.0",
    "violations": [{
        "label": "label0",
        "metric_name": "shap",
        "constraint_check_type": "feature_attribution_drift_check",
        "description": "Feature attribution drift 0.7639720923277322 exceeds threshold 0.9"
    }, {
        "label": "label1",
        "metric_name": "shap",
        "constraint_check_type": "feature_attribution_drift_check",
        "description": "Feature attribution drift 0.7323763972092327 exceeds threshold 0.9"
    }]
}
```

# Parameters to Monitor Attribution Drift
<a name="clarify-config-json-monitor-model-explainability-parameters"></a>

The Amazon SageMaker Clarify explainability monitor reuses a subset of the parameters of the analysis configuration described in [Analysis Configuration Files](clarify-processing-job-configure-analysis.md). The following parameters must be provided in a JSON file, and the path must be provided in the `ConfigUri` parameter of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelExplainabilityAppSpecification](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelExplainabilityAppSpecification).
+ `"version"` – (Optional) Schema version of the configuration file. If not provided, the latest supported version is used.
+ `"headers"` – (Optional) A list of feature names in the dataset. Explainability analysis does not require labels. 
+ `"methods"` – A list of methods and their parameters for the analyses and reports. If any section is omitted, then it is not computed.
  + `"shap"` – (Optional) Section on SHAP value computation.
    + `"baseline"` – (Optional) A list of rows (at least one), or an Amazon Simple Storage Service Amazon S3 object URI. To be used as the baseline dataset (also known as a background dataset) in the Kernel SHAP algorithm. The format should be the same as the dataset format. Each row should contain only the feature columns (or values). Before you send each row to the model, omit any column that must be excluded.
    + `"num_samples"` – Number of samples to be used in the Kernel SHAP algorithm. This number determines the size of the generated synthetic dataset to compute the SHAP values. If not provided, then a SageMaker Clarify job chooses the value based on a count of features.
    + `"agg_method"` – Aggregation method for global SHAP values. Valid values are as follows:
      + `"mean_abs"` – Mean of absolute SHAP values for all instances.
      + `"median"` – Median of SHAP values for all instances.
      + `"mean_sq"` – Mean of squared SHAP values for all instances.
    + `"use_logit"` – (Optional) Boolean value to indicate if the logit function is to be applied to the model predictions. If `"use_logit"` is `true`, then the SHAP values have log-odds units. The default value is `false`. 
    + `"save_local_shap_values"` – (Optional) Boolean value to indicate if local SHAP values are to be saved in the output location. Use `true` to save them. Use `false` to not save them. The default is `false`.
+ `"predictor"` – (Optional for real-time endpoint, required for batch transform) Section on model parameters, required if `"shap"` and `"post_training_bias"` sections are present.
  + `"model_name"` – Model name created by `CreateModel` API, with container mode as `SingleModel`.
  + `"instance_type"` – Instance type for the shadow endpoint.
  + `"initial_instance_count"` – Instance count for the shadow endpoint.
  + `"content_type"` – (Optional) The model input format to be used for getting inferences with the shadow endpoint. Valid values are `"text/csv"` for CSV, `"application/jsonlines"` for JSON Lines, `application/x-parquet` for Apache Parquet, and `application/x-image` to enable Computer Vision explainability. The default value is the same as the `dataset_type` format.
  + `"accept_type"` – (Optional) The model *output* format to be used for getting inferences with the shadow endpoint. Valid values are `"text/csv"` for CSV, `"application/jsonlines"` for JSON Lines. If omitted, SageMaker Clarify uses the response data type of the captured data.
  + `"content_template"` – (Optional) A template string used to construct the model input from dataset instances. It is only used when `"content_type"` is `"application/jsonlines"`. The template should have only one placeholder, `$features`, which is replaced by the features list at runtime. For example, given `"content_template":"{\"myfeatures\":$features}"`, if an instance (no label) is `1,2,3`, then model input becomes JSON Lines `'{"myfeatures":[1,2,3]}'`.
  + `"label_headers"` – (Optional) A list of values that the `"label"` takes in the dataset. Associates the scores returned by the model endpoint or batch transform job with their corresponding label values. If it is provided, then the analysis report uses the headers instead of placeholders like `“label0”`.

The other parameters should be provided in `EndpointInput` (for real-time endpoints) or `BatchTransformInput` (for batch transform jobs) of the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelExplainabilityJobInput](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelExplainabilityJobInput) API.
+ `FeaturesAttribute` – This parameter is required if the endpoint or batch job input data format is `"application/jsonlines"`. It is the JMESPath expression used to locate the feature columns if the dataset format is JSON Lines.
+ `ProbabilityAttribute` – The index or JMESPath location in the model output for probabilities. If the model output is JSON Lines with a list of labels and probabilities, for example, then the label that corresponds to the maximum probability is selected for bias computations.

## Example JSON Configuration Files for CSV and JSON Lines Datasets
<a name="clarify-config-json-monitor-model-explainability-parameters-examples"></a>

Here are examples of the JSON files used to configure CSV and JSON Lines datasets to monitor them for feature attribution drift.

**Topics**
+ [CSV Datasets](#clarify-config-json-monitor-model-explainability-parameters-example-csv)
+ [JSON Lines Datasets](#clarify-config-json-monitor-model-explainability-parameters-example-jsonlines)

### CSV Datasets
<a name="clarify-config-json-monitor-model-explainability-parameters-example-csv"></a>

Consider a dataset that has three numerical feature columns, as in the following example.

```
0.5814568701544718, 0.6651538910132964, 0.3138080342665499
0.6711642728531724, 0.7466687034026017, 0.1215477472819713
0.0453256543003371, 0.6377430803264152, 0.3558625219713576
0.4785191813363956, 0.0265841045263860, 0.0376935084990697
```

Assume that the model output has two columns, where the first one is the predicted label and the second one is the probability, as in the following example.

```
1, 0.5385257417814224
```

The following example JSON configuration file shows how this CSV dataset can be configured.

```
{
                    
    "headers": [
        "feature_1",
        "feature_2",
        "feature_3"
    ],
    "methods": {
        "shap": {
            "baseline": [
                [0.4441164946610942, 0.5190374448171748, 0.20722795300473712]
            ],
            "num_samples": 100,
            "agg_method": "mean_abs"
        }
    },
    "predictor": {
        "model_name": "my_model",
        "instance_type": "ml.m5.xlarge",
        "initial_instance_count": 1
    }
}
```

The probability is selected by the `"ProbabilityAttribute"` parameter. Zero-based numbering is used, so 1 indicates the second column of the model output.

```
"EndpointInput": {
    ...
    "ProbabilityAttribute": 1
    ...
}
```

### JSON Lines Datasets
<a name="clarify-config-json-monitor-model-explainability-parameters-example-jsonlines"></a>

Consider a dataset that has four feature columns and one label column, where the first feature and the label are binary, as in the following example.

```
{"features":[0, 0.5814568701544718, 0.6651538910132964, 0.3138080342665499], "label":0}
{"features":[1, 0.6711642728531724, 0.7466687034026017, 0.1215477472819713], "label":1}
{"features":[0, 0.0453256543003371, 0.6377430803264152, 0.3558625219713576], "label":1}
{"features":[1, 0.4785191813363956, 0.0265841045263860, 0.0376935084990697], "label":1}
```

The model input format is the same as the dataset format, and the model output is JSON Lines, as in the following example.

```
{"predicted_label":1, "probability":0.5385257417814224}
```

The following example JSON configuration file shows how this JSON Lines dataset can be configured.

```
{
    "headers": [
        "feature_1",
        "feature_2",
        "feature_3"
    ],
    "methods": {
        "shap": {
            "baseline": [
                {"features":[0.4441164946610942, 0.5190374448171748, 0.20722795300473712]}
            ],
            "num_samples": 100,
            "agg_method": "mean_abs"
        }
    },
    "predictor": {
        "model_name": "my_model",
        "instance_type": "ml.m5.xlarge",
        "initial_instance_count": 1,
        "content_template":"{\"features\":$features}"
    }
}
```

Then the `"features"` parameter value in `EndpointInput` (for real-time endpoints) or `BatchTransformInput` (for batch transform jobs) is used to locate the features in the dataset, and the `"probability"` parameter value selects the probability value from model output.

```
"EndpointInput": {
    ...
    "FeaturesAttribute": "features",
    "ProbabilityAttribute": "probability",
    ...
}
```

# Schedule Feature Attribute Drift Monitoring Jobs
<a name="clarify-model-monitor-feature-attribute-drift-schedule"></a>

After you create your SHAP baseline, you can call the `create_monitoring_schedule()` method of your `ModelExplainabilityMonitor` class instance to schedule an hourly model explainability monitor. The following sections show you how to create a model explainability monitor for a model deployed to a real-time endpoint as well as for a batch transform job.

**Important**  
You can specify either a batch transform input or an endpoint input, but not both, when you create your monitoring schedule.

If a baselining job has been submitted, the monitor automatically picks up the analysis configuration from the baselining job. However, if you skip the baselining step, or if the captured dataset differs in nature from the training dataset, you have to provide the analysis configuration. `ModelConfig` is required by `ExplainabilityAnalysisConfig` for the same reason that it's required for the baselining job. Note that only features are required for computing feature attribution, so you should exclude the Ground Truth label.

## Feature attribution drift monitoring for models deployed to real-time endpoint
<a name="model-monitor-explain-quality-rt"></a>

To schedule a model explainability monitor for a real-time endpoint, pass your `EndpointInput` instance to the `endpoint_input` argument of your `ModelExplainabilityMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator

model_exp_model_monitor = ModelExplainabilityMonitor(
   role=sagemaker.get_execution_role(),
   ... 
)

schedule = model_exp_model_monitor.create_monitoring_schedule(
   monitor_schedule_name=schedule_name,
   post_analytics_processor_script=s3_code_postprocessor_uri,
   output_s3_uri=s3_report_path,
   statistics=model_exp_model_monitor.baseline_statistics(),
   constraints=model_exp_model_monitor.suggested_constraints(),
   schedule_cron_expression=CronExpressionGenerator.hourly(),
   enable_cloudwatch_metrics=True,
   endpoint_input=EndpointInput(
        endpoint_name=endpoint_name,
        destination="/opt/ml/processing/input/endpoint",
    )
)
```

## Feature attribution drift monitoring for batch transform jobs
<a name="model-monitor-explain-quality-bt"></a>

To schedule a model explainability monitor for a batch transform job, pass your `BatchTransformInput` instance to the `batch_transform_input` argument of your `ModelExplainabilityMonitor` instance, as shown in the following code sample:

```
from sagemaker.model_monitor import CronExpressionGenerator

model_exp_model_monitor = ModelExplainabilityMonitor(
   role=sagemaker.get_execution_role(),
   ... 
)

schedule = model_exp_model_monitor.create_monitoring_schedule(
   monitor_schedule_name=schedule_name,
   post_analytics_processor_script=s3_code_postprocessor_uri,
   output_s3_uri=s3_report_path,
   statistics=model_exp_model_monitor.baseline_statistics(),
   constraints=model_exp_model_monitor.suggested_constraints(),
   schedule_cron_expression=CronExpressionGenerator.hourly(),
   enable_cloudwatch_metrics=True,
   batch_transform_input=BatchTransformInput(
        destination="opt/ml/processing/data",
        model_name="batch-fraud-detection-model",
        input_manifests_s3_uri="s3://amzn-s3-demo-bucket/batch-fraud-detection/on-schedule-monitoring/in/",
        excludeFeatures="0",
   )
)
```

# Inspect Reports for Feature Attribute Drift in Production Models
<a name="clarify-feature-attribute-drift-report"></a>

The schedule starts by default after you set it up. Wait for its first execution to start, and then stop the schedule to avoid incurring further charges.

To inspect the reports, use the following code:

```
schedule_desc = model_explainability_monitor.describe_schedule()
execution_summary = schedule_desc.get("LastMonitoringExecutionSummary")
if execution_summary and execution_summary["MonitoringExecutionStatus"] in ["Completed", "CompletedWithViolations"]:
    last_model_explainability_monitor_execution = model_explainability_monitor.list_executions()[-1]
    last_model_explainability_monitor_execution_report_uri = last_model_explainability_monitor_execution.output.destination
    print(f'Report URI: {last_model_explainability_monitor_execution_report_uri}')
    last_model_explainability_monitor_execution_report_files = sorted(S3Downloader.list(last_model_explainability_monitor_execution_report_uri))
    print("Found Report Files:")
    print("\n ".join(last_model_explainability_monitor_execution_report_files))
else:
    last_model_explainability_monitor_execution = None
    print("====STOP==== \n No completed executions to inspect further. Please wait till an execution completes or investigate previously reported failures.")
```

 If there are any violations compared to the baseline, they are listed here:

```
if last_model_explainability_monitor_execution:
    model_explainability_violations = last_model_explainability_monitor_execution.constraint_violations()
    if model_explainability_violations:
        print(model_explainability_violations.body_dict)
```

If your model is deployed to a real-time endpoint, you can see visualizations in SageMaker Studio of the analysis results and CloudWatch metrics by choosing the **Endpoints** tab, and then double-clicking the endpoint.

# CloudWatch Metrics for Feature Drift Analysis
<a name="clarify-feature-attribute-drift-cw"></a>

This guide shows CloudWatch metrics and their properties that you can use for feature attribute drift analysis in SageMaker Clarify. Feature attribute drift monitoring jobs compute and publish two types of metrics:
+ The global SHAP value of each feature.
**Note**  
The name of this metric appends the feature name provided by the job analysis configuration to `feature_`. For example, `feature_X` is the global SHAP value for feature `X`.
+ The `ExpectedValue` of the metric.

These metrics are published to the following CloudWatch namespace:
+ For real-time endpoints: `aws/sagemaker/Endpoints/explainability-metrics`
+ For batch transform jobs: `aws/sagemaker/ModelMonitoring/explainability-metrics`

Each metric has the following properties:
+ `Endpoint`: The name of the monitored endpoint, if applicable.
+ `MonitoringSchedule`: The name of the schedule for the monitoring job. 
+ `ExplainabilityMethod`: The method used to compute Shapley values. Choose `KernelShap`.
+ `Label`: The name provided by job analysis configuration `label_headers`, or a placeholder like `label0`.
+ `ValueType`: The type of the value returned by the metric. Choose either `GlobalShapValues` or `ExpectedValue`.

To stop the monitoring jobs from publishing metrics, set `publish_cloudwatch_metrics` to `Disabled` in the `Environment` map of the [model explainability job definition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelExplainabilityJobDefinition.html).

# Schedule monitoring jobs
<a name="model-monitor-scheduling"></a>

Amazon SageMaker Model Monitor provides you the ability to monitor the data collected from your real-time endpoints. You can monitor your data on a recurring schedule, or you can monitor it one time, immediately. You can create a monitoring schedule with the [CreateMonitoringSchedule](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateMonitoringSchedule.html) API.

With a monitoring schedule, SageMaker AI can start processing jobs to analyze the data collected during a given period. In the processing job, SageMaker AI compares the dataset for the current analysis with the baseline statistics and constraints that you provide, and then generates a violations report. In addition, CloudWatch metrics are emitted for each feature under analysis.

SageMaker AI provides a prebuilt container for performing analysis on tabular datasets. Alternatively, you can bring your own container, as described in the [Support for Your Own Containers With Amazon SageMaker Model Monitor](model-monitor-byoc-containers.md) topic.

You can create a model monitoring schedule for your real-time endpoint or batch transform job. Use the baseline resources (constraints and statistics) to compare against the real-time traffic or batch job inputs. 

**Example baseline assignments**  
In the following example, the training dataset used to train the model was uploaded to Amazon S3. If you already have it in Amazon S3, you can point to it directly.  

```
# copy over the training dataset to Amazon S3 (if you already have it in Amazon S3, you could reuse it)
baseline_prefix = prefix + '/baselining'
baseline_data_prefix = baseline_prefix + '/data'
baseline_results_prefix = baseline_prefix + '/results'

baseline_data_uri = 's3://{}/{}'.format(bucket,baseline_data_prefix)
baseline_results_uri = 's3://{}/{}'.format(bucket, baseline_results_prefix)
print('Baseline data uri: {}'.format(baseline_data_uri))
print('Baseline results uri: {}'.format(baseline_results_uri))
```

```
s3_key = os.path.join(baseline_prefix, 'data', 'training-dataset-with-header.csv')
# Use a context manager so the file handle is closed after the upload
with open("test_data/training-dataset-with-header.csv", 'rb') as training_data_file:
    boto3.Session().resource('s3').Bucket(bucket).Object(s3_key).upload_fileobj(training_data_file)
```

**Example schedule for recurring analysis**  
If you are scheduling a model monitor for a real-time endpoint, use the baseline constraints and statistics to compare against real-time traffic. The following code snippet shows the general format you use to schedule a model monitor for a real-time endpoint. This example schedules the model monitor to run hourly.  

```
from sagemaker.model_monitor import CronExpressionGenerator
from time import gmtime, strftime

mon_schedule_name = 'my-model-monitor-schedule-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=EndpointInput(
        endpoint_name=endpoint_name,
        destination="/opt/ml/processing/input/endpoint"
    ),
    post_analytics_processor_script=s3_code_postprocessor_uri,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

**Example schedule for one-time analysis**  
You can also schedule the analysis to run once without recurring by passing arguments like the following to the `create_monitoring_schedule` method:  

```
    schedule_cron_expression=CronExpressionGenerator.now(),
    data_analysis_start_time="-PT1H",
    data_analysis_end_time="-PT0H",
```
In these arguments, the `schedule_cron_expression` parameter schedules the analysis to run once, immediately, with the value `CronExpressionGenerator.now()`. For any schedule with this setting, the `data_analysis_start_time` and `data_analysis_end_time` parameters are required. These parameters set the start time and end time of an analysis window. Define these times as offsets that are relative to the current time, and use ISO 8601 duration format. In this example, the times `-PT1H` and `-PT0H` define a window between one hour in the past and the current time. With this schedule, the analysis evaluates only the data that was collected during the specified window.
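
The offset arithmetic above can be sketched as follows. This minimal parser handles only the `-PT<N>H` form used here, not full ISO 8601 durations, and the fixed `now` value is a stand-in for the current time:

```
import re
from datetime import datetime, timedelta, timezone

def parse_hour_offset(offset):
    # Parses the limited "-PT<N>H" offset form; not a full ISO 8601 parser.
    match = re.fullmatch(r"-PT(\d+)H", offset)
    if match is None:
        raise ValueError("unsupported offset: " + offset)
    return -timedelta(hours=int(match.group(1)))

# Stand-in for the current time; the service evaluates offsets against "now".
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window_start = now + parse_hour_offset("-PT1H")
window_end = now + parse_hour_offset("-PT0H")
print(window_start.isoformat(), window_end.isoformat())
# 2024-01-01T11:00:00+00:00 2024-01-01T12:00:00+00:00
```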

**Example schedule for a batch transform job**  
The following code snippet shows the general format you use to schedule a model monitor for a batch transform job.  

```
from sagemaker.model_monitor import (
    CronExpressionGenerator,
    BatchTransformInput, 
    MonitoringDatasetFormat, 
)
from time import gmtime, strftime

mon_schedule_name = 'my-model-monitor-schedule-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    batch_transform_input=BatchTransformInput(
        destination="/opt/ml/processing/input",
        data_captured_destination_s3_uri=s3_capture_upload_path,
        dataset_format=MonitoringDatasetFormat.csv(header=False),
    ),
    post_analytics_processor_script=s3_code_postprocessor_uri,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

```
desc_schedule_result = my_default_monitor.describe_schedule()
print('Schedule status: {}'.format(desc_schedule_result['MonitoringScheduleStatus']))
```

# The cron expression for monitoring schedule
<a name="model-monitor-schedule-expression"></a>

To provide details for the monitoring schedule, use [ScheduleConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ScheduleConfig.html), which includes a `cron` expression that specifies when the monitoring jobs run.

Amazon SageMaker Model Monitor supports the following `cron` expressions:
+ To set the job to start every hour, use the following:

  `cron(0 * ? * * *)`
+ To run the job daily, use the following:

  `cron(0 [00-23] ? * * *)`
+ To run the job one time, immediately, use the following keyword:

  `NOW`

For example, the following are valid `cron` expressions:
+ Daily at 12 PM UTC: `cron(0 12 ? * * *)`
+ Daily at 12 AM UTC: `cron(0 0 ? * * *)`

To run the job at a regular interval, such as every 6 or 12 hours, Model Monitor supports the following expression:

`cron(0 [00-23]/[01-24] ? * * *)`

For example, the following are valid `cron` expressions:
+ Every 12 hours, starting at 5 PM UTC: `cron(0 17/12 ? * * *)`
+ Every two hours, starting at 12 AM UTC: `cron(0 0/2 ? * * *)`

**Notes**  
Although the `cron` expression is set to start at a specific time (for example, 5 PM UTC), there could be a delay of 0-20 minutes from the requested time before the execution actually starts.
If you want to run on a daily schedule, don't provide this parameter. SageMaker AI picks a time to run every day.
Currently, SageMaker AI only supports hourly integer rates between 1 hour and 24 hours.
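
The supported interval form can be sketched as a small builder function. The helper name is illustrative; the SageMaker Python SDK's `CronExpressionGenerator` provides similar helpers:

```
def interval_cron(start_hour, rate_hours):
    # Builds the cron(0 H/R ? * * *) form described above; Model Monitor
    # supports whole-hour rates between 1 and 24.
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be in 0-23")
    if not 1 <= rate_hours <= 24:
        raise ValueError("rate_hours must be in 1-24")
    return "cron(0 {}/{} ? * * *)".format(start_hour, rate_hours)

print(interval_cron(17, 12))  # cron(0 17/12 ? * * *)
print(interval_cron(0, 2))    # cron(0 0/2 ? * * *)
```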

# Configuring service control policies for monitoring schedules
<a name="model-monitor-scp-rules"></a>

 You have to specify the parameters of a monitoring job when you create or update a schedule for it with the [CreateMonitoringSchedule](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateMonitoringSchedule.html) API or the [UpdateMonitoringSchedule](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateMonitoringSchedule.html) API, respectively. Depending on your use case, you can do this in one of the following ways: 
+  You can specify the [MonitoringJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_MonitoringJobDefinition.html) field of [MonitoringScheduleConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_MonitoringScheduleConfig.html) when you invoke `CreateMonitoringSchedule` or `UpdateMonitoringSchedule`. You can use this only to create or update a schedule for a data quality monitoring job. 
+  You can specify the name of a monitoring job definition that you have already created in the `MonitoringJobDefinitionName` field of `MonitoringScheduleConfig` when you invoke `CreateMonitoringSchedule` or `UpdateMonitoringSchedule`. You can use this for any job definition that you create with one of the following APIs: 
  +  [CreateDataQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDataQualityJobDefinition.html) 
  +  [CreateModelQualityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelQualityJobDefinition.html) 
  +  [CreateModelBiasJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelBiasJobDefinition.html) 
  +  [CreateModelExplainabilityJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelExplainabilityJobDefinition.html) 

   If you want to use the SageMaker Python SDK to create or update schedules, then you have to use this process. 

 These processes are mutually exclusive: you can specify either the `MonitoringJobDefinition` field or the `MonitoringJobDefinitionName` field when creating or updating monitoring schedules, but not both. 

 When you create a monitoring job definition, or specify one in the `MonitoringJobDefinition` field, you can set security parameters such as `NetworkConfig` and `VolumeKmsKeyId`. As an administrator, you might want these parameters to always be set to certain values so that monitoring jobs always run in a secure environment. To ensure this, set up appropriate [Service control policies](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps.html) (SCPs). SCPs are a type of organization policy that you can use to manage permissions in your organization. 

 The following example shows an SCP that you can use to ensure that infrastructure parameters are properly set when creating or updating schedules for monitoring jobs. 

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "sagemaker:CreateDataQualityJobDefinition",
                "sagemaker:CreateModelBiasJobDefinition",
                "sagemaker:CreateModelExplainabilityJobDefinition",
                "sagemaker:CreateModelQualityJobDefinition"
            ],
            "Resource": "arn:*:sagemaker:*:*:*",
            "Condition": {
                "Null": {
                    "sagemaker:VolumeKmsKey":"true",
                    "sagemaker:VpcSubnets": "true",
                    "sagemaker:VpcSecurityGroupIds": "true"
                }
            }
        },
        {
            "Effect": "Deny",
            "Action": [
                "sagemaker:CreateDataQualityJobDefinition",
                "sagemaker:CreateModelBiasJobDefinition",
                "sagemaker:CreateModelExplainabilityJobDefinition",
                "sagemaker:CreateModelQualityJobDefinition"
            ],
            "Resource": "arn:*:sagemaker:*:*:*",
            "Condition": {
                "Bool": {
                    "sagemaker:InterContainerTrafficEncryption": "false"
                }
            }
        },
        {
            "Effect": "Deny",
            "Action": [
                "sagemaker:CreateMonitoringSchedule",
                "sagemaker:UpdateMonitoringSchedule"
            ],
            "Resource": "arn:*:sagemaker:*:*:monitoring-schedule/*",
            "Condition": {
                "Null": {
                    "sagemaker:ModelMonitorJobDefinitionName": "true"
                }
            }
        }
    ]
}
```

------

 The first two rules in the example ensure that the security parameters are always set for monitoring job definitions. The final rule requires that anyone in your organization who creates or updates a schedule must specify the `MonitoringJobDefinitionName` field. This ensures that no one in your organization can set insecure values for the security parameters by specifying the `MonitoringJobDefinition` field when creating or updating schedules. 

# Amazon SageMaker Model Monitor prebuilt container
<a name="model-monitor-pre-built-container"></a>

SageMaker AI provides a built-in image called `sagemaker-model-monitor-analyzer` that provides you with a range of model monitoring capabilities, including constraint suggestion, statistics generation, constraint validation against a baseline, and emitting Amazon CloudWatch metrics. This image is based on Spark version 3.3.0 and is built with [Deequ](https://github.com/awslabs/deequ) version 2.0.2.

**Note**  
You cannot pull the built-in `sagemaker-model-monitor-analyzer` image directly. Instead, you use the `sagemaker-model-monitor-analyzer` image when you submit a baseline processing or monitoring job using one of the AWS SDKs.

 Use the SageMaker Python SDK (see `image_uris.retrieve` in the [SageMaker AI Python SDK reference guide](https://sagemaker.readthedocs.io/en/stable/api/utility/image_uris.html)) to generate the ECR image URI for you, or specify the ECR image URI directly. The prebuilt image for SageMaker Model Monitor can be accessed as follows:

`<ACCOUNT_ID>.dkr.ecr.<REGION_NAME>.amazonaws.com/sagemaker-model-monitor-analyzer`

For example: `159807026194.dkr.ecr.us-west-2.amazonaws.com/sagemaker-model-monitor-analyzer`

If you are in an AWS region in China, the prebuilt images for SageMaker Model Monitor can be accessed as follows: 

`<ACCOUNT_ID>.dkr.ecr.<REGION_NAME>.amazonaws.com.cn/sagemaker-model-monitor-analyzer`

For account IDs and AWS Region names, see [Docker Registry Paths and Example Code](https://docs.aws.amazon.com/sagemaker/latest/dg-ecr-paths/sagemaker-algo-docker-registry-paths).
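
As a sketch, the registry path can be assembled from an account ID and Region. The helper name is illustrative; account IDs vary by Region, so look them up as described above:

```
def monitor_image_uri(account_id, region, china=False):
    # Assembles the sagemaker-model-monitor-analyzer registry path shown above.
    domain = "amazonaws.com.cn" if china else "amazonaws.com"
    return "{}.dkr.ecr.{}.{}/sagemaker-model-monitor-analyzer".format(
        account_id, region, domain
    )

print(monitor_image_uri("159807026194", "us-west-2"))
# 159807026194.dkr.ecr.us-west-2.amazonaws.com/sagemaker-model-monitor-analyzer
```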

To write your own analysis container, see the container contract described in [Custom monitoring schedules](model-monitor-custom-monitoring-schedules.md).

# Interpret results
<a name="model-monitor-interpreting-results"></a>

After you run a baseline processing job and obtain statistics and constraints for your dataset, you can run monitoring jobs that calculate statistics and list any violations encountered relative to the baseline constraints. Amazon CloudWatch metrics are also reported in your account by default. For information on viewing the results of monitoring in Amazon SageMaker Studio, see [Visualize results for real-time endpoints in Amazon SageMaker Studio](model-monitor-interpreting-visualize-results.md).

## List Executions
<a name="model-monitor-interpreting-results-list-executions"></a>

The schedule starts monitoring jobs at the specified intervals. The following code lists the latest executions. If you run this code immediately after creating the hourly schedule, the list might be empty, and you might have to wait until you cross the hour boundary (in UTC) to see the first execution start. The following code includes the logic for waiting.

```
mon_executions = my_default_monitor.list_executions()
print("We created an hourly schedule above and it will kick off executions ON the hour (plus a 0-20 minute buffer).\nWe will have to wait until we hit the hour...")

while len(mon_executions) == 0:
    print("Waiting for the 1st execution to happen...")
    time.sleep(60)
    mon_executions = my_default_monitor.list_executions()
```

## Inspect a Specific Execution
<a name="model-monitor-interpreting-results-inspect-specific-execution"></a>

 

In the previous step, you picked up the latest completed or failed scheduled execution. You can explore what went right or wrong. The terminal states are:
+ `Completed` – The monitoring execution completed and no issues were found in the violations report.
+ `CompletedWithViolations` – The execution completed, but constraint violations were detected.
+ `Failed` – The monitoring execution failed, possibly due to a client error (for example, a role issue) or an infrastructure issue. To identify the cause, see the `FailureReason` and `ExitMessage` fields.

```
latest_execution = mon_executions[-1]  # the latest execution's index is -1, the previous is -2, and so on
time.sleep(60)
latest_execution.wait(logs=False)

print("Latest execution status: {}".format(latest_execution.describe()['ProcessingJobStatus']))
print("Latest execution result: {}".format(latest_execution.describe()['ExitMessage']))

latest_job = latest_execution.describe()
if latest_job['ProcessingJobStatus'] != 'Completed':
    print("====STOP==== \nNo completed executions to inspect further. Please wait until an execution completes or investigate previously reported failures.")
```

```
report_uri = latest_execution.output.destination
print('Report Uri: {}'.format(report_uri))
```

## List Generated Reports
<a name="model-monitor-interpreting-results-list-generated-reports"></a>

Use the following code to list the generated reports. 

```
from urllib.parse import urlparse
s3uri = urlparse(report_uri)
report_bucket = s3uri.netloc
report_key = s3uri.path.lstrip('/')
print('Report bucket: {}'.format(report_bucket))
print('Report key: {}'.format(report_key))

s3_client = boto3.Session().client('s3')
result = s3_client.list_objects(Bucket=report_bucket, Prefix=report_key)
report_files = [report_file.get("Key") for report_file in result.get('Contents')]
print("Found Report Files:")
print("\n ".join(report_files))
```

## Violations Report
<a name="model-monitor-interpreting-results-violations-report"></a>

If there are violations compared to the baseline, they are generated in the violations report. Use the following code to list the violations.

```
violations = my_default_monitor.latest_monitoring_constraint_violations()
pd.set_option('display.max_colwidth', None)
constraints_df = pd.json_normalize(violations.body_dict["violations"])
constraints_df.head(10)
```

This applies only to datasets that contain tabular data. The following schema files specify the statistics calculated and the violations monitored for.

**Output Files for Tabular Datasets**


| File Name | Description | 
| --- | --- | 
| statistics.json |  Contains columnar statistics for each feature in the dataset that is analyzed. See the schema of this file in the next topic.  This file is created only for data quality monitoring.   | 
| constraint\_violations.json |  Contains a list of violations found in the current set of data as compared to the baseline statistics and constraints file specified in the `baseline_constraints` and `baseline_statistics` paths.  | 
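
For illustration, a violations file is a JSON document with a top-level `violations` list. The following sketch parses a made-up report body; the field values here are invented for the example:

```
import json

# Made-up example of a violations report body.
report = json.loads("""
{
  "violations": [
    {
      "feature_name": "age",
      "constraint_check_type": "baseline_drift_check",
      "description": "Baseline drift distance exceeds the threshold"
    }
  ]
}
""")

for violation in report["violations"]:
    print(violation["feature_name"], violation["constraint_check_type"])
```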

The [Amazon SageMaker Model Monitor prebuilt container](model-monitor-pre-built-container.md) saves a set of Amazon CloudWatch metrics for each feature by default. 

The container code can emit CloudWatch metrics in this location: `/opt/ml/output/metrics/cloudwatch`. 

# Visualize results for real-time endpoints in Amazon SageMaker Studio
<a name="model-monitor-interpreting-visualize-results"></a>

If you are monitoring a real-time endpoint, you can also visualize the results in Amazon SageMaker Studio. You can view the details of any monitoring job run, and you can create charts that show the baseline and captured values for any metric that the monitoring job calculates.

**To view the detailed results of a monitoring job**

1. Sign in to Studio. For more information, see [Amazon SageMaker AI domain overview](gs-studio-onboard.md).

1. In the left navigation pane, choose the **Components and registries** icon (![\[The Components and registries icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/icons/Components_registries.png)).

1. Choose **Endpoints** in the drop-down menu.  
![\[Location of the Endpoints drop-down menu in the console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-endpoints.png)

1. On the endpoint tab, choose the monitoring type for which you want to see job details.  
![\[The location of the Model Quality tab in the MODEL MONITORING section.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-model-quality.png)

1. Choose the name of the monitoring job run for which you want to view details from the list of monitoring jobs.  
![\[The Model Quality tab of the MODEL MONITORING section.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-job-history.png)

1. The **MONITORING JOB DETAILS** tab opens with a detailed report of the monitoring job.  
![\[The MONITORING JOB DETAILS tab.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-job-details.png)

You can create a chart that displays the baseline and captured metrics for a time period.

**To create a chart in SageMaker Studio to visualize monitoring results**

1. Sign in to Studio. For more information, see [Amazon SageMaker AI domain overview](gs-studio-onboard.md).

1. In the left navigation pane, choose the **Components and registries** icon (![\[The Components and registries icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/icons/Components_registries.png)).

1. Choose **Endpoints** in the drop-down menu.  
![\[Location of the Endpoints drop-down menu in the console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-endpoints.png)

1. On the **Endpoint** tab, choose the monitoring type you want to create a chart for. This example shows a chart for the **Model quality** monitoring type.  
![\[The location of the Model Quality tab in the MODEL MONITORING section.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-model-quality.png)

1. Choose **Add chart**.  
![\[Location of Add chart in the console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-add-chart.png)

1. On the **CHART PROPERTIES** tab, choose the time period, statistic, and metric that you want to chart. This example shows a chart with a **Timeline** of **1 week**, a **Statistic** of **Average**, and the **F1** **Metric**.  
![\[Location of where to select a metric in the console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-chart-properties.png)

1. A chart showing the baseline and current values of the metric statistic you chose in the previous step appears on the **Endpoint** tab.  
![\[Example chart showing the baseline and current average metric chosen in the previous step.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model_monitor/mm-studio-f1-chart.png)

# Advanced topics
<a name="model-monitor-advanced-topics"></a>

The following sections cover advanced tasks, such as customizing monitoring with preprocessing and postprocessing scripts, building your own container, and using AWS CloudFormation to create a monitoring schedule.

**Topics**
+ [Custom monitoring schedules](model-monitor-custom-monitoring-schedules.md)
+ [Create a Monitoring Schedule for a Real-time Endpoint with a CloudFormation Custom Resource](model-monitor-cloudformation-monitoring-schedules.md)

# Custom monitoring schedules
<a name="model-monitor-custom-monitoring-schedules"></a>

In addition to using the built-in monitoring mechanisms, you can create your own custom monitoring schedules and procedures using preprocessing and postprocessing scripts or by using or building your own container.

**Topics**
+ [Preprocessing and Postprocessing](model-monitor-pre-and-post-processing.md)
+ [Support for Your Own Containers With Amazon SageMaker Model Monitor](model-monitor-byoc-containers.md)

# Preprocessing and Postprocessing
<a name="model-monitor-pre-and-post-processing"></a>

You can use custom preprocessing and postprocessing Python scripts to transform the input to your model monitor or extend the code after a successful monitoring run. Upload these scripts to Amazon S3 and reference them when creating your model monitor.

The following example shows how you can customize monitoring schedules with preprocessing and postprocessing scripts. Replace *user placeholder text* with your own information.

```
import boto3, os
from sagemaker import get_execution_role, Session
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor

# Upload pre and postprocessor scripts
session = Session()
bucket = boto3.Session().resource("s3").Bucket(session.default_bucket())
prefix = "demo-sagemaker-model-monitor"
bucket.Object(os.path.join(prefix, "preprocessor.py")).upload_file("preprocessor.py")
bucket.Object(os.path.join(prefix, "postprocessor.py")).upload_file("postprocessor.py")
# upload_file returns None, so build the S3 URIs of the scripts explicitly
pre_processor_script = "s3://{}/{}".format(bucket.name, os.path.join(prefix, "preprocessor.py"))
post_processor_script = "s3://{}/{}".format(bucket.name, os.path.join(prefix, "postprocessor.py"))

# Get execution role
role = get_execution_role()  # or specify an IAM role ARN directly

# Instance type
instance_type = "instance-type"
# instance_type = "ml.m5.xlarge" # Example

# Create a monitoring schedule with pre and postprocessing
my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type=instance_type,
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

s3_report_path = "s3://{}/{}".format(bucket.name, "reports")
monitor_schedule_name = "monitor-schedule-name"
endpoint_name = "endpoint-name"
my_default_monitor.create_monitoring_schedule(
    post_analytics_processor_script=post_processor_script,
    record_preprocessor_script=pre_processor_script,
    monitor_schedule_name=monitor_schedule_name,
    # use endpoint_input for real-time endpoint
    endpoint_input=endpoint_name,
    # or use batch_transform_input for batch transform jobs
    # batch_transform_input=batch_transform_name,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

**Topics**
+ [Preprocessing Script](#model-monitor-pre-processing-script)
+ [Custom Sampling](#model-monitor-pre-processing-custom-sampling)
+ [Postprocessing Script](#model-monitor-post-processing-script)

## Preprocessing Script
<a name="model-monitor-pre-processing-script"></a>

Use preprocessing scripts when you need to transform the inputs to your model monitor.

For example, suppose the output of your model is an array `[1.0, 2.1]`. The Amazon SageMaker Model Monitor container only works with tabular or flattened JSON structures, like `{"prediction0": 1.0, "prediction1": 2.1}`. You could use a preprocessing script like the following to transform the array into the correct JSON structure.

```
def preprocess_handler(inference_record):
    input_data = inference_record.endpoint_input.data
    output_data = inference_record.endpoint_output.data.rstrip("\n")
    data = output_data + "," + input_data
    return { str(i).zfill(20) : d for i, d in enumerate(data.split(",")) }
```
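
To see what a handler like the preceding one produces, you can drive it locally with a stub record. The `SimpleNamespace` stub here is a hypothetical stand-in for the object the container passes in; only the attributes the handler reads are faked:

```
from types import SimpleNamespace

def preprocess_handler(inference_record):
    input_data = inference_record.endpoint_input.data
    output_data = inference_record.endpoint_output.data.rstrip("\n")
    data = output_data + "," + input_data
    return {str(i).zfill(20): d for i, d in enumerate(data.split(","))}

# Stub record with only the attributes the handler reads
record = SimpleNamespace(
    endpoint_input=SimpleNamespace(data="132,25,113.2"),
    endpoint_output=SimpleNamespace(data="0.0107\n"),
)
print(preprocess_handler(record))
# {'00000000000000000000': '0.0107', '00000000000000000001': '132',
#  '00000000000000000002': '25', '00000000000000000003': '113.2'}
```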

In another example, suppose your model has optional features and you use `-1` to denote that the optional feature has a missing value. If you have a data quality monitor, you may want to remove the `-1` from the input value array so that it isn't included in the monitor's metric calculations. You could use a script like the following to remove those values.

```
def preprocess_handler(inference_record):
    input_data = inference_record.endpoint_input.data
    # The captured data is a CSV string, so compare against the string "-1"
    return {i: None if x == "-1" else x for i, x in enumerate(input_data.split(","))}
```

Your preprocessing script receives an `inference_record` as its only input. The following code snippet shows an example of an `inference_record`.

```
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "132,25,113.2,96,269.9,107,,0,0,0,0,0,0,1,0,1,0,0,1",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "text/csv; charset=utf-8",
      "mode": "OUTPUT",
      "data": "0.01076381653547287",
      "encoding": "CSV"
    }
  },
  "eventMetadata": {
    "eventId": "feca1ab1-8025-47e3-8f6a-99e3fdd7b8d9",
    "inferenceTime": "2019-11-20T23:33:12Z"
  },
  "eventVersion": "0"
}
```

The following code snippet shows the full class structure for an `inference_record`.

```
KEY_EVENT_METADATA = "eventMetadata"
KEY_EVENT_METADATA_EVENT_ID = "eventId"
KEY_EVENT_METADATA_EVENT_TIME = "inferenceTime"
KEY_EVENT_METADATA_CUSTOM_ATTR = "customAttributes"

KEY_EVENTDATA_ENCODING = "encoding"
KEY_EVENTDATA_DATA = "data"

KEY_GROUND_TRUTH_DATA = "groundTruthData"

KEY_EVENTDATA = "captureData"
KEY_EVENTDATA_ENDPOINT_INPUT = "endpointInput"
KEY_EVENTDATA_ENDPOINT_OUTPUT = "endpointOutput"

KEY_EVENTDATA_BATCH_OUTPUT = "batchTransformOutput"
KEY_EVENTDATA_OBSERVED_CONTENT_TYPE = "observedContentType"
KEY_EVENTDATA_MODE = "mode"

KEY_EVENT_VERSION = "eventVersion"

class EventConfig:
    def __init__(self, endpoint, variant, start_time, end_time):
        self.endpoint = endpoint
        self.variant = variant
        self.start_time = start_time
        self.end_time = end_time


class EventMetadata:
    def __init__(self, event_metadata_dict):
        self.event_id = event_metadata_dict.get(KEY_EVENT_METADATA_EVENT_ID, None)
        self.event_time = event_metadata_dict.get(KEY_EVENT_METADATA_EVENT_TIME, None)
        self.custom_attribute = event_metadata_dict.get(KEY_EVENT_METADATA_CUSTOM_ATTR, None)


class EventData:
    def __init__(self, data_dict):
        self.encoding = data_dict.get(KEY_EVENTDATA_ENCODING, None)
        self.data = data_dict.get(KEY_EVENTDATA_DATA, None)
        self.observedContentType = data_dict.get(KEY_EVENTDATA_OBSERVED_CONTENT_TYPE, None)
        self.mode = data_dict.get(KEY_EVENTDATA_MODE, None)

    def as_dict(self):
        ret = {
            KEY_EVENTDATA_ENCODING: self.encoding,
            KEY_EVENTDATA_DATA: self.data,
            KEY_EVENTDATA_OBSERVED_CONTENT_TYPE: self.observedContentType,
        }
        return ret


class CapturedData:
    def __init__(self, event_dict):
        self.event_metadata = None
        self.endpoint_input = None
        self.endpoint_output = None
        self.batch_transform_output = None
        self.ground_truth = None
        self.event_version = None
        self.event_dict = event_dict
        self._event_dict_postprocessed = False
        
        if KEY_EVENT_METADATA in event_dict:
            self.event_metadata = EventMetadata(event_dict[KEY_EVENT_METADATA])
        if KEY_EVENTDATA in event_dict:
            if KEY_EVENTDATA_ENDPOINT_INPUT in event_dict[KEY_EVENTDATA]:
                self.endpoint_input = EventData(event_dict[KEY_EVENTDATA][KEY_EVENTDATA_ENDPOINT_INPUT])
            if KEY_EVENTDATA_ENDPOINT_OUTPUT in event_dict[KEY_EVENTDATA]:
                self.endpoint_output = EventData(event_dict[KEY_EVENTDATA][KEY_EVENTDATA_ENDPOINT_OUTPUT])
            if KEY_EVENTDATA_BATCH_OUTPUT in event_dict[KEY_EVENTDATA]:
                self.batch_transform_output = EventData(event_dict[KEY_EVENTDATA][KEY_EVENTDATA_BATCH_OUTPUT])

        if KEY_GROUND_TRUTH_DATA in event_dict:
            self.ground_truth = EventData(event_dict[KEY_GROUND_TRUTH_DATA])
        if KEY_EVENT_VERSION in event_dict:
            self.event_version = event_dict[KEY_EVENT_VERSION]

    def as_dict(self):
        if self._event_dict_postprocessed is True:
            return self.event_dict
        if KEY_EVENTDATA in self.event_dict:
            if KEY_EVENTDATA_ENDPOINT_INPUT in self.event_dict[KEY_EVENTDATA]:
                self.event_dict[KEY_EVENTDATA][KEY_EVENTDATA_ENDPOINT_INPUT] = self.endpoint_input.as_dict()
            if KEY_EVENTDATA_ENDPOINT_OUTPUT in self.event_dict[KEY_EVENTDATA]:
                self.event_dict[KEY_EVENTDATA][
                    KEY_EVENTDATA_ENDPOINT_OUTPUT
                ] = self.endpoint_output.as_dict()
            if KEY_EVENTDATA_BATCH_OUTPUT in self.event_dict[KEY_EVENTDATA]:
                self.event_dict[KEY_EVENTDATA][KEY_EVENTDATA_BATCH_OUTPUT] = self.batch_transform_output.as_dict()
        
        self._event_dict_postprocessed = True
        return self.event_dict

    def __str__(self):
        return str(self.as_dict())
```

## Custom Sampling
<a name="model-monitor-pre-processing-custom-sampling"></a>

You can also apply a custom sampling strategy in your preprocessing script. To do this, configure Model Monitor's first-party, pre-built container to ignore a percentage of the records according to your specified sampling rate. In the following example, the handler samples 10 percent of the records by returning the record in 10 percent of handler calls and an empty list otherwise.

```
import random

def preprocess_handler(inference_record):
    # we set up a sampling rate of 0.1
    if random.random() > 0.1:
        # return an empty list
        return []
    input_data = inference_record.endpoint_input.data
    return {i : None if x == -1 else x for i, x in enumerate(input_data.split(","))}
```
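Because `random.random()` draws a fresh value on every call, the set of sampled records varies between runs. If you need reproducible sampling, one alternative (a sketch, not a built-in Model Monitor feature) is to hash a stable identifier, such as the capture record's `eventId`, so the same records are always kept:

```python
import zlib

def keep_record(event_id, sample_rate=0.1):
    """Deterministically keep ~sample_rate of records, keyed on a stable ID."""
    # crc32 spreads the IDs roughly uniformly over the buckets 0..99
    bucket = zlib.crc32(str(event_id).encode("utf-8")) % 100
    return bucket < int(sample_rate * 100)
```

A preprocessing handler could then call `keep_record` with the record's event ID (how you obtain that ID from `inference_record` depends on your capture format) and return an empty list for records that are not kept.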

### Custom logging for preprocessing script
<a name="model-monitor-pre-processing-custom-logging"></a>

If your preprocessing script returns an error, check the exception messages logged to Amazon CloudWatch to debug it. You can access the logger through the `preprocess_handler` interface and log any information you need from your script to CloudWatch. This can be useful when you debug your preprocessing script. The following example shows how to use the `preprocess_handler` interface to log to CloudWatch.

```
def preprocess_handler(inference_record, logger):
    logger.info(f"I'm a processing record: {inference_record}")
    logger.debug(f"I'm debugging a processing record: {inference_record}")
    logger.warning(f"I'm processing record with missing value: {inference_record}")
    logger.error(f"I'm a processing record with bad value: {inference_record}")
    return inference_record
```

## Postprocessing Script
<a name="model-monitor-post-processing-script"></a>

Use a postprocessing script when you want to run custom code after a successful monitoring run.

```
def postprocess_handler():
    print("Hello from post-proc script!")
```

# Support for Your Own Containers With Amazon SageMaker Model Monitor
<a name="model-monitor-byoc-containers"></a>

Amazon SageMaker Model Monitor provides a prebuilt container that can analyze the data captured from endpoints or batch transform jobs for tabular datasets. If you want to bring your own container, Model Monitor provides extension points that you can use.

Under the hood, when you create a `MonitoringSchedule`, Model Monitor ultimately kicks off processing jobs, so the container needs to be aware of the processing job contract documented in the [How to Build Your Own Processing Container (Advanced Scenario)](build-your-own-processing-container.md) topic. Note that Model Monitor kicks off the processing job on your behalf per the schedule. When it invokes the container, Model Monitor sets up additional environment variables for you so that your container has enough context to process the data for that particular execution of the scheduled monitoring. For additional information on container inputs, see [Container Contract Inputs](model-monitor-byoc-contract-inputs.md).

In the container, using the preceding environment variables and context, you can analyze the dataset for the current period in your custom code. After this analysis is complete, you can choose to emit your reports to be uploaded to an S3 bucket. The reports that the prebuilt container generates are documented in [Container Contract Outputs](model-monitor-byoc-contract-outputs.md). If you would like the visualization of the reports to work in SageMaker Studio, follow the same format. You can also choose to emit completely custom reports.

You can also emit CloudWatch metrics from the container by following the instructions in [CloudWatch Metrics for Bring Your Own Containers](model-monitor-byoc-cloudwatch.md).

**Topics**
+ [Container Contract Inputs](model-monitor-byoc-contract-inputs.md)
+ [Container Contract Outputs](model-monitor-byoc-contract-outputs.md)
+ [CloudWatch Metrics for Bring Your Own Containers](model-monitor-byoc-cloudwatch.md)

# Container Contract Inputs
<a name="model-monitor-byoc-contract-inputs"></a>

The Amazon SageMaker Model Monitor platform invokes your container code according to a specified schedule. If you choose to write your own container code, the following environment variables are available. Using this context, your code can analyze the current dataset, evaluate the constraints if you choose, and emit metrics where applicable.

The available environment variables are the same for real-time endpoints and batch transform jobs, except for the `dataset_format` variable. If you are using a real-time endpoint, the `dataset_format` variable supports the following options:

```
{\"sagemakerCaptureJson\": {\"captureIndexNames\": [\"endpointInput\",\"endpointOutput\"]}}
```

If you are using a batch transform job, the `dataset_format` supports the following options:

```
{\"csv\": {\"header\": [\"true\",\"false\"]}}
```

```
{\"json\": {\"line\": [\"true\",\"false\"]}}
```

```
{\"parquet\": {}}
```

The following code sample shows the complete set of environment variables available to your container code (and shows the `dataset_format` value for a real-time endpoint).

```
"Environment": {
 "dataset_format": "{\"sagemakerCaptureJson\": {\"captureIndexNames\": [\"endpointInput\",\"endpointOutput\"]}}",
 "dataset_source": "/opt/ml/processing/endpointdata",
 "end_time": "2019-12-01T16: 20: 00Z",
 "output_path": "/opt/ml/processing/resultdata",
 "publish_cloudwatch_metrics": "Disabled",
 "sagemaker_endpoint_name": "endpoint-name",
 "sagemaker_monitoring_schedule_name": "schedule-name",
 "start_time": "2019-12-01T15: 20: 00Z"
}
```
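Inside the container, these values arrive as ordinary environment variables. A minimal sketch of collecting them (the helper name and the dictionary layout are illustrative, not part of the contract):

```python
import json
import os

def load_monitoring_context(environ=None):
    """Gather the Model Monitor context passed to the container."""
    env = os.environ if environ is None else environ
    return {
        # dataset_format is itself a JSON document, so parse it
        "dataset_format": json.loads(env["dataset_format"]),
        "dataset_source": env["dataset_source"],
        "output_path": env["output_path"],
        "publish_cloudwatch_metrics": env.get("publish_cloudwatch_metrics", "Disabled"),
        "start_time": env["start_time"],
        "end_time": env["end_time"],
    }
```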

Parameters 


| Parameter Name | Description | 
| --- | --- | 
| dataset\_format |  For a job started from a `MonitoringSchedule` backed by an `Endpoint`, this is `sagemakerCaptureJson` with the capture indices `endpointInput`, `endpointOutput`, or both. For a batch transform job, this specifies the dataset format: CSV, JSON, or Parquet.  | 
| dataset\_source |  If you are using a real-time endpoint, the local path at which the data corresponding to the monitoring period, as specified by `start_time` and `end_time`, is available. At this path, the data is available in `/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh`. SageMaker AI sometimes downloads more data than what is specified by the start and end times. It is up to the container code to parse the data as required.  | 
| output\_path |  The local path to write output reports and other files. You specify this parameter in the `CreateMonitoringSchedule` request as `MonitoringOutputConfig.MonitoringOutput[0].LocalPath`. It is uploaded to the `S3Uri` path specified in `MonitoringOutputConfig.MonitoringOutput[0].S3Uri`.  | 
| publish\_cloudwatch\_metrics |  For a job launched by `CreateMonitoringSchedule`, this parameter is set to `Enabled`. The container can choose to write the Amazon CloudWatch output file at `[filepath]`.  | 
| sagemaker\_endpoint\_name |  If you are using a real-time endpoint, the name of the `Endpoint` that this scheduled job was launched for.  | 
| sagemaker\_monitoring\_schedule\_name |  The name of the `MonitoringSchedule` that launched this job.  | 
| sagemaker\_endpoint\_datacapture\_prefix |  If you are using a real-time endpoint, the prefix specified in the `DataCaptureConfig` parameter of the `Endpoint`. The container can use this if it needs to directly access more data than SageMaker AI has already downloaded to the `dataset_source` path.  | 
| start\_time, end\_time |  The time window for this analysis run. For example, for a job scheduled to run at 05:00 UTC on 20/02/2020, `start_time` is 2020-02-19T06:00:00Z and `end_time` is 2020-02-20T05:00:00Z.  | 
| baseline\_constraints |  The local path of the baseline constraints file specified in `BaselineConfig.ConstraintResource.S3Uri`. This is available only if this parameter was specified in the `CreateMonitoringSchedule` request.  | 
| baseline\_statistics |  The local path of the baseline statistics file specified in `BaselineConfig.StatisticsResource.S3Uri`. This is available only if this parameter was specified in the `CreateMonitoringSchedule` request.  | 

# Container Contract Outputs
<a name="model-monitor-byoc-contract-outputs"></a>

The container can analyze the data available in the `dataset_source` path and write reports to the path in `output_path`. The container code can write any reports that suit your needs.

If you use the following structure and contract, SageMaker AI treats certain output files specially in the visualization and the API. This applies only to tabular datasets.

Output Files for Tabular Datasets


| File Name | Description | 
| --- | --- | 
| statistics.json |  This file is expected to have columnar statistics for each feature in the dataset that is analyzed. The schema for this file is available in the next section.  | 
| constraints.json |  This file is expected to have the constraints on the features observed. The schema for this file is available in the next section.  | 
| constraints\_violations.json |  This file is expected to have the list of violations found in the current set of data as compared to the baseline statistics and constraints files specified in the `baseline_constraints` and `baseline_statistics` paths.  | 

In addition, if the `publish_cloudwatch_metrics` value is `"Enabled"`, the container code can emit Amazon CloudWatch metrics in this location: `/opt/ml/output/metrics/cloudwatch`. The schema for these files is described in the following sections.
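As a sketch of the output side of the contract, a container's final step could serialize the three report files above to the local output path (the report-building logic itself is up to your code, and the helper name here is illustrative):

```python
import json
import os

def write_reports(output_path, statistics, constraints, violations):
    """Serialize the three specially treated report files to output_path."""
    os.makedirs(output_path, exist_ok=True)
    reports = {
        "statistics.json": statistics,
        "constraints.json": constraints,
        "constraints_violations.json": violations,
    }
    for filename, document in reports.items():
        with open(os.path.join(output_path, filename), "w") as f:
            json.dump(document, f, indent=2)
    return sorted(reports)
```

SageMaker AI then uploads everything under `output_path` to the `S3Uri` configured in `MonitoringOutputConfig`.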

**Topics**
+ [Schema for Statistics (statistics.json file)](model-monitor-byoc-statistics.md)
+ [Schema for Constraints (constraints.json file)](model-monitor-byoc-constraints.md)

# Schema for Statistics (statistics.json file)
<a name="model-monitor-byoc-statistics"></a>

The schema defined in the `statistics.json` file specifies the statistical parameters to be calculated for the baseline and for the data that is captured. It also configures the buckets used by [KLL](https://datasketches.apache.org/docs/KLL/KLLSketch.html), a compact quantiles sketch with a lazy compaction scheme.

```
{
    "version": 0,
    # dataset level stats
    "dataset": {
        "item_count": number
    },
    # feature level stats
    "features": [
        {
            "name": "feature-name",
            "inferred_type": "Fractional" | "Integral",
            "numerical_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "mean": number,
                "sum": number,
                "std_dev": number,
                "min": number,
                "max": number,
                "distribution": {
                    "kll": {
                        "buckets": [
                            {
                                "lower_bound": number,
                                "upper_bound": number,
                                "count": number
                            }
                        ],
                        "sketch": {
                            "parameters": {
                                "c": number,
                                "k": number
                            },
                            "data": [
                                [
                                    num,
                                    num,
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ]
                            ]
                        }#sketch
                    }#KLL
                }#distribution
            }#num_stats
        },
        {
            "name": "feature-name",
            "inferred_type": "String",
            "string_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "distinct_count": number,
                "distribution": {
                    "categorical": {
                         "buckets": [
                                {
                                    "value": "string",
                                    "count": number
                                }
                          ]
                     }
                }
            },
            #provision for custom stats
        }
    ]
}
```

**Notes**  
SageMaker AI recognizes the specified metrics and renders them in visualizations. The container can emit more metrics if required.
[KLL sketch](https://datasketches.apache.org/docs/KLL/KLLSketch.html) is the recognized sketch. Custom containers can write their own representation, but it won’t be recognized by SageMaker AI in visualizations.
By default, the distribution is materialized in 10 buckets. You can't change this.
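To illustrate the shape of this output, the following sketch computes the `numerical_statistics` portion of one feature entry, using a simple equal-width histogram in place of a true KLL sketch (a real container would compute the sketch with a sketching library; the helper name and the hard-coded `Fractional` type are assumptions):

```python
def numeric_feature_stats(name, values, num_buckets=10):
    """Build a statistics.json-style numerical entry for one feature."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / num_buckets or 1.0  # avoid zero-width buckets
    buckets = []
    for i in range(num_buckets):
        lower, upper = lo + i * width, lo + (i + 1) * width
        if i == num_buckets - 1:
            count = sum(1 for v in present if v >= lower)  # close the top bucket
        else:
            count = sum(1 for v in present if lower <= v < upper)
        buckets.append({"lower_bound": lower, "upper_bound": upper, "count": count})
    mean = sum(present) / len(present)
    variance = sum((v - mean) ** 2 for v in present) / len(present)
    return {
        "name": name,
        "inferred_type": "Fractional",
        "numerical_statistics": {
            "common": {"num_present": len(present), "num_missing": len(values) - len(present)},
            "mean": mean,
            "sum": sum(present),
            "std_dev": variance ** 0.5,
            "min": lo,
            "max": hi,
            "distribution": {"kll": {"buckets": buckets}},
        },
    }
```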

# Schema for Constraints (constraints.json file)
<a name="model-monitor-byoc-constraints"></a>

A `constraints.json` file is used to express the constraints that a dataset must satisfy. Amazon SageMaker Model Monitor containers can use the `constraints.json` file to evaluate datasets against. Prebuilt containers provide the ability to generate the `constraints.json` file automatically for a baseline dataset. If you bring your own container, you can provide it with similar abilities, or you can create the `constraints.json` file in some other way. The following is the schema for the constraints file that the prebuilt container uses. Containers that you bring can adopt the same format or enhance it as required.

```
{
    "version": 0,
    "features":
    [
        {
            "name": "string",
            "inferred_type": "Integral" | "Fractional" | 
                    | "String" | "Unknown",
            "completeness": number,
            "num_constraints":
            {
                "is_non_negative": boolean
            },
            "string_constraints":
            {
                "domains":
                [
                    "list of",
                    "observed values",
                    "for small cardinality"
                ]
            },
            "monitoringConfigOverrides":
            {}
        }
    ],
    "monitoring_config":
    {
        "evaluate_constraints": "Enabled",
        "emit_metrics": "Enabled",
        "datatype_check_threshold": 0.1,
        "domain_content_threshold": 0.1,
        "distribution_constraints":
        {
            "perform_comparison": "Enabled",
            "comparison_threshold": 0.1,
            "comparison_method": "Simple"||"Robust",
            "categorical_comparison_threshold": 0.1,
            "categorical_drift_method": "LInfinity"||"ChiSquared"
        }
    }
}
```

The `monitoring_config` object contains options for the monitoring job for each feature. The following table describes each option.

Monitoring Constraints

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-constraints.html)
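As an illustration of how a custom container might apply such a file, the following sketch checks records against the completeness, numerical, and string constraints in the schema above (the `check` labels are illustrative, not the prebuilt container's exact check names):

```python
def evaluate_constraints(records, constraints):
    """Return the violations found when records are checked against constraints."""
    violations = []
    n = len(records)
    for feature in constraints["features"]:
        name = feature["name"]
        present = [r[name] for r in records if r.get(name) is not None]
        # completeness: observed fraction of non-missing values vs. the baseline
        if n and len(present) / n < feature.get("completeness", 0.0):
            violations.append({"feature_name": name, "check": "completeness"})
        if feature.get("num_constraints", {}).get("is_non_negative"):
            if any(v < 0 for v in present):
                violations.append({"feature_name": name, "check": "is_non_negative"})
        domains = feature.get("string_constraints", {}).get("domains")
        if domains is not None and any(v not in domains for v in present):
            violations.append({"feature_name": name, "check": "domain"})
    return {"violations": violations}
```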

# CloudWatch Metrics for Bring Your Own Containers
<a name="model-monitor-byoc-cloudwatch"></a>

If the `publish_cloudwatch_metrics` value is `Enabled` in the `Environment` map in the `/opt/ml/processing/processingjobconfig.json` file, the container code emits Amazon CloudWatch metrics in this location: `/opt/ml/output/metrics/cloudwatch`. 

The schema for this file is closely based on the CloudWatch `PutMetricData` API. The namespace is not specified here. It defaults to the following:
+ For real-time endpoints: `/aws/sagemaker/Endpoint/data-metrics`
+ For batch transform jobs: `/aws/sagemaker/ModelMonitoring/data-metrics`

However, you can specify dimensions. We recommend you add the following dimensions at minimum:
+ `Endpoint` and `MonitoringSchedule` for real-time endpoints
+ `MonitoringSchedule` for batch transform jobs

The following JSON snippets show how to set your dimensions.

For a real-time endpoint, the following JSON snippet includes the `Endpoint` and `MonitoringSchedule` dimensions:

```
{ 
    "MetricName": "", # Required
    "Timestamp": "2019-11-26T03:00:00Z", # Required
    "Dimensions" : [{"Name":"Endpoint","Value":"endpoint_0"},{"Name":"MonitoringSchedule","Value":"schedule_0"}]
    "Value": Float,
    # Either the Value or the StatisticValues field can be populated and not both.
    "StatisticValues": {
        "SampleCount": Float,
        "Sum": Float,
        "Minimum": Float,
        "Maximum": Float
    },
    "Unit": "Count", # Optional
}
```

For a batch transform job, the following JSON snippet includes the `MonitoringSchedule` dimension:

```
{ 
    "MetricName": "", # Required
    "Timestamp": "2019-11-26T03:00:00Z", # Required
    "Dimensions" : [{"Name":"MonitoringSchedule","Value":"schedule_0"}]
    "Value": Float,
    # Either the Value or the StatisticValues field can be populated and not both.
    "StatisticValues": {
        "SampleCount": Float,
        "Sum": Float,
        "Minimum": Float,
        "Maximum": Float
    },
    "Unit": "Count", # Optional
}
```
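A container could assemble and write such metric records as JSON Lines. In the sketch below, the file name under the `/opt/ml/output/metrics/cloudwatch` directory is an assumption; the contract specifies only the location:

```python
import json
import os

def emit_cloudwatch_metrics(metrics, out_dir="/opt/ml/output/metrics/cloudwatch"):
    """Append one JSON object per line to a metrics file in the CloudWatch dir."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "metrics.jsonl")  # file name is an assumption
    with open(path, "a") as f:
        for metric in metrics:
            f.write(json.dumps(metric) + "\n")
    return path
```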

# Create a Monitoring Schedule for a Real-time Endpoint with a CloudFormation Custom Resource
<a name="model-monitor-cloudformation-monitoring-schedules"></a>

If you are using a real-time endpoint, you can use a CloudFormation custom resource to create a monitoring schedule. The custom resource is in Python. To deploy it, see [Python Lambda deployment](https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html).

## Custom Resource
<a name="model-monitor-cloudformation-custom-resource"></a>

Start by adding a custom resource to your CloudFormation template. This points to an AWS Lambda function that you create in the next step.

This resource enables you to customize the parameters for the monitoring schedule. You can add or remove parameters by modifying the CloudFormation resource and the Lambda function in the following example.

```
{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "MonitoringSchedule": {
            "Type": "Custom::MonitoringSchedule",
            "Version": "1.0",
            "Properties": {
                "ServiceToken": "arn:aws:lambda:us-west-2:111111111111:function:lambda-name",
                "ScheduleName": "YourScheduleName",
                "EndpointName": "YourEndpointName",
                "BaselineConstraintsUri": "s3://your-baseline-constraints/constraints.json",
                "BaselineStatisticsUri": "s3://your-baseline-stats/statistics.json",
                "PostAnalyticsProcessorSourceUri": "s3://your-post-processor/postprocessor.py",
                "RecordPreprocessorSourceUri": "s3://your-preprocessor/preprocessor.py",
                "InputLocalPath": "/opt/ml/processing/endpointdata",
                "OutputLocalPath": "/opt/ml/processing/localpath",
                "OutputS3URI": "s3://your-output-uri",
                "ImageURI": "111111111111.dkr.ecr.us-west-2.amazonaws.com/your-image",
                "ScheduleExpression": "cron(0 * ? * * *)",
                "PassRoleArn": "arn:aws:iam::111111111111:role/AmazonSageMaker-ExecutionRole"
            }
        }
    }
}
```

## Lambda Custom Resource Code
<a name="model-monitor-cloudformation-lambda-custom-resource-code"></a>

This CloudFormation custom resource uses the [Custom Resource Helper](https://github.com/aws-cloudformation/custom-resource-helper) AWS library, which you can install with pip using `pip install crhelper`. 

This Lambda function is invoked by CloudFormation during the creation and deletion of the stack. It is responsible for creating and deleting the monitoring schedule, using the parameters defined in the custom resource described in the preceding section.

```
import boto3
import botocore
import logging

from crhelper import CfnResource
from botocore.exceptions import ClientError


logger = logging.getLogger(__name__)
sm = boto3.client('sagemaker')

# cfnhelper makes it easier to implement a CloudFormation custom resource
helper = CfnResource()

# CFN Handlers

def handler(event, context):
    helper(event, context)


@helper.create
def create_handler(event, context):
    """
    Called when CloudFormation custom resource sends the create event
    """
    create_monitoring_schedule(event)


@helper.delete
def delete_handler(event, context):
    """
    Called when CloudFormation custom resource sends the delete event
    """
    schedule_name = get_schedule_name(event)
    delete_monitoring_schedule(schedule_name)


@helper.poll_create
def poll_create(event, context):
    """
    Return true if the resource has been created and false otherwise so
    CloudFormation polls again.
    """
    schedule_name = get_schedule_name(event)
    logger.info('Polling for creation of schedule: %s', schedule_name)
    return is_schedule_ready(schedule_name)

@helper.update
def update_handler(event, context):
    """
    Not currently implemented, but crhelper raises an error if an update
    handler isn't registered. crhelper invokes handlers with (event, context).
    """
    pass

# Helper Functions

def get_schedule_name(event):
    return event['ResourceProperties']['ScheduleName']

def create_monitoring_schedule(event):
    schedule_name = get_schedule_name(event)
    monitoring_schedule_config = create_monitoring_schedule_config(event)

    logger.info('Creating monitoring schedule with name: %s', schedule_name)

    sm.create_monitoring_schedule(
        MonitoringScheduleName=schedule_name,
        MonitoringScheduleConfig=monitoring_schedule_config)

def is_schedule_ready(schedule_name):
    is_ready = False

    schedule = sm.describe_monitoring_schedule(MonitoringScheduleName=schedule_name)
    status = schedule['MonitoringScheduleStatus']

    if status == 'Scheduled':
        logger.info('Monitoring schedule (%s) is ready', schedule_name)
        is_ready = True
    elif status == 'Pending':
        logger.info('Monitoring schedule (%s) still creating, waiting and polling again...', schedule_name)
    else:
        raise Exception('Monitoring schedule ({}) has unexpected status: {}'.format(schedule_name, status))

    return is_ready

def create_monitoring_schedule_config(event):
    props = event['ResourceProperties']

    return {
        "ScheduleConfig": {
            "ScheduleExpression": props["ScheduleExpression"],
        },
        "MonitoringJobDefinition": {
            "BaselineConfig": {
                "ConstraintsResource": {
                    "S3Uri": props['BaselineConstraintsUri'],
                },
                "StatisticsResource": {
                    "S3Uri": props['BaselineStatisticsUri'],
                }
            },
            "MonitoringInputs": [
                {
                    "EndpointInput": {
                        "EndpointName": props["EndpointName"],
                        "LocalPath": props["InputLocalPath"],
                    }
                }
            ],
            "MonitoringOutputConfig": {
                "MonitoringOutputs": [
                    {
                        "S3Output": {
                            "S3Uri": props["OutputS3URI"],
                            "LocalPath": props["OutputLocalPath"],
                        }
                    }
                ],
            },
            "MonitoringResources": {
                "ClusterConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.t3.medium",
                    "VolumeSizeInGB": 50,
                }
            },
            "MonitoringAppSpecification": {
                "ImageUri": props["ImageURI"],
                "RecordPreprocessorSourceUri": props['PostAnalyticsProcessorSourceUri'],
                "PostAnalyticsProcessorSourceUri": props['PostAnalyticsProcessorSourceUri'],
            },
            "StoppingCondition": {
                "MaxRuntimeInSeconds": 300
            },
            "RoleArn": props["PassRoleArn"],
        }
    }


def delete_monitoring_schedule(schedule_name):
    logger.info('Deleting schedule: %s', schedule_name)
    try:
        sm.delete_monitoring_schedule(MonitoringScheduleName=schedule_name)
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFound':
            logger.info('Resource not found, nothing to delete')
        else:
            logger.error('Unexpected error while trying to delete monitoring schedule')
            raise e
```

# Model Monitor FAQs
<a name="model-monitor-faqs"></a>

Refer to the following FAQs for more information about Amazon SageMaker Model Monitor.

**Q: How do Model Monitor and SageMaker Clarify help customers monitor model behavior?**

Customers can monitor model behavior along four dimensions through Amazon SageMaker Model Monitor and SageMaker Clarify: [Data quality](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-quality.html), [Model quality](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality.html), [Bias drift](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-monitor-bias-drift.html), and [Feature attribution drift](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-monitor-feature-attribution-drift.html). [Model Monitor](https://aws.amazon.com/sagemaker/model-monitor/) continuously monitors the quality of Amazon SageMaker AI machine learning models in production, including drift in data quality and in model quality metrics such as accuracy and RMSE. [SageMaker Clarify](https://aws.amazon.com/sagemaker/clarify/?sagemaker-data-wrangler-whats-new.sort-by=item.additionalFields.postDateTime&sagemaker-data-wrangler-whats-new.sort-order=desc) bias monitoring helps data scientists and ML engineers monitor bias in a model's predictions and drift in feature attribution.

**Q: What happens in the background when SageMaker Model Monitor is enabled?**

Amazon SageMaker Model Monitor automates model monitoring, alleviating the need to monitor models manually or build additional tooling. To automate the process, Model Monitor provides the ability to create a set of baseline statistics and constraints using the data with which your model was trained, and then set up a schedule to monitor the predictions made on your endpoint. Model Monitor uses rules to detect drift in your models and alerts you when it happens. The following steps describe what happens when you enable model monitoring:
+ **Enable model monitoring**: For a real-time endpoint, you have to enable the endpoint to capture data from incoming requests to a deployed ML model and the resulting model predictions. For a batch transform job, enable data capture of the batch transform inputs and outputs.
+ **Baseline processing job**: You then create a baseline from the dataset that was used to train the model. The baseline computes metrics and suggests constraints for the metrics. For example, the recall score for the model shouldn't regress and drop below 0.571, or the precision score shouldn't fall below 1.0. Real-time or batch predictions from your model are compared to the constraints and are reported as violations if they are outside the constrained values.
+ **Monitoring job**: Then, you create a monitoring schedule specifying what data to collect, how often to collect it, how to analyze it, and which reports to produce.
+ **Merge job**: This is only applicable if you are leveraging Amazon SageMaker Ground Truth. Model Monitor compares the predictions your model makes with Ground Truth labels to measure the quality of the model. For this to work, you periodically label data captured by your endpoint or batch transform job and upload it to Amazon S3. 

  After you create and upload the Ground Truth labels, include the location of the labels as a parameter when you create the monitoring job. 

When you use Model Monitor with a batch transform job instead of a real-time endpoint, rather than receiving requests to an endpoint and tracking the predictions, Model Monitor monitors inference inputs and outputs. In a Model Monitor schedule, the customer provides the count and type of instances to be used in the processing job. These resources remain reserved until the schedule is deleted, irrespective of the status of the current execution.

**Q: What is Data Capture, why is it required, and how can I enable it?**

In order to log the inputs to the model endpoint and the inference outputs from the deployed model to Amazon S3, you can enable a feature called [Data Capture](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture.html). For more details about how to enable it for a real-time endpoint and batch transform job, see [Capture data from real-time endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture-endpoint.html) and [Capture data from batch transform job](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture-batch.html).

**Q: Does enabling Data Capture impact the performance of a real-time endpoint?**

Data Capture happens asynchronously without impacting production traffic. After you enable data capture, the request and response payload, along with some additional metadata, is saved in the Amazon S3 location that you specified in `DataCaptureConfig`. Note that there can be a delay in the propagation of the captured data to Amazon S3.

You can also view the captured data by listing the data capture files stored in Amazon S3. The format of the Amazon S3 path is `s3:///{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl`. The Amazon S3 data capture location should be in the same Region as the Model Monitor schedule. Also ensure that the column names for the baseline dataset contain only lowercase letters and use an underscore (`_`) as the only separator.
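For example, a hypothetical helper that lists the captured `.jsonl` files under that prefix with `boto3` (the bucket, endpoint, and variant names are placeholders; the `s3` parameter exists so a client can be injected):

```python
def list_capture_files(bucket, endpoint_name, variant_name, date_prefix, s3=None):
    """List captured .jsonl files for one endpoint variant and date prefix."""
    if s3 is None:
        import boto3  # deferred so a stub client can be injected for testing
        s3 = boto3.client("s3")
    prefix = f"{endpoint_name}/{variant_name}/{date_prefix}"
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return [k for k in keys if k.endswith(".jsonl")]
```

For instance, `list_capture_files("my-capture-bucket", "my-endpoint", "AllTraffic", "2019/12/01/15/")` would return the capture files for that hour, given placeholder names.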

**Q: Why is Ground Truth needed for model monitoring?**

Ground Truth labels are required by the following features of Model Monitor:
+ **Model quality monitoring** compares the predictions your model makes with Ground Truth labels to measure the quality of the model.
+ **Model bias monitoring** monitors predictions for bias. One way bias can be introduced in deployed ML models is when the data used in training differs from the data used to generate predictions. This is especially pronounced if the data used for training changes over time (such as fluctuating mortgage rates), and the model prediction is not as accurate unless the model is retrained with updated data. For example, a model for predicting home prices can be biased if the mortgage rates used to train the model differ from the most current real-world mortgage rate.

**Q: For customers leveraging Ground Truth for labeling, what are the steps I can take to monitor the quality of the model?**

Model quality monitoring compares the predictions your model makes with Ground Truth labels to measure the quality of the model. For this to work, you periodically label the data captured by your endpoint or batch transform job and upload the labels to Amazon S3. In addition to captures, model bias monitoring also requires Ground Truth data. In real use cases, Ground Truth data should be collected regularly and uploaded to the designated Amazon S3 location. To match Ground Truth labels with captured prediction data, each record in the dataset must have a unique identifier. For the structure of each Ground Truth record, see [Ingest Ground Truth Labels and Merge Them With Predictions](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-merge.html).

You can use the following code example to generate artificial Ground Truth data for a tabular dataset. It assumes that `test_dataset_size` and `ground_truth_upload_path` are defined elsewhere in your notebook, and it uses the `WorkerThread` class defined in the next example.

```
import json
import random
import time
from datetime import datetime, timedelta

from sagemaker.s3 import S3Uploader


def ground_truth_with_id(inference_id):
    random.seed(inference_id)  # to get consistent results
    rand = random.random()
    # format required by the merge container
    return {
        "groundTruthData": {
            "data": "1" if rand < 0.7 else "0",  # randomly generate positive labels 70% of the time
            "encoding": "CSV",
        },
        "eventMetadata": {
            "eventId": str(inference_id),
        },
        "eventVersion": "0",
    }


def upload_ground_truth(upload_time):
    records = [ground_truth_with_id(i) for i in range(test_dataset_size)]
    fake_records = [json.dumps(r) for r in records]
    data_to_upload = "\n".join(fake_records)
    target_s3_uri = f"{ground_truth_upload_path}/{upload_time:%Y/%m/%d/%H/%M%S}.jsonl"
    print(f"Uploading {len(fake_records)} records to", target_s3_uri)
    S3Uploader.upload_string_as_file_body(data_to_upload, target_s3_uri)


# Generate data for the last hour
upload_ground_truth(datetime.utcnow() - timedelta(hours=1))


# Generate data once an hour
def generate_fake_ground_truth(terminate_event):
    upload_ground_truth(datetime.utcnow())
    for _ in range(0, 60):
        time.sleep(60)
        if terminate_event.is_set():
            break


ground_truth_thread = WorkerThread(do_run=generate_fake_ground_truth)
ground_truth_thread.start()
```

The following code example shows how to generate artificial traffic to send to the model endpoint. Notice the `InferenceId` attribute passed to the invocation. If it is present, it is used to join with Ground Truth data (otherwise, the `eventId` is used).

```
import threading
import time

# endpoint_name, test_dataset, and sagemaker_runtime_client are assumed
# to be defined elsewhere in your notebook.


class WorkerThread(threading.Thread):
    def __init__(self, do_run, *args, **kwargs):
        super(WorkerThread, self).__init__(*args, **kwargs)
        self.__do_run = do_run
        self.__terminate_event = threading.Event()

    def terminate(self):
        self.__terminate_event.set()

    def run(self):
        while not self.__terminate_event.is_set():
            self.__do_run(self.__terminate_event)


def invoke_endpoint(terminate_event):
    with open(test_dataset, "r") as f:
        i = 0
        for row in f:
            payload = row.rstrip("\n")
            response = sagemaker_runtime_client.invoke_endpoint(
                EndpointName=endpoint_name,
                ContentType="text/csv",
                Body=payload,
                InferenceId=str(i),  # unique ID per row
            )
            i += 1
            response["Body"].read()
            time.sleep(1)
            if terminate_event.is_set():
                break


# Keep invoking the endpoint with test data
invoke_endpoint_thread = WorkerThread(do_run=invoke_endpoint)
invoke_endpoint_thread.start()
```

You must upload Ground Truth data to an Amazon S3 bucket that has the same path format as captured data, which is in the following format: `s3://<bucket>/<prefix>/yyyy/mm/dd/hh`

**Note**  
The date in this path is the date when the Ground Truth label is collected. It doesn't have to match the date when the inference was generated.

**Q: How can customers customize monitoring schedules?**

In addition to using the built-in monitoring mechanisms, you can create your own custom monitoring schedules and procedures using pre-processing and post-processing scripts, or by using or building your own container. Note that pre-processing and post-processing scripts only work with data quality and model quality jobs.

Amazon SageMaker AI lets you monitor and evaluate the data observed by model endpoints. First, create a baseline against which to compare the real-time traffic. After the baseline is ready, set up a schedule to continuously evaluate traffic and compare it against the baseline. While creating the schedule, you can provide your pre-processing and post-processing scripts.

The following example shows how you can customize monitoring schedules with pre-processing and post-processing scripts.

```
import os

import boto3
from sagemaker import get_execution_role, Session
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor

# Upload the pre-processor and post-processor scripts, then reference them by S3 URI
session = Session()
bucket_name = session.default_bucket()
bucket = boto3.Session().resource("s3").Bucket(bucket_name)
prefix = "demo-sagemaker-model-monitor"
bucket.Object(os.path.join(prefix, "preprocessor.py")).upload_file("preprocessor.py")
bucket.Object(os.path.join(prefix, "postprocessor.py")).upload_file("postprocessor.py")
pre_processor_script = f"s3://{bucket_name}/{prefix}/preprocessor.py"
post_processor_script = f"s3://{bucket_name}/{prefix}/postprocessor.py"

# Get the execution role
role = get_execution_role()

# Instance type
instance_type = "instance-type"
# instance_type = "ml.m5.xlarge" # Example

# Create a monitoring schedule with pre-processing and post-processing
my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type=instance_type,
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

s3_report_path = f"s3://{bucket_name}/reports"
monitor_schedule_name = "monitor-schedule-name"
endpoint_name = "endpoint-name"
my_default_monitor.create_monitoring_schedule(
    post_analytics_processor_script=post_processor_script,
    record_preprocessor_script=pre_processor_script,
    monitor_schedule_name=monitor_schedule_name,
    # use endpoint_input for a real-time endpoint
    endpoint_input=endpoint_name,
    # or use batch_transform_input for batch transform jobs
    # batch_transform_input=batch_transform_name,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)
```

**Q: What are some of the scenarios or use cases where I can leverage a pre-processing script?**

You can use pre-processing scripts when you need to transform the inputs to your model monitor. Consider the following example scenarios:

1. Pre-processing script for data transformation.

   Suppose the output of your model is an array: `[1.0, 2.1]`. The Model Monitor container only works with tabular or flattened JSON structures, such as `{"prediction0": 1.0, "prediction1": 2.1}`. You could use a pre-processing script like the following example to transform the array into the correct JSON structure.

   ```
   def preprocess_handler(inference_record):
       input_data = inference_record.endpoint_input.data
       output_data = inference_record.endpoint_output.data.rstrip("\n")
       data = output_data + "," + input_data
       return { str(i).zfill(20) : d for i, d in enumerate(data.split(",")) }
   ```

1. Exclude certain records from Model Monitor's metric calculations.

   Suppose your model has optional features and you use `-1` to denote that the optional feature has a missing value. If you have a data quality monitor, you may want to remove the `-1` from the input value array so that it isn't included in the monitor's metric calculations. You could use a script like the following to remove those values.

   ```
   def preprocess_handler(inference_record):
       input_data = inference_record.endpoint_input.data
       # input_data is a CSV string, so compare against the string "-1"
       return {i: None if x == "-1" else x for i, x in enumerate(input_data.split(","))}
   ```

1. Apply a custom sampling strategy.

   You can also apply a custom sampling strategy in your pre-processing script. To do this, configure Model Monitor's first-party, pre-built container to ignore a percentage of the records according to your specified sampling rate. In the following example, the handler samples 10% of the records by returning the record in 10% of handler calls and an empty list otherwise.

   ```
   import random
   
   def preprocess_handler(inference_record):
       # sampling rate of 0.1: keep roughly 10% of the records
       if random.random() > 0.1:
           # drop this record by returning an empty list
           return []
       input_data = inference_record.endpoint_input.data
       # input_data is a CSV string, so compare against the string "-1"
       return {i: None if x == "-1" else x for i, x in enumerate(input_data.split(","))}
   ```

1. Use custom logging.

   You can log any information you need from your script to Amazon CloudWatch. This can be useful when debugging your pre-processing script in case of an error. The following example shows how you can use the `preprocess_handler` interface to log to CloudWatch.

   ```
   def preprocess_handler(inference_record, logger):
       logger.info(f"I'm a processing record: {inference_record}")
       logger.debug(f"I'm debugging a processing record: {inference_record}")
       logger.warning(f"I'm processing record with missing value: {inference_record}")
       logger.error(f"I'm a processing record with bad value: {inference_record}")
       return inference_record
   ```

**Note**  
When the pre-processing script is run on batch transform data, the input type is not always the `CapturedData` object. For CSV data, the type is a string. For JSON data, the type is a Python dictionary.
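Given this variation, a pre-processing script intended for both real-time and batch transform data can dispatch on the input type. The following is a hedged sketch; the flattened key layout is illustrative:

```python
def preprocess_handler(inference_record):
    # Batch transform CSV data arrives as a raw string
    if isinstance(inference_record, str):
        return {str(i): v for i, v in enumerate(inference_record.rstrip("\n").split(","))}
    # Batch transform JSON data arrives as an already-parsed dict
    if isinstance(inference_record, dict):
        return inference_record
    # Real-time capture: a CapturedData object with endpoint_input
    return {str(i): v for i, v in enumerate(inference_record.endpoint_input.data.split(","))}
```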

**Q: When can I leverage a post-processing script?**

You can use a post-processing script as an extension point that runs after a successful monitoring run. The following is a simple example, but you can perform or call any business function that you need after a successful monitoring run.

```
def postprocess_handler(): 
    print("Hello from the post-processing script!")
```

**Q: When should I consider bringing my own container for model monitoring?**

SageMaker AI provides a pre-built container for analyzing data captured from endpoints or batch transform jobs for tabular datasets. However, there are scenarios where you might want to create your own container. Consider the following scenarios:
+ You have regulatory and compliance requirements to only use the containers that are created and maintained internally in your organization.
+ If you want to include a few third-party libraries, you can place a `requirements.txt` file in a local directory and reference it using the `source_dir` parameter in the [SageMaker AI estimator](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Estimator), which installs the libraries at runtime. However, if you have many libraries or dependencies that increase installation time while the job runs, you might want to bring your own container (BYOC).
+ Your environment has no internet connectivity (for example, a siloed environment), which prevents package downloads.
+ You want to monitor data in formats other than tabular, such as NLP or CV use cases.
+ You require monitoring metrics beyond those that Model Monitor supports.

**Q: I have NLP and CV models. How do I monitor them for data drift?**

Amazon SageMaker AI's prebuilt container supports tabular datasets. If you want to monitor NLP and CV models, you can bring your own container by leveraging the extension points provided by Model Monitor. For details about the requirements, see [Bring your own containers](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-containers.html). The following are some examples:
+ For a detailed explanation of how to use Model Monitor for a computer vision use case, see [Detecting and Analyzing incorrect predictions](https://aws.amazon.com/blogs/machine-learning/detecting-and-analyzing-incorrect-model-predictions-with-amazon-sagemaker-model-monitor-and-debugger/).
+ For a scenario where Model Monitor can be leveraged for a NLP use case, see [Detect NLP data drift using custom Amazon SageMaker Model Monitor](https://aws.amazon.com/blogs/machine-learning/detect-nlp-data-drift-using-custom-amazon-sagemaker-model-monitor/).

**Q: I want to delete the model endpoint for which Model Monitor was enabled, but I'm unable to do so since the monitoring schedule is still active. What should I do?**

If you want to delete an inference endpoint hosted in SageMaker AI which has Model Monitor enabled, first you must delete the model monitoring schedule (with the `DeleteMonitoringSchedule` [CLI](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/delete-monitoring-schedule.html) or [API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DeleteMonitoringSchedule.html)). Then, delete the endpoint.
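That order of operations can be sketched with boto3 as follows. The endpoint name is a placeholder, and this assumes your credentials are allowed to call these APIs:

```python
def delete_endpoint_with_monitoring(sm_client, endpoint_name):
    # DeleteEndpoint fails while a monitoring schedule still references the
    # endpoint, so delete all attached schedules first.
    schedules = sm_client.list_monitoring_schedules(EndpointName=endpoint_name)
    for schedule in schedules["MonitoringScheduleSummaries"]:
        sm_client.delete_monitoring_schedule(
            MonitoringScheduleName=schedule["MonitoringScheduleName"]
        )
    sm_client.delete_endpoint(EndpointName=endpoint_name)

# Usage (requires AWS credentials and a real endpoint):
# import boto3
# delete_endpoint_with_monitoring(boto3.client("sagemaker"), "my-endpoint")
```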

**Q: Does SageMaker Model Monitor calculate metrics and statistics for input?**

Model Monitor calculates metrics and statistics for output, not input.

**Q: Does SageMaker Model Monitor support multi-model endpoints?**

No, Model Monitor only supports endpoints that host a single model and doesn't support monitoring multi-model endpoints.

**Q: Does SageMaker Model Monitor provide monitoring data about individual containers in an inference pipeline?**

Model Monitor supports monitoring inference pipelines, but capturing and analyzing data is done for the entire pipeline, not for individual containers in the pipeline.

**Q: What can I do to prevent impact to inference requests when data capture is set up?**

To prevent impact to inference requests, Data Capture stops capturing requests at high levels of disk usage. We recommend that you keep disk utilization below 75% to ensure that Data Capture continues to capture requests.

**Q: Can Amazon S3 Data Capture be in a different AWS region than the region in which the monitoring schedule was set up?**

No, Amazon S3 Data Capture must be in the same region as the monitoring schedule.

**Q: What is a baseline, and how do I create one? Can I create a custom baseline?**

A baseline is used as a reference against which real-time or batch predictions from the model are compared. A baseline job computes statistics and metrics on a reference dataset, along with constraints on them. During monitoring, all of these are used together to identify violations.

To use the default solution of Amazon SageMaker Model Monitor, you can leverage the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html). Specifically, use the [suggest_baseline](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.DefaultModelMonitor.suggest_baseline) method of the [ModelMonitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.DefaultModelMonitor) or the [ModelQualityMonitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.ModelQualityMonitor) class to trigger a processing job that computes the metrics and constraints for the baseline.

A baseline job produces two files: `statistics.json` and `constraints.json`. [Schema for statistics](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-statistics.html) and [schema for constraints](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-constraints.html) contain the schema of the respective files. You can review the generated constraints and modify them before using them for monitoring. Based on your understanding of the domain and business problem, you can make a constraint more aggressive, or relax it to control the number and nature of the violations.
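For example, you could load the suggested constraints and relax one of them before attaching the file to a schedule. The following is a sketch under the assumption that the file follows the constraints schema linked above; the `relax_completeness` helper and the sample snippet are illustrative:

```python
def relax_completeness(constraints, feature_name, completeness):
    # Lower the required completeness (fraction of non-null values) for one feature
    for feature in constraints["features"]:
        if feature["name"] == feature_name:
            feature["completeness"] = completeness
    return constraints

# constraints.json produced by a baseline job contains a "features" list
constraints = {
    "features": [
        {"name": "income", "inferred_type": "Fractional", "completeness": 1.0},
    ]
}
relaxed = relax_completeness(constraints, "income", 0.9)
```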

**Q: What are the guidelines to create a baseline dataset?**

The primary requirement for any kind of monitoring is to have a baseline dataset that is used to compute metrics and constraints. Typically, this is the training dataset used by the model, but in some cases you might choose to use some other reference dataset.

The column names of the baseline dataset should be compatible with Spark. To maintain maximum compatibility across Spark, CSV, JSON, and Parquet, use only lowercase letters and use `_` as the only separator. Special characters, including spaces and quotation marks, can cause issues.
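A small helper along these lines can normalize header names before you create the baseline. The helper and the sample header are illustrative:

```python
import re

def to_baseline_column(name):
    # Lowercase the name, replace any run of non-alphanumeric characters
    # with "_", and trim leading/trailing separators.
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

header = ["Annual Income", "ZIP-Code", "loan_status"]
normalized = [to_baseline_column(c) for c in header]
# → ["annual_income", "zip_code", "loan_status"]
```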

**Q: What are the `StartTimeOffset` and `EndTimeOffset` parameters, and when are they used?**

When Amazon SageMaker Ground Truth is required for monitoring jobs like model quality, you need to ensure that a monitoring job only uses data for which Ground Truth is available. The `start_time_offset` and `end_time_offset` parameters of [EndpointInput](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.EndpointInput) can be used to select the data that the monitoring job uses. The monitoring job uses the data in the time window that is defined by `start_time_offset` and `end_time_offset`. These parameters need to be specified in the [ISO 8601 duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations). The following are some examples:
+ If your Ground Truth results arrive 3 days after the predictions have been made, set `start_time_offset="-P3D"` and `end_time_offset="-P1D"`, which is 3 days and 1 day respectively.
+ If the Ground Truth results arrive 6 hours after the predictions and you have an hourly schedule, set `start_time_offset="-PT6H"` and `end_time_offset="-PT1H"`, which is 6 hours and 1 hour.
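To see how the offsets define the analysis window, the following sketch resolves the hourly-schedule example against a run time. The `offset_to_timedelta` helper is illustrative and handles only the offset forms shown above:

```python
import re
from datetime import datetime, timedelta

def offset_to_timedelta(offset):
    # Minimal parser for offsets like "-P3D" (days) or "-PT6H" (hours)
    match = re.fullmatch(r"-P(?:(\d+)D)?(?:T(\d+)H)?", offset)
    days, hours = (int(g) if g else 0 for g in match.groups())
    return -timedelta(days=days, hours=hours)

# The hourly-schedule example: analyze data from 6 hours ago up to 1 hour ago
run_time = datetime(2024, 6, 1, 12)
window_start = run_time + offset_to_timedelta("-PT6H")  # 2024-06-01 06:00
window_end = run_time + offset_to_timedelta("-PT1H")    # 2024-06-01 11:00
```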

**Q: Can I run on-demand monitoring jobs?**

Yes, you can run on-demand monitoring jobs by running a SageMaker Processing job. For Batch Transform, [Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-overview.html) has a [MonitorBatchTransformStep](https://sagemaker.readthedocs.io/en/stable/workflows/pipelines/sagemaker.workflow.pipelines.html#sagemaker.workflow.monitor_batch_transform_step.MonitorBatchTransformStep) that you can use to create a SageMaker AI pipeline that runs monitoring jobs on demand. The SageMaker AI examples repository has code samples for running [data quality](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_model_monitor/model_monitor_batch_transform/SageMaker-ModelMonitoring-Batch-Transform-Data-Quality-With-SageMaker-Pipelines-On-Demand.ipynb) and [model quality](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_model_monitor/model_monitor_batch_transform/SageMaker-ModelMonitoring-Batch-Transform-Model-Quality-With-SageMaker-Pipelines-On-Demand.ipynb) monitoring jobs on demand.

**Q: How do I set up Model Monitor?**

You can set up Model Monitor in the following ways:
+ **[Amazon SageMaker AI Python SDK](https://sagemaker.readthedocs.io/en/stable/index.html)** – There is a [Model Monitor module](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html) which contains classes and functions that assist in suggesting baselines, creating monitoring schedules, and more. See the [Amazon SageMaker Model Monitor notebook examples](https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker_model_monitor) for detailed notebooks that leverage the SageMaker AI Python SDK for setting up Model Monitor.
+ **[Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-overview.html)** – Pipelines are integrated with Model Monitor through the [QualityCheck Step](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#step-type-quality-check) and [ClarifyCheckStep](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#step-type-clarify-check) APIs. You can create a SageMaker AI pipeline that contains these steps and that can be used to run monitoring jobs on demand whenever the pipeline is executed.
+ **[Amazon SageMaker Studio Classic](https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html)** – You can create a data or model quality monitoring schedule along with model bias and explainability schedules directly from the UI by selecting an endpoint from the list of deployed model endpoints. Schedules for other types of monitoring can be created by selecting the relevant tab in the UI.
+ **[SageMaker Model Dashboard](https://docs.aws.amazon.com/sagemaker/latest/dg/model-dashboard.html)** – You can enable monitoring on endpoints by selecting a model that has been deployed to an endpoint. In the following screenshot of the SageMaker AI console, a model named `group1` has been selected from the **Models** section of the **Model dashboard**. On this page, you can create a monitoring schedule, and you can edit, activate, or deactivate existing monitoring schedules and alerts. For a step-by-step guide on how to view alerts and Model Monitor schedules, see [View Model Monitor schedules and alerts](https://docs.aws.amazon.com/sagemaker/latest/dg/model-dashboard-schedule.html).

![\[Screenshot of the Model dashboard, showing the option to create a monitoring schedule.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/model-monitoring-faqs-screenshot.png)


**Q: How does Model Monitor integrate with SageMaker Model Dashboard?**

[SageMaker Model Dashboard](https://docs.aws.amazon.com/sagemaker/latest/dg/model-dashboard.html) gives you unified monitoring across all your models. It provides automated alerts about deviations from expected behavior, and it helps you troubleshoot by letting you inspect models and analyze the factors that impact model performance over time.