

# Container Contract Outputs
<a name="model-monitor-byoc-contract-outputs"></a>

The container can analyze the data available in the `dataset_source` path and write reports to the path in `output_path`. The container code can write any reports that suit your needs.

If you use the following structure and contract, SageMaker AI treats certain output files specially in visualizations and in the API. This applies only to tabular datasets.

Output Files for Tabular Datasets


| File Name | Description | 
| --- | --- | 
| statistics.json |  This file is expected to have columnar statistics for each feature in the dataset that is analyzed. The schema for this file is available in the next section.  | 
| constraints.json |  This file is expected to have the constraints on the features observed. The schema for this file is available in the next section.  | 
| constraints\_violations.json |  This file is expected to have the list of violations found in the current set of data, as compared to the baseline statistics and constraints files specified in the `baseline_statistics` and `baseline_constraints` paths.  | 

In addition, if the `publish_cloudwatch_metrics` value is `"Enabled"`, container code can emit Amazon CloudWatch metrics in this location: `/opt/ml/output/metrics/cloudwatch`. The schema for these files is described in the following sections.
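
As a rough illustration of the report files listed in the table above, the following Python sketch writes the three file names into a directory. The function name `write_reports`, its parameters, and the placeholder payloads are hypothetical; in particular, the `{"violations": []}` body is only a stand-in, because the violations schema is not described in this section. How your container obtains the `output_path` value is also outside the scope of this sketch.

```python
import json
import os
import tempfile


def write_reports(output_path, statistics, constraints, violations):
    """Write the three report files that receive special treatment.

    All arguments are plain dicts; their contents are up to the container.
    """
    os.makedirs(output_path, exist_ok=True)
    reports = {
        "statistics.json": statistics,
        "constraints.json": constraints,
        "constraints_violations.json": violations,
    }
    written = []
    for file_name, payload in reports.items():
        target = os.path.join(output_path, file_name)
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=4)
        written.append(target)
    return written


# Demonstration against a temporary directory instead of a real output path.
with tempfile.TemporaryDirectory() as demo_dir:
    paths = write_reports(
        demo_dir,
        statistics={"version": 0, "dataset": {"item_count": 0}, "features": []},
        constraints={"version": 0, "features": []},
        violations={"violations": []},  # placeholder body, not a documented schema
    )
    print([os.path.basename(p) for p in paths])
```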

**Topics**
+ [Schema for Statistics (statistics.json file)](model-monitor-byoc-statistics.md)
+ [Schema for Constraints (constraints.json file)](model-monitor-byoc-constraints.md)

# Schema for Statistics (statistics.json file)
<a name="model-monitor-byoc-statistics"></a>

The schema defined in the `statistics.json` file specifies the statistical parameters to be calculated for the baseline and for the data that is captured. It also configures the buckets to be used by [KLL](https://datasketches.apache.org/docs/KLL/KLLSketch.html), a very compact quantiles sketch with a lazy compaction scheme.

```
{
    "version": 0,
    # dataset level stats
    "dataset": {
        "item_count": number
    },
    # feature level stats
    "features": [
        {
            "name": "feature-name",
            "inferred_type": "Fractional" | "Integral",
            "numerical_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "mean": number,
                "sum": number,
                "std_dev": number,
                "min": number,
                "max": number,
                "distribution": {
                    "kll": {
                        "buckets": [
                            {
                                "lower_bound": number,
                                "upper_bound": number,
                                "count": number
                            }
                        ],
                        "sketch": {
                            "parameters": {
                                "c": number,
                                "k": number
                            },
                            "data": [
                                [
                                    num,
                                    num,
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ],
                                [
                                    num,
                                    num
                                ]
                            ]
                        }#sketch
                    }#KLL
                }#distribution
            }#num_stats
        },
        {
            "name": "feature-name",
            "inferred_type": "String",
            "string_statistics": {
                "common": {
                    "num_present": number,
                    "num_missing": number
                },
                "distinct_count": number,
                "distribution": {
                    "categorical": {
                         "buckets": [
                                {
                                    "value": "string",
                                    "count": number
                                }
                          ]
                     }
                }
            },
            #provision for custom stats
        }
    ]
}
```

**Notes**
+ The metrics specified in this schema are the ones that SageMaker AI recognizes in visualizations. The container can emit more metrics if required.
+ [KLL sketch](https://datasketches.apache.org/docs/KLL/KLLSketch.html) is the recognized sketch. Custom containers can write their own representation, but it won’t be recognized by SageMaker AI in visualizations.
+ By default, the distribution is materialized in 10 buckets. You can't change this.
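
To make the shape of a numerical feature entry more concrete, here is a minimal Python sketch that fills in the `numerical_statistics` fields for one column. It is not the prebuilt container's implementation: the function name is hypothetical, the buckets are simple equal-width bins computed directly from the data (an assumption made for illustration), the standard deviation is the population form (also an assumption), and the `sketch` object is omitted entirely because producing a real KLL sketch requires a sketching library.

```python
import math


def numerical_statistics(values, num_buckets=10):
    """Build a numerical_statistics-shaped dict for one feature.

    `values` may contain None for missing entries.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("at least one non-missing value is required")

    total = math.fsum(present)
    mean = total / len(present)
    # Population standard deviation; the variant used here is an assumption.
    variance = math.fsum((v - mean) ** 2 for v in present) / len(present)
    low, high = min(present), max(present)

    # Equal-width buckets spanning [min, max]; an illustrative choice only.
    width = (high - low) / num_buckets
    buckets = [
        {"lower_bound": low + i * width,
         "upper_bound": low + (i + 1) * width,
         "count": 0}
        for i in range(num_buckets)
    ]
    for v in present:
        # The maximum value is clamped into the last bucket.
        index = 0 if width == 0 else min(int((v - low) / width), num_buckets - 1)
        buckets[index]["count"] += 1

    return {
        "common": {
            "num_present": len(present),
            "num_missing": len(values) - len(present),
        },
        "mean": mean,
        "sum": total,
        "std_dev": math.sqrt(variance),
        "min": low,
        "max": high,
        "distribution": {"kll": {"buckets": buckets}},  # "sketch" omitted
    }


feature = {
    "name": "feature-name",
    "inferred_type": "Fractional",
    "numerical_statistics": numerical_statistics([0.0, 1.0, 2.0, None, 10.0]),
}
```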

# Schema for Constraints (constraints.json file)
<a name="model-monitor-byoc-constraints"></a>

A `constraints.json` file expresses the constraints that a dataset must satisfy. Amazon SageMaker Model Monitor containers can evaluate datasets against the `constraints.json` file. Prebuilt containers can generate the `constraints.json` file automatically for a baseline dataset. If you bring your own container, you can give it similar abilities, or you can create the `constraints.json` file in some other way. The following is the schema for the constraints file that the prebuilt container uses. Containers that you bring yourself can adopt the same format or enhance it as required.

```
{
    "version": 0,
    "features":
    [
        {
            "name": "string",
            "inferred_type": "Integral" | "Fractional" |
                    "String" | "Unknown",
            "completeness": number,
            "num_constraints":
            {
                "is_non_negative": boolean
            },
            "string_constraints":
            {
                "domains":
                [
                    "list of",
                    "observed values",
                    "for small cardinality"
                ]
            },
            "monitoringConfigOverrides":
            {}
        }
    ],
    "monitoring_config":
    {
        "evaluate_constraints": "Enabled",
        "emit_metrics": "Enabled",
        "datatype_check_threshold": 0.1,
        "domain_content_threshold": 0.1,
        "distribution_constraints":
        {
            "perform_comparison": "Enabled",
            "comparison_threshold": 0.1,
            "comparison_method": "Simple" | "Robust",
            "categorical_comparison_threshold": 0.1,
            "categorical_drift_method": "LInfinity" | "ChiSquared"
        }
    }
}
```
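
The per-feature part of this schema can be exercised with a short Python sketch such as the one below. It is only an illustration under stated assumptions: the function name `check_feature` is hypothetical, `completeness` is interpreted as the minimum required fraction of non-missing values, and the thresholds in `monitoring_config` are ignored. The prebuilt container's actual evaluation logic may differ.

```python
def check_feature(constraint, values):
    """Return the names of the constraints that `values` fail to satisfy.

    `constraint` is one entry of the "features" list; `values` is the
    column data, with None marking a missing entry.
    """
    failed = []
    present = [v for v in values if v is not None]

    # Assumption: completeness is the required fraction of non-missing values.
    required = constraint.get("completeness")
    observed = len(present) / len(values) if values else 0.0
    if required is not None and observed < required:
        failed.append("completeness")

    # Only numeric values are compared against zero.
    if constraint.get("num_constraints", {}).get("is_non_negative"):
        numbers = [v for v in present if isinstance(v, (int, float))]
        if any(v < 0 for v in numbers):
            failed.append("is_non_negative")

    # Every present value must be one of the observed domain values.
    domains = constraint.get("string_constraints", {}).get("domains")
    if domains is not None and any(v not in domains for v in present):
        failed.append("domains")

    return failed


age = {
    "name": "age",
    "inferred_type": "Integral",
    "completeness": 1.0,
    "num_constraints": {"is_non_negative": True},
}
print(check_feature(age, [1, 2, None, -3]))
```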

The `monitoring_config` object contains options for the monitoring job. The following table describes each option.

Monitoring Constraints

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-constraints.html)