ProductionVariantManagedInstanceScaling - Amazon SageMaker

ProductionVariantManagedInstanceScaling

Settings that control the minimum and maximum number of instances that the endpoint provisions as it scales up or down to accommodate traffic.

Contents

MaxInstanceCount

The maximum number of instances that the endpoint can provision when it scales up to accommodate an increase in traffic.

Type: Integer

Valid Range: Minimum value of 1.

Required: No

MinInstanceCount

The minimum number of instances that the endpoint must retain when it scales down to accommodate a decrease in traffic.

Type: Integer

Valid Range: Minimum value of 0.

Required: No

ScaleInPolicy

Configures the scale-in behavior for managed instance scaling.

Type: ProductionVariantManagedInstanceScalingScaleInPolicy object

Required: No

Status

Indicates whether managed instance scaling is enabled.

Type: String

Valid Values: ENABLED | DISABLED

Required: No
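As an illustration of the constraints listed above, the following is a minimal Python sketch that assembles a dictionary with the same member names and checks each documented range and valid value before returning it. The helper name build_managed_instance_scaling and its parameter names are hypothetical and not part of any AWS SDK. The check that MinInstanceCount does not exceed MaxInstanceCount is an assumption added for the example; this page does not state that rule. ScaleInPolicy is omitted because its members are documented separately.

```python
from typing import Any, Dict, Optional

# Valid values for Status, as listed in this reference.
VALID_STATUSES = {"ENABLED", "DISABLED"}


def build_managed_instance_scaling(
    status: Optional[str] = None,
    min_instance_count: Optional[int] = None,
    max_instance_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a ProductionVariantManagedInstanceScaling-shaped dict.

    Every member is optional (Required: No), so members left as None are omitted.
    """
    config: Dict[str, Any] = {}

    if status is not None:
        if status not in VALID_STATUSES:
            raise ValueError(f"Status must be one of {sorted(VALID_STATUSES)}")
        config["Status"] = status

    if min_instance_count is not None:
        # Documented range: minimum value of 0.
        if min_instance_count < 0:
            raise ValueError("MinInstanceCount must be at least 0")
        config["MinInstanceCount"] = min_instance_count

    if max_instance_count is not None:
        # Documented range: minimum value of 1.
        if max_instance_count < 1:
            raise ValueError("MaxInstanceCount must be at least 1")
        config["MaxInstanceCount"] = max_instance_count

    # Assumption (not stated on this page): the minimum should not exceed the maximum.
    if (
        min_instance_count is not None
        and max_instance_count is not None
        and min_instance_count > max_instance_count
    ):
        raise ValueError("MinInstanceCount must not exceed MaxInstanceCount")

    return config


# Example: scaling enabled, between 1 and 4 instances.
example = build_managed_instance_scaling("ENABLED", 1, 4)
```

A dictionary of this shape would typically be supplied wherever an SDK expects this data type, for example as the managed instance scaling setting of a production variant; consult the SDK documentation for the exact parameter placement.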

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: