/AWS1/CL_SGMPRDVARIANTMGDINS00

Configures the scale-in behavior for managed instance scaling.

CONSTRUCTOR

IMPORTING

Required arguments:

iv_strategy TYPE /AWS1/SGMMGDINSCASCALEINSTGY

The strategy for scaling in instances.

IDLE_RELEASE

Releases instances that have no hosted inference component copies.

CONSOLIDATION

Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation only proceeds when the resulting distribution does not increase the imbalance.

Optional arguments:

iv_maximumstepsize TYPE /AWS1/SGMMGDINSCAMAXSTEPSIZE

The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.

Default value: 1.

iv_cooldowninminutes TYPE /AWS1/SGMMGDINSCACOOLDOWNINMIN

The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.

Default value: 20.
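
As a minimal sketch of the constructor described above, the following builds a scale-in configuration using the CONSOLIDATION strategy. The variable name and the numeric values (2 instances per step, 30 minutes of cooldown) are illustrative assumptions, not recommendations; only iv_strategy is required.

```abap
" Sketch only: variable name and values are illustrative assumptions.
" iv_strategy is required; the other two arguments are optional and
" fall back to the documented defaults (1 and 20) when omitted.
DATA(lo_scale_in) = NEW /aws1/cl_sgmprdvariantmgdins00(
  iv_strategy          = 'CONSOLIDATION'
  iv_maximumstepsize   = 2
  iv_cooldowninminutes = 30 ).
```

Omitting iv_maximumstepsize and iv_cooldowninminutes leaves those fields without a value on the object, which is the case the HAS_ methods are meant to detect.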


Queryable Attributes

Strategy

The strategy for scaling in instances.

IDLE_RELEASE

Releases instances that have no hosted inference component copies.

CONSOLIDATION

Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation only proceeds when the resulting distribution does not increase the imbalance.

Accessible with the following methods

Method Description
GET_STRATEGY() Getter for STRATEGY, with configurable default
ASK_STRATEGY() Getter for STRATEGY w/ exceptions if field has no value
HAS_STRATEGY() Determine if STRATEGY has a value

MaximumStepSize

The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.

Default value: 1.

Accessible with the following methods

Method Description
GET_MAXIMUMSTEPSIZE() Getter for MAXIMUMSTEPSIZE, with configurable default
ASK_MAXIMUMSTEPSIZE() Getter for MAXIMUMSTEPSIZE w/ exceptions if field has no value
HAS_MAXIMUMSTEPSIZE() Determine if MAXIMUMSTEPSIZE has a value

CooldownInMinutes

The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.

Default value: 20.

Accessible with the following methods

Method Description
GET_COOLDOWNINMINUTES() Getter for COOLDOWNINMINUTES, with configurable default
ASK_COOLDOWNINMINUTES() Getter for COOLDOWNINMINUTES w/ exceptions if field has no value
HAS_COOLDOWNINMINUTES() Determine if COOLDOWNINMINUTES has a value
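
The three accessor styles listed above can be combined as in the following sketch. It assumes an object built with only the required strategy, so the optional fields have no value. The variable names are hypothetical, and because this page does not name the exception class raised by the ASK_ methods, the example catches cx_root as a deliberately broad placeholder.

```abap
" Sketch only: variable names are hypothetical.
DATA(lo_config) = NEW /aws1/cl_sgmprdvariantmgdins00(
  iv_strategy = 'IDLE_RELEASE' ).

" GET_ returns the stored value of the field.
DATA(lv_strategy) = lo_config->get_strategy( ).

" HAS_ reports whether an optional field was set before it is read.
IF lo_config->has_maximumstepsize( ) = abap_true.
  DATA(lv_step_size) = lo_config->get_maximumstepsize( ).
ENDIF.

" ASK_ raises an exception when the field has no value. The concrete
" exception class is not stated on this page, so cx_root is caught
" here as an assumption-free but intentionally broad handler.
TRY.
    DATA(lv_cooldown) = lo_config->ask_cooldowninminutes( ).
  CATCH cx_root.
    " Field was not set; handle the missing value here.
ENDTRY.
```

Checking with HAS_ first avoids exception handling on the common path, while ASK_ suits code that treats a missing value as an error.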