strategy

The strategy to use when scaling in instances. Possible values:

IDLE_RELEASE

Releases instances that have no hosted inference component copies.

CONSOLIDATION

Consolidates inference component copies onto fewer instances so that more instances can be released. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation proceeds only when the resulting distribution does not increase the imbalance across Availability Zones.
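
The difference between the two values can be illustrated with a small Python sketch. This is not the service's implementation: the helper names (`idle_instances`, `az_imbalance`, `consolidation_allowed`), the instance and zone identifiers, and the definition of imbalance as the gap between the most and least loaded Availability Zone are all assumptions made for illustration.

```python
from collections import Counter

def idle_instances(placement):
    """IDLE_RELEASE view: instances that host zero copies (hypothetical helper)."""
    return sorted(i for i, copies in placement.items() if copies == 0)

def az_imbalance(placement, instance_az):
    """Assumed metric: copies in the fullest AZ minus copies in the emptiest AZ."""
    counts = Counter({az: 0 for az in set(instance_az.values())})
    for instance, copies in placement.items():
        counts[instance_az[instance]] += copies
    return max(counts.values()) - min(counts.values())

def consolidation_allowed(before, after, instance_az):
    """CONSOLIDATION view: accept a move only if AZ imbalance does not grow."""
    return az_imbalance(after, instance_az) <= az_imbalance(before, instance_az)

# Hypothetical fleet: two instances in az-a, one in az-b.
instance_az = {"i-a1": "az-a", "i-a2": "az-a", "i-b1": "az-b"}

# Copies of one inference component per instance before scale-in.
before = {"i-a1": 1, "i-a2": 1, "i-b1": 2}

# Candidate 1: move the copy on i-a2 to i-a1 (same AZ).
same_az = {"i-a1": 2, "i-a2": 0, "i-b1": 2}

# Candidate 2: move the copy on i-a2 to i-b1 (different AZ).
cross_az = {"i-a1": 1, "i-a2": 0, "i-b1": 3}

idle_before = idle_instances(before)
same_az_ok = consolidation_allowed(before, same_az, instance_az)
cross_az_ok = consolidation_allowed(before, cross_az, instance_az)
```

In this sketch, every instance hosts at least one copy, so `idle_before` is empty and an IDLE_RELEASE-style check would release nothing. A CONSOLIDATION-style check would accept the same-zone move, which empties `i-a2` while keeping both zones at two copies, and would reject the cross-zone move, which leaves one copy in `az-a` and three in `az-b`.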