CreateAiRecommendationJobRequest
Types
Properties
The name of the AI recommendation job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region.
The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this recommendation job.
The compute resource specification for the recommendation job. You can specify up to three instance types to consider and, optionally, a capacity reservation configuration.
The inference framework configuration. Specify the framework (such as LMI or vLLM) for the recommendation job.
The source of the model to optimize. Specify the Amazon S3 location of the model artifacts.
Whether to allow model optimization techniques such as quantization, speculative decoding, and kernel tuning. The default is true.
The output configuration for the recommendation job, including the Amazon S3 location for results and an optional model package group where the optimized model is registered.
The performance targets for the recommendation job. Specify constraints on metrics such as time to first token (ttft-ms), throughput, or cost.
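The property descriptions above can be summarized in a minimal sketch. This is not the SDK's actual type: the class name, every field name, and the example values are hypothetical placeholders chosen for illustration, because this page does not preserve the real property names. Only the constraints stated above (at most three instance types, optimization allowed by default, ttft-ms as a target metric) are taken from the descriptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Limit taken from the compute resource description ("up to 3 instance types").
MAX_INSTANCE_TYPES = 3


@dataclass
class RecommendationJobSketch:
    """Hypothetical stand-in for the request; all field names are illustrative."""

    job_name: str                       # unique per account and Region
    workload_config: str                # name or ARN of the AI workload configuration
    instance_types: List[str]           # candidate instance types to consider
    model_s3_location: str              # Amazon S3 location of the model artifacts
    output_s3_location: str             # Amazon S3 location for results
    framework: Optional[str] = None     # for example "LMI" or "vLLM"
    allow_optimization: bool = True     # documented default is true
    model_package_group: Optional[str] = None
    performance_targets: Dict[str, float] = field(default_factory=dict)

    def validate(self) -> None:
        """Check only the one numeric limit the descriptions state explicitly."""
        if len(self.instance_types) > MAX_INSTANCE_TYPES:
            raise ValueError(
                f"at most {MAX_INSTANCE_TYPES} instance types may be specified, "
                f"got {len(self.instance_types)}"
            )


# Example values are placeholders, not recommendations.
job = RecommendationJobSketch(
    job_name="example-recommendation-job",
    workload_config="example-workload-config",
    instance_types=["ml.g5.xlarge", "ml.g5.2xlarge"],
    model_s3_location="s3://example-bucket/model/",
    output_s3_location="s3://example-bucket/results/",
    framework="vLLM",
    performance_targets={"ttft-ms": 200.0},
)
job.validate()
```

Validating the instance type count on the client side, as sketched here, surfaces the documented limit before a request is sent. The service may enforce further constraints that this page does not describe.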