Inference configuration for a model.
The maximum number of tokens to generate.
Stop sequences: strings that end generation as soon as the model produces one of them.
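The temperature for sampling. Higher values make the output more random; lower values make it more deterministic.
The top-p value for nucleus sampling. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches this value.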
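
The four settings above could be grouped and used roughly as follows. This is a minimal sketch, not the actual schema. The names `InferenceConfig`, `apply_stop_sequences`, and `nucleus_filter` are hypothetical, and the validation ranges (positive `max_tokens`, non-negative `temperature`, `top_p` between 0 and 1) are assumptions that the descriptions above do not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InferenceConfig:
    """Hypothetical container for the settings described above."""
    max_tokens: Optional[int] = None                          # maximum tokens to generate
    stop_sequences: List[str] = field(default_factory=list)   # strings that end generation
    temperature: Optional[float] = None                       # sampling temperature
    top_p: Optional[float] = None                             # nucleus sampling threshold

    def __post_init__(self) -> None:
        # Assumed ranges; a real service may enforce different limits.
        if self.max_tokens is not None and self.max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        if self.temperature is not None and self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if self.top_p is not None and not (0.0 <= self.top_p <= 1.0):
            raise ValueError("top_p must be between 0 and 1")


def apply_stop_sequences(text: str, stop_sequences: List[str]) -> str:
    """Cut text at the earliest occurrence of any non-empty stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty strings, which would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def nucleus_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize the kept probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    total = 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept}
```

For example, `InferenceConfig(max_tokens=256, stop_sequences=["\n\n"], temperature=0.7, top_p=0.9)` builds a configuration, and `apply_stop_sequences("Hello\n\nWorld", ["\n\n"])` trims the text to `"Hello"`. Whether the stop sequence itself is included in the returned text varies between systems; this sketch excludes it.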