sample_with_response_stream

SagemakerJobRuntimeService.Client.sample_with_response_stream(**kwargs)

Sends a streaming inference request to the model during a job execution and returns the response as a stream of payload chunks. Each turn is captured for later use.

Request Syntax

response = client.sample_with_response_stream(
    JobArn='string',
    TrajectoryId='string',
    Body=b'bytes'|file
)

Parameters:

JobArn (string) –
[REQUIRED]

The job ARN that identifies which model session to route the inference request to.
TrajectoryId (string) –
[REQUIRED]

The trajectory ID that groups turns into a single rollout. Turns sent with the same trajectory ID belong to the same rollout. Each turn is captured for later use.
Body (bytes or seekable file-like object) –
[REQUIRED]

The raw inference request body in OpenAI-compatible JSON format.
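
A minimal sketch of how the three parameters fit together, assuming an OpenAI-style chat payload. The field names `messages` and `stream`, the helpers `build_body` and `send_turns`, and the use of a UUID as the trajectory ID are illustrative assumptions; this page only states that the body is OpenAI-compatible JSON and that the trajectory ID is a string.

```python
import json
import uuid


def build_body(messages, stream=True):
    """Serialize a chat-style request to UTF-8 JSON bytes for Body."""
    # "messages" and "stream" are assumed OpenAI-style fields, not confirmed here.
    return json.dumps({"messages": messages, "stream": stream}).encode("utf-8")


def send_turns(client, job_arn, turns):
    """Send each turn under one TrajectoryId so the turns form a single rollout."""
    trajectory_id = str(uuid.uuid4())  # assumed ID format, chosen for uniqueness
    responses = []
    for messages in turns:
        responses.append(
            client.sample_with_response_stream(
                JobArn=job_arn,
                TrajectoryId=trajectory_id,
                Body=build_body(messages),
            )
        )
    return trajectory_id, responses
```

Here `client` is expected to be an already configured client for this service, and each element of `turns` is the message list for one request.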

Return type:

dict

Returns:

Response Syntax

{
    'ContentType': 'string',
    'Body': StreamingBody()
}

Response Structure

(dict) –
- ContentType (string) –
  
  MIME type of the streaming inference result.
- Body (StreamingBody) –
  
  The streaming response body, delivered as a series of PayloadPart events.
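
A hedged sketch of consuming the response incrementally, assuming `Body` behaves like botocore's `StreamingBody` as shown in the Response Syntax and exposes `iter_chunks`. The helper `iter_text` and the UTF-8 default are assumptions; the actual encoding should be taken from `ContentType`.

```python
import codecs


def iter_text(body, chunk_size=1024, encoding="utf-8"):
    """Yield decoded text from a streaming body as chunks arrive."""
    # An incremental decoder handles multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in body.iter_chunks(chunk_size=chunk_size):
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush any bytes still buffered in the decoder at end of stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

A typical call would be `for piece in iter_text(response['Body']): print(piece, end='')`. If the body instead yields structured PayloadPart events when iterated, the loop would need to extract the bytes from each event first.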

Exceptions

SagemakerJobRuntimeService.Client.exceptions.ResourceNotFoundException
SagemakerJobRuntimeService.Client.exceptions.InternalServiceError
SagemakerJobRuntimeService.Client.exceptions.ValidationException
SagemakerJobRuntimeService.Client.exceptions.ServiceQuotaExceededException
SagemakerJobRuntimeService.Client.exceptions.ThrottlingException
SagemakerJobRuntimeService.Client.exceptions.AccessDeniedException
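
One way to handle these exceptions is to retry only on throttling and let the others propagate. The sketch below assumes the client exposes the exception classes under `client.exceptions`, as listed above; the helper `sample_with_retry` and its backoff parameters are hypothetical.

```python
import time


def sample_with_retry(client, max_attempts=3, base_delay=0.5, **kwargs):
    """Call sample_with_response_stream, retrying only on ThrottlingException."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.sample_with_response_stream(**kwargs)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttling error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Errors such as ValidationException or AccessDeniedException are not retried here, since repeating the same request is unlikely to change the outcome.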