BedrockRuntime / Client / invoke_guardrail_checks

invoke_guardrail_checks¶

BedrockRuntime.Client.invoke_guardrail_checks(**kwargs)¶

Evaluates messages against inline guardrail checks. You specify the check configurations directly in the request, and Amazon Bedrock returns per-check results with severity or confidence scores.

Request Syntax

response = client.invoke_guardrail_checks(
    messages=[
        {
            'role': 'user'|'assistant'|'system',
            'content': [
                {
                    'text': 'string'
                },
            ]
        },
    ],
    checks={
        'contentFilter': {
            'categories': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS'
                },
            ]
        },
        'promptAttack': {
            'categories': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE'
                },
            ]
        },
        'sensitiveInformation': {
            'entities': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER'
                },
            ]
        }
    }
)

Parameters:

messages (list) –
[REQUIRED]

The messages to evaluate against the specified guardrail checks. Each message includes a role and one or more content blocks.
- (dict) –
  
  A message to evaluate against guardrail checks, containing a role and content blocks.
  - role (string) – [REQUIRED]
    
    The role of the message sender.
  - content (list) – [REQUIRED]
    
    The content blocks for the message.
    - (dict) –
      
      A content block within a message to evaluate.
      
      Note
      This is a Tagged Union structure. Only one of the following top level keys can be set: text.
      - text (string) –
        
        The text content to evaluate.
checks (dict) –
[REQUIRED]

The inline check configurations that specify which guardrail checks to run against the messages.
- contentFilter (dict) –
  
  The content filter check configuration.
  - categories (list) – [REQUIRED]
    
    The content filter categories to evaluate.
    - (dict) –
      
      The configuration for a single content filter category to evaluate.
      - category (string) – [REQUIRED]
        
        The content filter category to evaluate.
- promptAttack (dict) –
  
  The prompt attack check configuration.
  - categories (list) – [REQUIRED]
    
    The prompt attack categories to evaluate.
    - (dict) –
      
      The configuration for a single prompt attack category to evaluate.
      - category (string) – [REQUIRED]
        
        The prompt attack category to evaluate.
- sensitiveInformation (dict) –
  
  The sensitive information check configuration.
  - entities (list) – [REQUIRED]
    
    The sensitive information entity types to detect.
    - (dict) –
      
      The configuration for a single sensitive information entity type to detect.
      - type (string) – [REQUIRED]
        
        The PII entity type to detect.

Return type:

dict

Returns:

Response Syntax

{
    'results': {
        'contentFilter': {
            'results': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS',
                    'severityScore': 123.0
                },
            ]
        },
        'promptAttack': {
            'results': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE',
                    'severityScore': 123.0
                },
            ]
        },
        'sensitiveInformation': {
            'results': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER',
                    'confidenceScore': 123.0,
                    'beginOffset': 123,
                    'endOffset': 123,
                    'messageIndex': 123,
                    'contentIndex': 123
                },
            ],
            'truncated': True|False
        }
    },
    'usage': {
        'contentFilter': {
            'textUnits': 123
        },
        'promptAttack': {
            'textUnits': 123
        },
        'sensitiveInformation': {
            'textUnits': 123
        }
    }
}

Response Structure

(dict) –
- results (dict) –
  
  The per-check results containing findings from the guardrail evaluation.
  - contentFilter (dict) –
    
    The content filter check results.
    - results (list) –
      
      The per-category content filter results.
      - (dict) –
        
        The evaluation result for a single content filter category.
        
        category (string) –
        
        The content filter category that was evaluated.
        
        severityScore (float) –
        
        The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.
  - promptAttack (dict) –
    
    The prompt attack check results.
    - results (list) –
      
      The per-category prompt attack results.
      - (dict) –
        
        The evaluation result for a single prompt attack category.
        
        category (string) –
        
        The prompt attack category that was evaluated.
        
        severityScore (float) –
        
        The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.
  - sensitiveInformation (dict) –
    
    The sensitive information check results.
    - results (list) –
      
      The detected sensitive information entities.
      - (dict) –
        
        The detection result for a single sensitive information entity found in the evaluated messages.
        
        type (string) –
        
        The PII entity type that was detected.
        
        confidenceScore (float) –
        
        The confidence score for the detection, ranging from 0.0 to 1.0. Higher values indicate greater confidence.
        
        beginOffset (integer) –
        
        The start character offset of the detected entity within the content block.
        
        endOffset (integer) –
        
        The end character offset of the detected entity within the content block.
        
        messageIndex (integer) –
        
        The zero-based index of the message in the input messages array where the entity was detected.
        
        contentIndex (integer) –
        
        The zero-based index of the content block within the message where the entity was detected.
    - truncated (boolean) –
      
      Specifies whether the results were truncated because the number of detected entities exceeded the maximum limit.
- usage (dict) –
  
  The per-check text unit consumption for the guardrail evaluation.
  - contentFilter (dict) –
    
    The text unit usage for the content filter check.
    - textUnits (integer) –
      
      The number of text units consumed by the content filter check.
  - promptAttack (dict) –
    
    The text unit usage for the prompt attack check.
    - textUnits (integer) –
      
      The number of text units consumed by the prompt attack check.
  - sensitiveInformation (dict) –
    
    The text unit usage for the sensitive information check.
    - textUnits (integer) –
      
      The number of text units consumed by the sensitive information check.

Exceptions

BedrockRuntime.Client.exceptions.AccessDeniedException
BedrockRuntime.Client.exceptions.ThrottlingException
BedrockRuntime.Client.exceptions.InternalServerException
BedrockRuntime.Client.exceptions.ServiceUnavailableException
BedrockRuntime.Client.exceptions.ValidationException