BedrockRuntime / Client / invoke_guardrail_checks

invoke_guardrail_checks

BedrockRuntime.Client.invoke_guardrail_checks(**kwargs)

Evaluates messages against inline guardrail checks. You specify the check configurations directly in the request, and Amazon Bedrock returns per-check results with severity or confidence scores.

See also: AWS API Documentation

Request Syntax

response = client.invoke_guardrail_checks(
    messages=[
        {
            'role': 'user'|'assistant'|'system',
            'content': [
                {
                    'text': 'string'
                },
            ]
        },
    ],
    checks={
        'contentFilter': {
            'categories': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS'
                },
            ]
        },
        'promptAttack': {
            'categories': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE'
                },
            ]
        },
        'sensitiveInformation': {
            'entities': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER'
                },
            ]
        }
    }
)
Parameters:
  • messages (list) –

    [REQUIRED]

    The messages to evaluate against the specified guardrail checks. Each message includes a role and one or more content blocks.

    • (dict) –

      A message to evaluate against guardrail checks, containing a role and content blocks.

      • role (string) – [REQUIRED]

        The role of the message sender.

      • content (list) – [REQUIRED]

        The content blocks for the message.

        • (dict) –

          A content block within a message to evaluate.

          Note

          This is a Tagged Union structure. Only one of the following top level keys can be set: text.

          • text (string) –

            The text content to evaluate.

  • checks (dict) –

    [REQUIRED]

    The inline check configurations that specify which guardrail checks to run against the messages.

    • contentFilter (dict) –

      The content filter check configuration.

      • categories (list) – [REQUIRED]

        The content filter categories to evaluate.

        • (dict) –

          The configuration for a single content filter category to evaluate.

          • category (string) – [REQUIRED]

            The content filter category to evaluate.

    • promptAttack (dict) –

      The prompt attack check configuration.

      • categories (list) – [REQUIRED]

        The prompt attack categories to evaluate.

        • (dict) –

          The configuration for a single prompt attack category to evaluate.

          • category (string) – [REQUIRED]

            The prompt attack category to evaluate.

    • sensitiveInformation (dict) –

      The sensitive information check configuration.

      • entities (list) – [REQUIRED]

        The sensitive information entity types to detect.

        • (dict) –

          The configuration for a single sensitive information entity type to detect.

          • type (string) – [REQUIRED]

            The PII entity type to detect.

Return type:

dict

Returns:

Response Syntax

{
    'results': {
        'contentFilter': {
            'results': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS',
                    'severityScore': 123.0
                },
            ]
        },
        'promptAttack': {
            'results': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE',
                    'severityScore': 123.0
                },
            ]
        },
        'sensitiveInformation': {
            'results': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER',
                    'confidenceScore': 123.0,
                    'beginOffset': 123,
                    'endOffset': 123,
                    'messageIndex': 123,
                    'contentIndex': 123
                },
            ],
            'truncated': True|False
        }
    },
    'usage': {
        'contentFilter': {
            'textUnits': 123
        },
        'promptAttack': {
            'textUnits': 123
        },
        'sensitiveInformation': {
            'textUnits': 123
        }
    }
}

Response Structure

  • (dict) –

    • results (dict) –

      The per-check results containing findings from the guardrail evaluation.

      • contentFilter (dict) –

        The content filter check results.

        • results (list) –

          The per-category content filter results.

          • (dict) –

            The evaluation result for a single content filter category.

            • category (string) –

              The content filter category that was evaluated.

            • severityScore (float) –

              The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.

      • promptAttack (dict) –

        The prompt attack check results.

        • results (list) –

          The per-category prompt attack results.

          • (dict) –

            The evaluation result for a single prompt attack category.

            • category (string) –

              The prompt attack category that was evaluated.

            • severityScore (float) –

              The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.

      • sensitiveInformation (dict) –

        The sensitive information check results.

        • results (list) –

          The detected sensitive information entities.

          • (dict) –

            The detection result for a single sensitive information entity found in the evaluated messages.

            • type (string) –

              The PII entity type that was detected.

            • confidenceScore (float) –

              The confidence score for the detection, ranging from 0.0 to 1.0. Higher values indicate greater confidence.

            • beginOffset (integer) –

              The start character offset of the detected entity within the content block.

            • endOffset (integer) –

              The end character offset of the detected entity within the content block.

            • messageIndex (integer) –

              The zero-based index of the message in the input messages array where the entity was detected.

            • contentIndex (integer) –

              The zero-based index of the content block within the message where the entity was detected.

        • truncated (boolean) –

          Specifies whether the results were truncated because the number of detected entities exceeded the maximum limit.

    • usage (dict) –

      The per-check text unit consumption for the guardrail evaluation.

      • contentFilter (dict) –

        The text unit usage for the content filter check.

        • textUnits (integer) –

          The number of text units consumed by the content filter check.

      • promptAttack (dict) –

        The text unit usage for the prompt attack check.

        • textUnits (integer) –

          The number of text units consumed by the prompt attack check.

      • sensitiveInformation (dict) –

        The text unit usage for the sensitive information check.

        • textUnits (integer) –

          The number of text units consumed by the sensitive information check.

Exceptions

  • BedrockRuntime.Client.exceptions.AccessDeniedException

  • BedrockRuntime.Client.exceptions.ThrottlingException

  • BedrockRuntime.Client.exceptions.InternalServerException

  • BedrockRuntime.Client.exceptions.ServiceUnavailableException

  • BedrockRuntime.Client.exceptions.ValidationException