

# AWS Database Migration Service
<a name="automation-ref-dms"></a>

AWS Systems Manager Automation provides predefined runbooks for AWS Database Migration Service. For more information about runbooks, see [Working with runbooks](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-documents.html). For information about how to view runbook content, see [View runbook content](automation-runbook-reference.md#view-automation-json).

**Topics**
+ [`AWSSupport-TroubleshootDMSTableErrors`](awssupport-troubleshoot-dms-table-errors.md)
+ [`AWSSupport-TroubleshootDMSEndpointConnection`](automation-awssupport-troubleshootdmsendpointconnection.md)

# `AWSSupport-TroubleshootDMSTableErrors`
<a name="awssupport-troubleshoot-dms-table-errors"></a>

 **Description** 

 The AWS Systems Manager **AWSSupport-TroubleshootDMSTableErrors** automation runbook helps you automate troubleshooting of `Table errors` in a Database migration task or Serverless replication in AWS Database Migration Service. These errors occur when the Database migration task or Serverless replication created in AWS DMS fails to migrate tables from the source endpoint (source database) to the target endpoint (target database). The runbook analyzes signature error messages in CloudWatch logs, using task logs for a traditional Database migration task and serverless logs for a Serverless replication. It also provides targeted suggestions and remediation steps for common error messages associated with `Table error` during AWS DMS migrations. 

 **How does it work?** 

 The runbook performs the following steps: 
+ Fetches information about the provided AWS DMS ARN, which can be either a Database migration task or a Serverless replication.
+ Verifies if the provided AWS DMS resource has been started at least once by checking the `FreshStartDate` value in the DescribeReplicationTasks API (for Database migration task) and DescribeReplications API (for Serverless replication) response. If the resource has not started, the automation raises an error.
+ If the resource has started, the automation checks for tables in the `TableError` state using `TableStatistics` information. If no errors are found, the automation ends the workflow after displaying a message confirming that no table errors were found in the specified Database migration task or Serverless replication.
+ If tables with `TableError` state are found, the automation checks if CloudWatch logging is enabled for the specified AWS DMS resource. If logging is not enabled, the automation ends the workflow after displaying a message indicating that logging is not enabled. 

  **Note:** CloudWatch logging is expected to be enabled, because the automation relies on these logs to analyze and identify the issues with the tables in `TableError` state.
+ If logging is enabled, the automation analyzes the CloudWatch logs and generates a report for each table that is in `TableError` state. The report includes suggestions for common error messages and provides relevant error logs to help identify and resolve issues preventing successful table migration from the AWS DMS source endpoint to the AWS DMS target endpoint.
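
The branching described above can be summarized as a small decision function. This is a minimal sketch for illustration only: the function name `next_action` and the returned labels are hypothetical and are not taken from the runbook, and the three inputs are assumed to have been gathered already from the AWS DMS API responses.

```python
from typing import Optional


def next_action(fresh_start_date: Optional[str],
                table_error_count: int,
                logging_enabled: bool) -> str:
    """Return an illustrative label for the branch the workflow would take.

    fresh_start_date: the FreshStartDate value, or None if the resource
        has never been started (assumed representation).
    table_error_count: number of tables in the TableError state.
    logging_enabled: whether CloudWatch logging is enabled on the resource.
    """
    if fresh_start_date is None:
        # The resource has never run, so there is nothing to analyze.
        return "error: resource has not been started"
    if table_error_count == 0:
        return "output: no table errors found"
    if not logging_enabled:
        # The analysis depends on CloudWatch logs.
        return "output: CloudWatch logging is not enabled"
    return "analyze CloudWatch logs"
```

The order of the checks mirrors the list: the start check comes first, then the table error count, and the logging check is reached only when at least one table is in the `TableError` state.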

 [Run this Automation (console)](https://console.aws.amazon.com/systems-manager/automation/execute/AWSSupport-TroubleshootDMSTableErrors) 

**Document type**

Automation

**Owner**

Amazon

**Platforms**

/

**Parameters**
+ AutomationAssumeRole

  Type: String

  Description: (Optional) The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that allows Systems Manager Automation to perform the actions on your behalf. If no role is specified, Systems Manager Automation uses the permissions of the user that starts this runbook.
+ DMSArn

  Type: String

  Description: (Required) ARN of the Database migration task or Serverless replication 

  Allowed Pattern: `^arn:(aws|aws-cn|aws-us-gov|aws-iso|aws-iso-b):dms:[a-z0-9-]+:\d{12}:(task|replication-config):[a-zA-Z0-9-]+$`
+ StartTimeRange

  Type: String

  Description: (Optional) This parameter defines the beginning of the time range for CloudWatch logs analysis of the given Database migration task or Serverless replication. When provided, only logs generated from this time onward are collected and analyzed. Note that the workflow might time out if the range between `StartTimeRange` and `EndTimeRange` is too long. Provide the value in ISO 8601 date-time format.

  Allowed Pattern: `^$|^(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2})\\.(\\d{3})Z$` 
+ EndTimeRange

  Type: String

  Description: (Optional) This parameter sets the end of the time range for CloudWatch logs analysis of the given Database migration task or Serverless replication. When provided, only logs generated up to this time are collected and analyzed. Note that the workflow might time out if the range between `StartTimeRange` and `EndTimeRange` is too long. Provide the value in ISO 8601 date-time format.

  Allowed Pattern: `^$|^(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2})\\.(\\d{3})Z$` 
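
Checking input values locally against the allowed patterns before starting the automation can catch formatting mistakes early. The following is a minimal sketch, assuming that any doubled backslashes in the patterns above are string escaping rather than literal backslashes. The helper name `to_runbook_timestamp` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Patterns copied from the parameter descriptions, written as Python raw strings.
DMS_ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov|aws-iso|aws-iso-b):dms:[a-z0-9-]+:\d{12}:"
    r"(task|replication-config):[a-zA-Z0-9-]+$"
)
TIME_RANGE_PATTERN = re.compile(
    r"^$|^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\.(\d{3})Z$"
)


def to_runbook_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC with milliseconds and a trailing Z."""
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


# Example usage with a made-up timestamp.
start = to_runbook_timestamp(datetime(2024, 1, 15, 8, 30, 0, 123456, tzinfo=timezone.utc))
print(start, bool(TIME_RANGE_PATTERN.match(start)))
```

Note that the time pattern requires exactly three fractional digits, so a value such as `2024-01-15T08:30:00Z` would not match under this reading of the pattern.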

**Required IAM permissions**

The `AutomationAssumeRole` parameter requires the following actions to use the runbook successfully.
+ `dms:DescribeReplicationTasks`
+ `dms:DescribeReplications`
+ `dms:DescribeEndpoints`
+ `dms:DescribeReplicationConfigs`
+ `dms:DescribeTableStatistics`
+ `dms:DescribeReplicationTableStatistics`
+ `logs:FilterLogEvents`

 **Example IAM Policy for the Automation Assume Role** 

------
#### [ JSON ]


```
            {
                "Version": "2012-10-17",
                "Statement": [
                    {
                        "Sid": "VisualEditor0",
                        "Effect": "Allow",
                        "Action": [
                            "dms:DescribeReplicationConfigs",
                            "dms:DescribeEndpoints",
                            "dms:DescribeReplicationTableStatistics",
                            "dms:DescribeTableStatistics",
                            "logs:FilterLogEvents",
                            "dms:DescribeReplicationTasks",
                            "dms:DescribeReplications"
                        ],
                        "Resource": "*"
                    }
                ]
            }
```

------
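
To confirm that a policy document lists every action named in the required permissions, a simple set comparison is often enough. The sketch below is deliberately naive: it ignores wildcards, `Deny` statements, conditions, and resource scoping, so it is not a substitute for IAM policy evaluation. The function name `missing_actions` is hypothetical.

```python
import json

# Actions listed under "Required IAM permissions" for this runbook.
REQUIRED_ACTIONS = {
    "dms:DescribeReplicationTasks",
    "dms:DescribeReplications",
    "dms:DescribeEndpoints",
    "dms:DescribeReplicationConfigs",
    "dms:DescribeTableStatistics",
    "dms:DescribeReplicationTableStatistics",
    "logs:FilterLogEvents",
}


def missing_actions(policy_json: str) -> set:
    """Return required actions that no Allow statement names explicitly."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may contain a single statement object instead of a list.
        statements = [statements]
    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted
```

An empty result means that every required action appears by name in at least one `Allow` statement.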

 **Instructions** 

Follow these steps to configure the automation:

1. Navigate to [https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSTableErrors/description](https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSTableErrors/description) in Systems Manager under Documents.

1. Select **Execute automation**.

1. For the input parameters, enter the following:
   + **AutomationAssumeRole (Optional):**

     The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that allows Systems Manager Automation to perform the actions on your behalf. If no role is specified, Systems Manager Automation uses the permissions of the user who starts this runbook.
   + **DMSArn**

     ARN of the Database migration task or Serverless replication which has Table errors. 
   + **StartTimeRange**

     (Optional) The start of the time range for analyzing CloudWatch logs of the given Database migration task or Serverless replication, in ISO 8601 date-time format.
   + **EndTimeRange**

     (Optional) The end of the time range for analyzing CloudWatch logs of the given Database migration task or Serverless replication, in ISO 8601 date-time format.

1. Select the **Execute** button at the bottom of the page.

1. The automation initiates.

1. The document performs the following steps:
   + **validateDMSInputTypeAndGatherDetails**

     Validates the given AWS DMS ARN input and gathers the basic details of the Database migration task or Serverless replication that are required in the next steps.
   + **branchOnTableErrors**

     Branches the workflow based on the number of Table errors found in the previous step. If the count is greater than 0, the workflow proceeds to the `branchOnCWLoggingStatus` step. Otherwise, it proceeds to the `outputNoTableErrors` step.
   + **outputNoTableErrors**

     Outputs a message stating that no table errors were found in the given Database migration task or Serverless replication.
   + **branchOnCWLoggingStatus**

     Branches the workflow based on the CloudWatch logging status found in the previous step. If logging is enabled, the workflow proceeds to the `gatherTableDetails` step. Otherwise, it proceeds to the `outputNoCWLoggingEnabled` step.
   + **outputNoCWLoggingEnabled**

     Outputs a message stating that the CloudWatch logging is not enabled in the given Database migration task or Serverless replication.
   + **gatherTableDetails**

     Gathers the `FullLoadEndTime` timestamps of the failed tables and calculates the time range values used to analyze the CloudWatch logs.
   + **analyzeCloudWatchLogs**

     Analyzes the logs found in the CloudWatch log group based on the signature error messages and returns the report to the user.

1. After the execution completes, review the Outputs section for the detailed results of the execution.
   + **Output of No Table errors found**

     If there are no table errors found in the provided Database migration task or Serverless replication, the automation shows the output stating the same. 
   + **Output of No CloudWatch logging enabled**

     If CloudWatch logging is not enabled in the provided Database migration task or Serverless replication, the automation shows the output stating the same and provides the steps to enable logging. 
   + **Log analysis report**

     Outputs a report that identifies tables in `Table error` state in the provided Database migration task or Serverless replication, differentiates between error types, lists the error messages encountered, and provides targeted remediation steps and suggestions for each identified table.
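
The console steps above can also be approximated programmatically. The sketch below assumes that `boto3` is installed and that credentials are configured. It uses the Systems Manager `start_automation_execution` call, which expects each parameter value as a list of strings. The helper names `build_parameters` and `start_table_error_analysis` are hypothetical.

```python
def build_parameters(dms_arn, start_time="", end_time="", assume_role=""):
    """Build the Parameters mapping for the runbook, omitting empty optional values."""
    parameters = {"DMSArn": [dms_arn]}
    if start_time:
        parameters["StartTimeRange"] = [start_time]
    if end_time:
        parameters["EndTimeRange"] = [end_time]
    if assume_role:
        parameters["AutomationAssumeRole"] = [assume_role]
    return parameters


def start_table_error_analysis(parameters, region_name):
    """Start the automation and return its execution ID."""
    # Imported lazily so that the helper above can be used without boto3.
    import boto3

    ssm = boto3.client("ssm", region_name=region_name)
    response = ssm.start_automation_execution(
        DocumentName="AWSSupport-TroubleshootDMSTableErrors",
        Parameters=parameters,
    )
    return response["AutomationExecutionId"]
```

Keeping parameter construction separate from the API call makes the input easy to inspect or log before anything is started in the account.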

 **References** 

Systems Manager Automation
+ [Run this Automation (console)](https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSTableErrors/description)
+ [Run an automation](https://docs.aws.amazon.com//systems-manager/latest/userguide/automation-working-executing.html)
+ [Setting up an Automation](https://docs.aws.amazon.com//systems-manager/latest/userguide/automation-setup.html)
+ [Support Automation Workflows landing page](https://aws.amazon.com/premiumsupport/technology/saw/)

# `AWSSupport-TroubleshootDMSEndpointConnection`
<a name="automation-awssupport-troubleshootdmsendpointconnection"></a>

 **Description** 

The **AWSSupport-TroubleshootDMSEndpointConnection** runbook helps diagnose and troubleshoot connectivity issues between AWS Database Migration Service replication instances and AWS DMS endpoints. The automation uses [Reachability Analyzer](https://docs.aws.amazon.com/vpc/latest/reachability/what-is-reachability-analyzer.html) checks to test network connectivity and analyzes the network configuration to identify potential connectivity problems that could prevent successful AWS DMS migrations.

**Important**  
You must test the connectivity between the AWS DMS replication instance and endpoint using the AWS DMS console or API before running this runbook. If you haven't tested the connection, do so first; otherwise, you might need to rerun this runbook. Both the AWS DMS replication instance and endpoint must be in an available state for accurate connectivity testing.

**Important**  
This runbook creates and invokes AWS Lambda functions, which will incur Lambda charges. Each Reachability Analyzer analysis run also incurs charges. For pricing details, see the [Amazon VPC Pricing](https://aws.amazon.com/vpc/pricing/) page under the Network Analysis tab and [AWS Lambda Pricing](https://aws.amazon.com/lambda/pricing/).

 **How does it work?** 

The runbook performs a systematic analysis of AWS DMS connectivity through the following phases:

**Phase 1: Resource Validation and Prerequisites**
+ **Endpoint Validation**: Verifies the AWS DMS endpoint exists, retrieves its configuration (server name, port, engine type), and confirms the database engine is supported for troubleshooting.
+ **Connection Test Status**: Retrieves the current connection test status between the replication instance and endpoint using the AWS DMS `DescribeConnections` API, including any failure messages from previous test attempts.
+ **Replication Instance Analysis**: Gathers network configuration details including Amazon VPC ID, subnet IDs, security group IDs, and identifies the associated Elastic Network Interface (ENI) for the replication instance.

**Phase 2: DNS Resolution and Network Path Discovery**
+ **Amazon VPC-based DNS Resolution**: Creates a temporary Lambda function within the same Amazon VPC as the replication instance to resolve the endpoint hostname to its IP address from within the Amazon VPC context, ensuring accurate private DNS resolution.
+ **Target Identification**: Determines the appropriate target for Reachability Analyzer based on whether the endpoint is within the same Amazon VPC (uses ENI) or external (uses resolved IP address).
+ **IPv6 Compatibility Check**: Validates that resolved addresses are IPv4, as Reachability Analyzer does not support IPv6 addresses.
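
The IPv4 check and the choice between an ENI target and an IP address target can be illustrated with the Python standard library. This is a sketch under stated assumptions: the text above does not say how the runbook decides whether an endpoint is in the same Amazon VPC, so membership in the VPC CIDR blocks is used here as a stand-in. The function names and the `"eni"` and `"ip"` labels are hypothetical.

```python
import ipaddress


def is_ipv4(address: str) -> bool:
    """Return True only if the string parses as an IPv4 address."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not a valid IP address at all.
        return False


def choose_target(resolved_ip: str, vpc_cidrs: list) -> str:
    """Pick an illustrative target type for the resolved endpoint address.

    Assumption: an address inside one of the VPC CIDR blocks is treated as a
    same-VPC endpoint (ENI target); any other address is treated as external.
    """
    if not is_ipv4(resolved_ip):
        raise ValueError("Reachability Analyzer does not support IPv6 addresses")
    address = ipaddress.ip_address(resolved_ip)
    in_vpc = any(address in ipaddress.ip_network(cidr) for cidr in vpc_cidrs)
    return "eni" if in_vpc else "ip"
```

For example, with a VPC CIDR of `10.0.0.0/16`, an address of `10.0.1.25` would be classified as a same-VPC target under this assumption, while a public address would be treated as external.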

**Phase 3: Comprehensive Network Path Analysis**
+ **Reachability Analyzer Execution**: Creates a Network Insights Path from the replication instance ENI to the target (endpoint ENI or IP address) and executes a comprehensive analysis to test TCP connectivity on the specified port.
+ **Multi-layer Network Analysis**: Examines the complete network path including route tables, security groups, network ACLs, internet gateways, NAT gateways, Amazon VPC peering connections, and transit gateways to identify connectivity barriers.
+ **Detailed Explanation Generation**: For failed connectivity, provides specific explanations for each network component that blocks traffic, including exact rule numbers, CIDR blocks, port ranges, and protocol restrictions.

**Phase 4: Report Generation and Resource Cleanup**
+ **Comprehensive Reporting**: Generates a detailed report containing connection test summary, network path analysis results, and specific failure explanations with remediation guidance.
+ **Resource Management**: Automatically cleans up created resources (Lambda function, IAM roles, Network Insights Paths) unless the PersistReachabilityAnalyzerResults parameter is set to retain analysis results for further investigation.
+ **Error Handling**: Provides specific error reports for various failure scenarios including unsupported database engines, missing resources, DNS resolution failures, and permission issues.

The runbook supports troubleshooting connectivity for multiple database engines including Amazon Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon Neptune, Amazon Redshift, Amazon S3, Azure SQL Database, DB2, MySQL, Oracle, PostgreSQL, SQL Server, and many others.

 [Run this Automation (console)](https://console.aws.amazon.com/systems-manager/automation/execute/AWSSupport-TroubleshootDMSEndpointConnection) 

**Document type**

Automation

**Owner**

Amazon

**Platforms**

/

**Required IAM permissions**

The `AutomationAssumeRole` parameter requires the following actions to use the runbook successfully.
+ `cloudformation:CreateStack`
+ `cloudformation:DeleteStack`
+ `cloudformation:DescribeStacks`
+ `cloudformation:DescribeStackEvents`
+ `dms:DescribeEndpoints`
+ `dms:DescribeReplicationInstances`
+ `dms:DescribeConnections`
+ `iam:GetRole`
+ `iam:PassRole`
+ `iam:SimulatePrincipalPolicy`
+ `lambda:CreateFunction`
+ `lambda:DeleteFunction`
+ `lambda:GetFunction`
+ `lambda:InvokeFunction`
+ `lambda:ListTags`
+ `lambda:TagResource`
+ `lambda:UntagResource`
+ `lambda:UpdateFunctionCode`

 **Optional IAM permissions** 

The following permissions are only required within the `AutomationAssumeRole` if you do not provide a `LambdaRoleArn` parameter and want the automation to create the Lambda execution role for you:
+ `iam:CreateRole`
+ `iam:DeleteRole`
+ `iam:AttachRolePolicy`
+ `iam:DetachRolePolicy`
+ `iam:TagRole`
+ `iam:UntagRole`
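
The relationship between the required and optional permissions can be expressed as a simple set union. The sketch below only restates the two lists above. The function name `actions_needed` is hypothetical, and the example does not cover the Reachability Analyzer managed policy.

```python
# Actions always required in the AutomationAssumeRole.
BASE_ACTIONS = frozenset({
    "cloudformation:CreateStack", "cloudformation:DeleteStack",
    "cloudformation:DescribeStacks", "cloudformation:DescribeStackEvents",
    "dms:DescribeEndpoints", "dms:DescribeReplicationInstances",
    "dms:DescribeConnections",
    "iam:GetRole", "iam:PassRole", "iam:SimulatePrincipalPolicy",
    "lambda:CreateFunction", "lambda:DeleteFunction", "lambda:GetFunction",
    "lambda:InvokeFunction", "lambda:ListTags", "lambda:TagResource",
    "lambda:UntagResource", "lambda:UpdateFunctionCode",
})

# Actions needed only when the automation creates the Lambda execution role.
ROLE_CREATION_ACTIONS = frozenset({
    "iam:CreateRole", "iam:DeleteRole", "iam:AttachRolePolicy",
    "iam:DetachRolePolicy", "iam:TagRole", "iam:UntagRole",
})


def actions_needed(lambda_role_arn: str = "") -> frozenset:
    """Return the actions the AutomationAssumeRole needs for a given input."""
    if lambda_role_arn:
        # An existing role is supplied, so no role management is required.
        return BASE_ACTIONS
    return BASE_ACTIONS | ROLE_CREATION_ACTIONS
```

Supplying your own `LambdaRoleArn` therefore lets you grant the automation role a narrower set of IAM actions.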

**Important**  
 In addition to the actions listed above, the `AutomationAssumeRole` should have the [AmazonVPCReachabilityAnalyzerFullAccessPolicy](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonVPCReachabilityAnalyzerFullAccessPolicy.html) as an [attached managed policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) so that the [Reachability Analyzer](https://docs.aws.amazon.com/vpc/latest/reachability/what-is-reachability-analyzer.html) tests can run successfully. 

Example Policy: 

```
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "AllowDMSTroubleshootingActions",
                    "Effect": "Allow",
                    "Action": [
                        "dms:DescribeEndpoints",
                        "dms:DescribeReplicationInstances",
                        "dms:DescribeConnections",
                        "lambda:GetFunction",
                        "lambda:ListTags",
                        "lambda:CreateFunction",
                        "lambda:DeleteFunction",
                        "lambda:TagResource",
                        "lambda:UntagResource",
                        "lambda:UpdateFunctionCode",
                        "cloudformation:DescribeStacks",
                        "cloudformation:DescribeStackEvents",
                        "cloudformation:CreateStack",
                        "cloudformation:DeleteStack",
                        "iam:GetRole",
                        "iam:SimulatePrincipalPolicy",
                        "iam:CreateRole",
                        "iam:DeleteRole",
                        "iam:TagRole",
                        "iam:UntagRole"
                    ],
                    "Resource": "*"
                },
                {
                    "Sid": "AllowDMSLambdaInvocation",
                    "Effect": "Allow",
                    "Action": [
                        "lambda:InvokeFunction"
                    ],
                    "Resource": "arn:*:lambda:*:*:function:AWSSupport-TroubleshootDMSEndpointConnection-*"
                },
                {
                    "Sid": "AllowPassRoleToDMSLambda",
                    "Effect": "Allow",
                    "Action": [
                        "iam:PassRole"
                    ],
                    "Resource": "arn:*:iam::*:role/AWSSupport-TroubleshootDMSEndpointConnection-*",
                    "Condition": {
                        "StringLikeIfExists": {
                            "iam:PassedToService": "lambda.amazonaws.com"
                        }
                    }
                },
                {
                    "Sid": "AllowRolePolicyManagement",
                    "Effect": "Allow",
                    "Action": [
                        "iam:AttachRolePolicy",
                        "iam:DetachRolePolicy"
                    ],
                    "Resource": "*",
                    "Condition": {
                        "StringLikeIfExists": {
                            "iam:ResourceTag/AWSSupport-TroubleshootDMSEndpointConnection": "true"
                        }
                    }
                }
            ]
        }
```

 **Instructions** 

Follow these steps to configure the automation:

1. Navigate to [https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSEndpointConnection/description](https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSEndpointConnection/description) in Systems Manager under Documents.

1. Select **Execute automation**.

1. For the input parameters, enter the following:
   + **AutomationAssumeRole (Optional):**
     + Description: (Optional) The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that allows SSM Automation to perform the actions on your behalf. If no role is specified, SSM Automation uses the permissions of the user who starts this runbook.
     + Type: `AWS::IAM::Role::Arn`
   + **DmsEndpointArn (Required)**
     + Description: (Required) The Amazon Resource Name (ARN) of the AWS Database Migration Service Endpoint.
     + Type: `String`
     + Allowed Pattern: `^arn:(aws|aws-cn|aws-us-gov|aws-iso|aws-iso-b):dms:[a-z0-9-]+:\\d{12}:endpoint:[A-Z0-9]{1,48}$`
   + **DmsReplicationInstanceArn (Required)**
     + Description: (Required) The Amazon Resource Name (ARN) of the AWS Database Migration Service Replication instance.
     + Type: `String`
     + Allowed Pattern: `^arn:(aws|aws-cn|aws-us-gov|aws-iso|aws-iso-b):dms:[a-z0-9-]+:\\d{12}:rep:[A-Z0-9]+$`
   + **PersistReachabilityAnalyzerResults (Optional)**
     + Description: (Optional) Specifies whether to keep the results of the Network Insights Analysis run.
     + Type: `Boolean`
     + Allowed Values: `[true, false]`
     + Default: `false`
   + **LambdaRoleArn (Optional)**
     + Description: (Optional) The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role that allows the AWS Lambda function to access the required AWS services and resources. If no role is specified, this Systems Manager Automation creates an IAM role for Lambda in your account.
     + Type: `AWS::IAM::Role::Arn`
     + Default: `""`
   + **Acknowledge (Required)**
     + Description: (Required) Enter `yes` to acknowledge that this runbook will create a Lambda function in your account and will create an IAM role if no `LambdaRoleArn` is provided.
     + Type: `String`
     + Allowed Pattern: `^[Yy][Ee][Ss]$`

1. Select **Execute**.

1. The automation initiates.

1. The document performs the following steps:
   + **DescribeEndpointAndCheckEngine:**

     Retrieves the AWS DMS endpoint configuration and validates if the database engine type is supported for troubleshooting. Extracts server name, port, and engine type from the endpoint configuration.
   + **BranchOnEndpointAndCheckEngineErrors:**

     Branches the automation based on any errors from the endpoint validation. If errors are found, the automation proceeds to generate an error report; otherwise, it continues with connectivity testing.
   + **GetTestConnectionStatus:**

     Retrieves the connection status and error message for the AWS DMS endpoint using the `DescribeConnections` API. This step checks if a connection test has been performed and captures any failure messages.
   + **BranchOnTestConnectionStatusErrors:**

     Branches the automation based on connection test status errors. If errors are detected, the automation generates an error report; otherwise, it proceeds with replication instance analysis.
   + **DescribeReplicationInstance:**

     Retrieves network configuration details for the AWS DMS replication instance including Amazon VPC ID, subnet IDs, security group IDs, and identifies the associated Elastic Network Interface (ENI).
   + **ValidateResourcePermissions:**

     Validates that the execution role has necessary permissions to clean up resources that will be created during the automation process.
   + **CreateDNSResolverLambda:**

     Creates an AWS CloudFormation stack containing a Lambda function deployed within the same Amazon VPC as the replication instance. This function is used to resolve DNS names to private IP addresses from within the Amazon VPC context.
   + **DescribeCloudFormationErrorFromStackEvents:**

     If the CloudFormation stack creation fails, this step describes errors from the stack events to provide detailed failure information for troubleshooting.
   + **GetDNSResolverLambdaName:**

     Retrieves the name of the DNS resolver Lambda function from the CloudFormation stack outputs for use in subsequent steps.
   + **ResolveDmsEndpoint:**

     Invokes the Lambda function to resolve the AWS DMS endpoint hostname to its IP address from within the Amazon VPC. This ensures accurate private DNS resolution and validates IPv4 compatibility.
   + **BranchOnResolveDmsEndpointErrors:**

     Branches the automation based on DNS resolution errors. If the endpoint cannot be resolved or resolves to an IPv6 address, the automation generates an error report.
   + **GetReachabilityAnalyzerTarget:**

     Identifies the appropriate target for Reachability Analyzer based on Amazon VPC configuration and endpoint location. Determines whether to use an ENI (for same-Amazon VPC endpoints) or IP address (for external endpoints) as the target.
   + **GenerateErrors:**

     Creates a comprehensive error report when failures occur in previous steps. This includes details about endpoint validation errors, connection test failures, or DNS resolution issues with specific remediation guidance.
   + **GenerateReport:**

     Creates a comprehensive troubleshooting report containing connection status, network path analysis results using Reachability Analyzer, detailed explanations of connectivity barriers, and recommended actions for resolution.
   + **CheckStackExists:**

     Checks if the CloudFormation stack was successfully created and needs to be deleted during cleanup. This step ensures proper resource management regardless of automation success or failure.
   + **DeleteDNSResolverLambda:**

     Deletes the CloudFormation stack containing the DNS resolver Lambda function and associated resources (unless `PersistReachabilityAnalyzerResults` is set to `true`), ensuring no residual resources remain after automation completion.

1. After the execution completes, review the **Outputs** section for the detailed results:
   + **GetTestConnectionStatus.status**

     The current connection test status between the AWS DMS replication instance and endpoint (e.g., successful, failed, testing).
   + **DescribeCloudFormationErrorFromStackEvents.Events**

     If CloudFormation stack creation fails, this output contains detailed error events from the stack creation process to help diagnose infrastructure deployment issues.
   + **GenerateReport.report**

     A comprehensive troubleshooting report containing connection analysis results, Reachability Analyzer findings, network path analysis, specific connectivity barriers identified, and detailed remediation recommendations with links to relevant AWS documentation.
   + **GenerateErrors.report**

     If errors occur during the automation process, this output provides a detailed error report including specific failure reasons, affected resources, and guidance for resolving the issues before retrying the automation.
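
Before starting this runbook, the input values can be checked locally against the allowed patterns listed in the instructions. The sketch below assumes that the doubled backslashes in those patterns are string escaping, and that the Boolean parameter is passed as the string `true` or `false`, because Automation parameter values are supplied as lists of strings. The function name `build_endpoint_parameters` is hypothetical.

```python
import re

_PARTITION = r"^arn:(aws|aws-cn|aws-us-gov|aws-iso|aws-iso-b):dms:[a-z0-9-]+:\d{12}:"
ENDPOINT_ARN_PATTERN = re.compile(_PARTITION + r"endpoint:[A-Z0-9]{1,48}$")
INSTANCE_ARN_PATTERN = re.compile(_PARTITION + r"rep:[A-Z0-9]+$")
ACKNOWLEDGE_PATTERN = re.compile(r"^[Yy][Ee][Ss]$")


def build_endpoint_parameters(endpoint_arn, instance_arn, acknowledge,
                              persist_results=False, lambda_role_arn=""):
    """Validate inputs and build the Parameters mapping for the runbook."""
    if not ENDPOINT_ARN_PATTERN.match(endpoint_arn):
        raise ValueError("DmsEndpointArn does not match the allowed pattern")
    if not INSTANCE_ARN_PATTERN.match(instance_arn):
        raise ValueError("DmsReplicationInstanceArn does not match the allowed pattern")
    if not ACKNOWLEDGE_PATTERN.match(acknowledge):
        raise ValueError("Acknowledge must be 'yes' (case-insensitive)")
    parameters = {
        "DmsEndpointArn": [endpoint_arn],
        "DmsReplicationInstanceArn": [instance_arn],
        "Acknowledge": [acknowledge],
        "PersistReachabilityAnalyzerResults": ["true" if persist_results else "false"],
    }
    if lambda_role_arn:
        # Omitted when empty so that the automation creates the role itself.
        parameters["LambdaRoleArn"] = [lambda_role_arn]
    return parameters
```

The resulting mapping could then be passed as the `Parameters` argument when starting `AWSSupport-TroubleshootDMSEndpointConnection` through the Systems Manager API.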

 **References** 

Systems Manager Automation
+ [Run this Automation (console)](https://console.aws.amazon.com/systems-manager/documents/AWSSupport-TroubleshootDMSEndpointConnection/description)
+ [Run an automation](https://docs.aws.amazon.com//systems-manager/latest/userguide/automation-working-executing.html)
+ [Setting up an Automation](https://docs.aws.amazon.com//systems-manager/latest/userguide/automation-setup.html)
+ [Support Automation Workflows landing page](https://aws.amazon.com/premiumsupport/technology/saw/)