

# Known issue resolution
<a name="known-issue-resolution"></a>

When troubleshooting issues with deployments using Landing Zone Accelerator on AWS, understanding its core architectural components is key. The main interfaces for this solution are the configuration files and the Core pipeline. You can find any issues arising during deployments to your environment in these interfaces.

# Problem: Configuration file issue
<a name="problem-configuration-file-issue"></a>

## Resolution
<a name="resolution"></a>

The configuration files must follow the defined property conventions. For more details, refer to the [configuration reference](https://awslabs.github.io/landing-zone-accelerator-on-aws/latest/user-guide/config/) on our [GitHub Pages website](https://awslabs.github.io/landing-zone-accelerator-on-aws/). The **Build** stage of the pipeline performs type validation of the configuration files, so any deviation from these conventions causes the pipeline to fail at that stage.

# Problem: Configuration file not found issue
<a name="problem-configuration-file-not-found-issue"></a>

## Resolution
<a name="resolution-config-file-not-found"></a>

When S3 buckets are used for configuration files, you might receive the following error:

 `error | accelerator | ENOENT: no such file or directory, open '/codebuild/output/src437/src/s3/01/accounts-config.yaml'` 

Ensure that the configuration zip file uploaded to the S3 config bucket does not contain a top-level directory. Once the zip file has been unzipped, it should contain the solution configuration `yaml` files and other resource policy related folders at the root. Refer to the [Update the configuration files](step-3.-update-the-configuration-files.md) section for more information about the proper structure of the zip archive configuration files.
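One way to check the archive before uploading it is to list its entries and confirm that the configuration files sit at the root. The following is a minimal sketch, not part of the solution: the function name `config_at_root` and the archive name `aws-accelerator-config.zip` are illustrative assumptions, and the check looks only for `accounts-config.yaml`, the file named in the error above.

```bash
# Reads archive entry names (one per line) on stdin and succeeds only if
# accounts-config.yaml appears at the root of the archive, not in a folder.
config_at_root() {
  grep -qxF 'accounts-config.yaml'
}

# Example usage (the archive name is an assumption):
#   unzip -Z1 aws-accelerator-config.zip | config_at_root \
#     && echo "OK: configuration files are at the root" \
#     || echo "Problem: accounts-config.yaml is not at the root of the archive"
```

If the check fails, the listing from `unzip -Z1` typically shows every entry prefixed by the same folder name, which indicates that the folder itself was zipped rather than its contents.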

# Problem: Core pipeline failure
<a name="problem-core-pipeline-failure"></a>

## Resolution
<a name="resolution-1"></a>

To determine the cause of a deployment failure, use the following steps:

1. Sign in to the AWS Management Console and navigate to the **AWS CodePipeline** console. Select **AWSAccelerator-Pipeline** and find the pipeline stage that failed.

1. The pipeline stage has a CodeBuild project as an action provider. Select the **Details** link under the failed status indicator for the action, then choose **Link to execution details**. This opens the failed build of the CodeBuild project.

1. Select the **Build logs** tab. This shows the output of the CodeBuild project run. Scroll to the bottom of this output to find the error message. Some common examples are:
   + Misconfiguration or missing properties in the Landing Zone Accelerator on AWS configuration files. This causes the Core pipeline to fail in the **Build** stage. The configuration validator provides a specific error message indicating what caused the property validation to fail.
   + CloudFormation deployment error. This can occur in any stack. CloudFormation provides a specific error message indicating what caused the deployment failure.

**Note**  
Regardless of the failure that occurs, build logs will show the following error message at the end:  
 `[Container] Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: yarn run ts-node --transpile-only cdk.ts --require-approval never $CDK_OPTIONS --config-dir $CODEBUILD_SRC_DIR_Config --partition aws --app cdk.out. Reason: exit status 1`   
This is a generic error message that CodeBuild outputs when the CDK application fails to complete successfully. When troubleshooting deployment errors, the text before this error message indicates which resource(s) failed to deploy.
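The same investigation can be approximated with the AWS CLI. The sketch below assumes the pipeline name `AWSAccelerator-Pipeline` shown above and a locally saved copy of the build log; the function names and the file name `build.log` are illustrative, not part of the solution.

```bash
# Print the names of pipeline stages whose latest execution failed.
# $1: pipeline name (defaults to AWSAccelerator-Pipeline).
failed_stages() {
  aws codepipeline get-pipeline-state \
    --name "${1:-AWSAccelerator-Pipeline}" \
    --query "stageStates[?latestExecution.status=='Failed'].stageName" \
    --output text
}

# Print the generic CodeBuild error together with the lines that precede it,
# which is where the specific failure is reported.
# stdin: build log text; $1: number of preceding lines to show (default 20).
context_before_generic_error() {
  grep -B "${1:-20}" 'COMMAND_EXECUTION_ERROR'
}

# Example usage (build.log is a hypothetical local copy of the build log):
#   failed_stages
#   context_before_generic_error 40 < build.log
```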

# Problem: Account enrollment and environment validation failures
<a name="problem-account-enrollment-and-environment-validation-failures"></a>

When you enroll new or existing accounts in the solution, you can encounter [Core pipeline errors](problem-core-pipeline-failure.md) during the **Prepare** stage of the pipeline. Failures during this stage typically indicate an issue with enrolling the account into AWS Organizations or AWS Control Tower.

The following are potential errors you might see in **Prepare** stage build logs when enrolling accounts:

## General account enrollment failure
<a name="general-account-enrollment-failure"></a>

You might receive the following [Core pipeline error](problem-core-pipeline-failure.md) message when experiencing a general account enrollment failure:

 ` AWSAccelerator-PrepareStack | UPDATE_FAILED | Custom::CreateControlTowerAccounts | CreateCTAccounts/Resource/Default (CreateCTAccounts) Received response status [FAILED] from custom resource. Message returned: Account creation failed. Error: Accounts failed to enroll in Control Tower. Check Service Catalog Console` 

### Resolution
<a name="resolution-2"></a>

Complete the following steps when this error occurs:

1. Ensure that the prerequisites listed in [Adding an existing account](performing-administrator-tasks.md#adding-an-existing-account) are complete.

1. Sign in to the [Service Catalog](https://us-east-1.console.aws.amazon.com/servicecatalog) console from your Management account.

1. Select **Provisioned products** from the left-hand navigation pane.

1. Choose **Account** in the **Access Filter** drop-down menu.

1. The screen lists the reason provisioning failed. Select the Control Tower Account Factory product that failed provisioning. From the drop-down menu, select **Terminate**.

1. Sign in to the [AWS CloudFormation](https://us-east-1.console.aws.amazon.com/cloudformation) console.

1. Select the **Prepare** stack, which will be in the `ROLLBACK_FAILED` or `UPDATE_ROLLBACK_FAILED` state after the account enrollment failure.

1. Select **Continue update rollback** from the **Stack actions** dropdown menu. Choose **Advanced troubleshooting**. Select the resource with prefix `CreateCTAccounts*`, then choose **Continue update rollback**.

1. Await rollback completion.

1.  [Retry](https://docs.aws.amazon.com/codepipeline/latest/userguide/actions-retry.html) the **Prepare** stage of **AWSAccelerator-Pipeline**.
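The CloudFormation and CodePipeline steps of this procedure can also be scripted. The following is a sketch under stated assumptions: the function names are hypothetical, the stack name must be replaced with the name of the **Prepare** stack in your environment, and `continue-update-rollback` applies only to stacks in the `UPDATE_ROLLBACK_FAILED` state.

```bash
# Print logical IDs in a stack that start with a prefix (for example
# CreateCTAccounts). $1: stack name, $2: logical ID prefix.
resources_with_prefix() {
  aws cloudformation list-stack-resources \
    --stack-name "$1" \
    --query "StackResourceSummaries[?starts_with(LogicalResourceId, '$2')].LogicalResourceId" \
    --output text
}

# Continue the update rollback while skipping the given resources, then wait
# for the rollback to finish. $1: stack name; remaining args: logical IDs.
continue_rollback_skipping() {
  stack="$1"; shift
  aws cloudformation continue-update-rollback \
    --stack-name "$stack" \
    --resources-to-skip "$@"
  aws cloudformation wait stack-rollback-complete --stack-name "$stack"
}

# Retry the failed actions of a stage in its latest execution.
# $1: stage name; $2: pipeline name (defaults to AWSAccelerator-Pipeline).
retry_stage() {
  pipeline="${2:-AWSAccelerator-Pipeline}"
  execution_id=$(aws codepipeline get-pipeline-state --name "$pipeline" \
    --query "stageStates[?stageName=='$1'].latestExecution.pipelineExecutionId | [0]" \
    --output text)
  aws codepipeline retry-stage-execution \
    --pipeline-name "$pipeline" --stage-name "$1" \
    --pipeline-execution-id "$execution_id" --retry-mode FAILED_ACTIONS
}

# Example usage (the stack name is a placeholder for your Prepare stack):
#   stack='<prepare-stack-name>'
#   continue_rollback_skipping "$stack" $(resources_with_prefix "$stack" CreateCTAccounts)
#   retry_stage Prepare
```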

## Environment validation error
<a name="environment-validation-error"></a>

You might receive a [Core pipeline error](problem-core-pipeline-failure.md) message when experiencing an environment validation error. For example:

 ` AWSAccelerator-PrepareStack | UPDATE_FAILED | Custom::ValidateEnvironmentConfig | ValidateEnvironmentConfig/Resource/Default (ValidateEnvironmentConfig) Received response status [FAILED] from custom resource. Message returned: Error: AWS Control Tower has detected that the managed account <account_ID> has been removed from organization <organization_ID>. ` 

**Note**  
This error message might differ depending on the type of drift detected.

If you have made any changes to your account(s), OU(s), or managed SCPs outside of the AWS Control Tower console, the solution’s drift detection functionality likely caught these changes and caused this error. You can’t run the pipeline until you undo these changes or enroll the changed account(s) or OU(s) in AWS Control Tower.

### Resolution
<a name="resolution-3"></a>

Complete the following steps when this error occurs:

1. Ensure that all account(s), OU(s), and AWS Control Tower-managed SCPs are properly enrolled in Control Tower. For more information, see [Detect and resolve drift in AWS Control Tower](https://docs.aws.amazon.com/controltower/latest/userguide/drift.html) in the *AWS Control Tower User Guide*.

1. Sign in to the [Systems Manager Parameter Store console](https://us-east-1.console.aws.amazon.com/systems-manager/parameters) from your Management account.

1. Search for the parameter named `/accelerator/controlTower/driftDetected`.

1. If the value of this parameter is true, select **Edit** and change the parameter value to false.

1. Sign in to the [AWS CloudFormation console](https://us-east-1.console.aws.amazon.com/cloudformation).

1. Select the **Prepare** stack, which will be in the `ROLLBACK_FAILED` or `UPDATE_ROLLBACK_FAILED` state after the environment validation failure.

1. Select the **Stack actions** dropdown menu, then choose **Continue update rollback**. Select **Advanced troubleshooting**. Select the resource with prefix `ValidateEnvironmentConfig*`, then choose **Continue update rollback**.

1. Await rollback completion.

1.  [Retry](https://docs.aws.amazon.com/codepipeline/latest/userguide/actions-retry.html) the **Prepare** stage of **AWSAccelerator-Pipeline**.
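The Parameter Store steps can be performed from the AWS CLI as well. This is a minimal sketch that assumes credentials for the Management account and that the parameter stores the lowercase strings `true` and `false`; the function names are illustrative.

```bash
DRIFT_PARAM='/accelerator/controlTower/driftDetected'

# Print the current value of the drift flag.
drift_detected() {
  aws ssm get-parameter --name "$DRIFT_PARAM" \
    --query 'Parameter.Value' --output text
}

# Reset the drift flag to false, but only if it is currently true.
reset_drift_flag() {
  if [ "$(drift_detected)" = "true" ]; then
    aws ssm put-parameter --name "$DRIFT_PARAM" --value false --overwrite
  fi
}

# Example usage:
#   drift_detected
#   reset_drift_flag
```

Reset the flag only after the underlying drift has been resolved in AWS Control Tower; otherwise the next **Prepare** stage run is likely to detect the drift again.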

# Problem: Suspended account causing enrollment or environment validation failure
<a name="problem-suspended-account-causing-enrollment-or-environment-validation-failure"></a>

After you suspend accounts in your AWS organization, the solution's environment validation feature still attempts to enroll and validate these suspended accounts in the **Prepare** stage unless you place them in an ignored OU. You will receive a [Core pipeline error](problem-core-pipeline-failure.md) until you complete the following resolution steps.

## Resolution
<a name="resolution-4"></a>

Follow the steps in [Closing an account](performing-administrator-tasks.md#closing-an-account). The solution then ignores the suspended account.

### For AWS Control Tower-based environments
<a name="for-aws-control-tower-based-environments"></a>

If you run the Core pipeline before ignoring the account, the account might have a tainted Account Factory product associated with it. Use the following procedure to remove that resource:

1. Follow the steps in [Closing an account](performing-administrator-tasks.md#closing-an-account).

1. Sign in to the [Service Catalog console](https://us-east-1.console.aws.amazon.com/servicecatalog) from your Management account.

1. Select **Provisioned products** from the navigation menu.

1. Choose **Account** in the **Access Filter** drop-down menu.

1. Select the Control Tower Account Factory product that failed provisioning. From the drop-down menu, select **Terminate**.
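The Service Catalog steps have a rough AWS CLI equivalent. The sketch below assumes credentials for the Management account and that the failed Account Factory product reports a `TAINTED` or `ERROR` status; the function names are hypothetical.

```bash
# List provisioned products visible at the account level that are in a
# TAINTED or ERROR state, as tab-separated Id, Name, Status.
failed_provisioned_products() {
  aws servicecatalog search-provisioned-products \
    --access-level-filter Key=Account,Value=self \
    --query "ProvisionedProducts[?Status=='TAINTED' || Status=='ERROR'].[Id,Name,Status]" \
    --output text
}

# Terminate one provisioned product. $1: provisioned product ID.
terminate_provisioned_product() {
  aws servicecatalog terminate-provisioned-product --provisioned-product-id "$1"
}

# Example usage (review the list before terminating anything):
#   failed_provisioned_products
#   terminate_provisioned_product '<provisioned-product-id>'
```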

# Problem: "S3 bucket name already exists" error
<a name="problem-s3-bucket-name-already-exists-error"></a>

This solution creates Amazon S3 buckets during deployment. Some of these buckets (such as those deployed along with the [Centralized logging](centralized-logging.md) infrastructure) are mandatory. Others (such as the report destination buckets created for Cost and Usage Reports and AWS Audit Manager) deploy based on your defined configuration.

**Note**  
By default, Amazon S3 buckets deployed by CloudFormation have a [deletion policy](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-deletionpolicy.html) that’s set to retain the resources. Landing Zone Accelerator on AWS uses this default policy so that you can deactivate a service that the solution previously managed and still preserve your data stored in Amazon S3.

Scenarios that can cause this error include:

1. If you deactivate a solution-managed service and then reactivate it later.

1. If you uninstall the solution and then reinstall it later into the same environment.

These errors result from a standard naming convention for Amazon S3 buckets that this solution deploys. Because Amazon S3 bucket names must be globally unique, you receive an error message if the previous Amazon S3 buckets were not deleted. The following is an example, with `aws-accelerator-<SERVICE>-<ACCOUNT_ID>-<REGION>` representing the bucket name:

 `AWSAccelerator-<STACK_NAME>-<ACCOUNT_ID>-<REGION> failed: Error: The stack named AWSAccelerator-<STACK_NAME>-<ACCOUNT_ID>-<REGION> failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: aws-accelerator-<SERVICE>-<ACCOUNT_ID>-<REGION> already exists` 

## Resolution
<a name="resolution-5"></a>

Complete the following steps when this error occurs:

1. If you want to retain the data, make a local copy or [copy the data to another Amazon S3 bucket](https://aws.amazon.com/premiumsupport/knowledge-center/move-objects-s3-bucket/) in your account.

1. Delete the solution-created Amazon S3 bucket that’s causing the conflict.

1.  [Retry](https://docs.aws.amazon.com/codepipeline/latest/userguide/actions-retry.html) the failing **AWSAccelerator-Pipeline** stage.
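The copy and delete steps might look like the following from the AWS CLI. This is a sketch only: the function names and the backup bucket are hypothetical, and the deletion is destructive, so confirm the copy before you delete. Note that `aws s3 rb --force` does not remove object versions, so deleting a versioned bucket requires removing the versions first.

```bash
# Build a bucket name that follows the solution's naming convention.
# $1: service, $2: account ID, $3: Region.
accelerator_bucket_name() {
  printf 'aws-accelerator-%s-%s-%s\n' "$1" "$2" "$3"
}

# Copy every object to a backup bucket under a prefix named after the source
# bucket, then delete the conflicting bucket if the copy succeeded.
# $1: conflicting bucket, $2: existing backup bucket (hypothetical).
backup_and_delete_bucket() {
  aws s3 sync "s3://$1" "s3://$2/$1/" &&
    aws s3 rb "s3://$1" --force
}

# Example usage (all values are placeholders):
#   bucket=$(accelerator_bucket_name '<SERVICE>' '<ACCOUNT_ID>' '<REGION>')
#   backup_and_delete_bucket "$bucket" '<backup-bucket-name>'
```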

# Problem: "ValidationError: Stack <stack-name> cannot be deleted while TerminationProtection is enabled" error
<a name="problem-validationerror"></a>

Depending on your deployment, you might choose to remove an existing solution-provisioned CloudFormation stack. Solution-provisioned stacks have termination protection activated by default. If you attempt to delete a stack with termination protection activated, the deletion fails. The stack and its status remain unchanged. You might receive a [Core pipeline error](problem-core-pipeline-failure.md) message like the following, with `AWSAccelerator-<STACK_NAME>-<ACCOUNT_ID>-<REGION>` representing the stack name:

 `AWSAccelerator-<STACK_NAME>-<ACCOUNT_ID>-<REGION> failed: Error [ValidationError]: Stack [<STACK_NAME>-<ACCOUNT_ID>-<REGION>] cannot be deleted while TerminationProtection is enabled` 

## Resolution
<a name="resolution-6"></a>

## Option 1: Use the AWS Management Console
<a name="option-1-use-the-aws-management-console"></a>

1. Deactivate termination protection on the stack. For more information, see [Protecting a stack from being Deleted](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-protect-stacks.html) in the *AWS CloudFormation User Guide*.

1. Attempt deletion again.

## Option 2: Use the AWS Command Line Interface
<a name="option-2-use-the-aws-command-line-interface"></a>

1. Deactivate termination protection on the stack by running the [update-termination-protection](https://docs.aws.amazon.com/cli/latest/reference/cloudformation/update-termination-protection.html) command with the CLI:

   ```
   $ aws cloudformation update-termination-protection --stack-name <stack-name> --no-enable-termination-protection
   ```

1. Attempt deletion again.
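To confirm that termination protection is off and then retry the deletion from the CLI, a sketch such as the following could be used. The function names are illustrative, and the stack name is a placeholder for the stack you intend to remove.

```bash
# Print whether termination protection is currently enabled on a stack.
# $1: stack name.
termination_protection_enabled() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].EnableTerminationProtection' --output text
}

# Delete the stack and wait for the deletion to finish. $1: stack name.
delete_stack_and_wait() {
  aws cloudformation delete-stack --stack-name "$1" &&
    aws cloudformation wait stack-delete-complete --stack-name "$1"
}

# Example usage:
#   termination_protection_enabled '<stack-name>'
#   delete_stack_and_wait '<stack-name>'
```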

# Problem: GitHub personal access token expired
<a name="problem-github-personal-access-token-expired"></a>

The solution uses a GitHub personal access token to access the Landing Zone Accelerator on AWS code repository. If you have set an expiration date on the access token, the token’s privilege is revoked when it expires. You will see an error when trying to run the **Source** stage and action of the [Installer](deployment-pipelines.md#awsaccelerator-installerstack) or [Core](deployment-pipelines.md#awsaccelerator-pipelinestack) pipeline. For example:

 `Could not access the GitHub repository: "landing-zone-accelerator-on-aws". The access token might be invalid or has been revoked. Edit the pipeline to reconnect with GitHub.` 

## Resolution
<a name="resolution-7"></a>

## Option 1 (releases as of version 1.3.1): Use the Landing Zone Accelerator automated GitHub token update functionality
<a name="option-1-releases-as-of-version-1.3.1-use-the-landing-zone-accelerator-automated-github-token-update-functionality"></a>

1.  [Create a new GitHub personal access token](prerequisites.md#create-a-github-personal-access-token-and-store-in-secrets-manager) and update the secret value in AWS Secrets Manager.
**Note**  
Saving the updated value in Secrets Manager will invoke the `UpdatePipelineGithubToken` Lambda function, which automates the process of updating the GitHub Personal Access Token in CodePipeline.

1. Retry the failed **Source** stage of the affected pipeline.
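Updating the secret value can also be done with the AWS CLI. The sketch below is hedged on two assumptions: that your secret is named `accelerator/github-token` (use the name you chose during the prerequisites if it differs), and that the new token is stored in a local file with no trailing newline, because the CLI reads the file contents as-is. The function name is hypothetical.

```bash
# Store a new token value in the existing secret.
# $1: path to a local file containing only the new token
#     (for example, written with printf '%s' so there is no trailing newline);
# $2: secret name (defaults to accelerator/github-token, an assumption).
update_github_token_secret() {
  aws secretsmanager put-secret-value \
    --secret-id "${2:-accelerator/github-token}" \
    --secret-string "file://$1"
}

# Example usage (token.txt is a hypothetical file; delete it afterwards):
#   update_github_token_secret token.txt
```

Reading the token from a file rather than passing it as a literal argument keeps it out of your shell history.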

## Option 2 (releases before version 1.3.1): Update access token in AWS CodePipeline
<a name="option-2-releases-before-version-1.3.1-update-access-token-in-aws-codepipeline"></a>

1. Create a new GitHub personal access token and update the pipeline structure with the new token. For step-by-step instructions, see [Configure authentication](https://docs.aws.amazon.com/codepipeline/latest/userguide/appendix-github-oauth.html#action-reference-GitHub) (GitHub version 1 source actions) in the *AWS CodePipeline User Guide*.

1. Retry the failed **Source** stage of the affected pipeline.

# Problem: Couldn’t find or create service linked role
<a name="problem-could-not-find-or-create-service-linked-role"></a>

We updated this solution to make service linked role creation idempotent. When you create a new resource, the solution checks for an existing service linked role. If none exists, the solution creates one. During cleanup, the `AWS::IAM::ServiceLinkedRole` resource might have been removed successfully, which can cause this error.

Example event from CodeBuild:

 `AWSAccelerator-OrganizationsStack-<account>-<region> | …​ | DELETE_IN_PROGRESS | AWS::IAM::ServiceLinkedRole | FirewallManagerServiceLinkedRole` 

 `AWSAccelerator-OrganizationsStack-<account>-<region> | …​ | DELETE_COMPLETE | AWS::IAM::ServiceLinkedRole | FirewallManagerServiceLinkedRole` 

## Resolution
<a name="resolution-8"></a>

 [Manually](https://docs.aws.amazon.com/codepipeline/latest/userguide/pipelines-rerun-manually.html) release the pipeline again. The service linked role check runs on every pipeline execution. If no service linked role exists, the solution creates a new one in the account.
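From the AWS CLI, a manual release corresponds to starting a new pipeline execution. A minimal sketch, assuming the Core pipeline is named `AWSAccelerator-Pipeline`; the function name is illustrative.

```bash
# Start a new execution of the pipeline and print its execution ID.
# $1: pipeline name (defaults to AWSAccelerator-Pipeline).
release_pipeline() {
  aws codepipeline start-pipeline-execution \
    --name "${1:-AWSAccelerator-Pipeline}" \
    --query 'pipelineExecutionId' --output text
}

# Example usage:
#   release_pipeline
```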

# Problem: "The 'link' command was removed" error
<a name="problem-the-link-command-was-removed-error"></a>

We updated this solution to the newest version of `lerna`, which has removed the `link` command. Multiple CodePipeline stages use the `link` command as part of the build process.

## Resolution
<a name="resolution-9"></a>

Starting with v1.5.0, the installer template no longer uses this command. Upgrades after v1.5.0 require an update to the installer template to capture all new changes. Follow the [Update the solution](update-the-solution.md) steps to update the installer template. The pipeline then runs again, which resolves the error.

# Problem: "AWSCloudFormationStackSetExecutionRole already exists" error
<a name="problem-the-awscloudformationstacksetexecutionrole-already-exists-error"></a>

When creating AWS CloudFormation [StackSets](https://awslabs.github.io/landing-zone-accelerator-on-aws/latest/typedocs/latest/classes/_aws_accelerator_config.CloudFormationStackSetConfig.html) using Landing Zone Accelerator on AWS, the solution attempts to create IAM roles required for deploying StackSets with [self-managed permissions](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stacksets-prereqs-self-managed.html#prereqs-self-managed-permissions). Specifically, the two required roles are:
+  **AWSCloudFormationStackSetAdministrationRole** - This role is deployed to the Management account.
+  **AWSCloudFormationStackSetExecutionRole** - This role is deployed to all accounts.

When deploying Landing Zone Accelerator on AWS to an environment where these roles already exist, the pipeline will fail with the `AWSCloudFormationStackSetAdministrationRole already exists` or `AWSCloudFormationStackSetExecutionRole already exists` error.

## Resolution
<a name="resolution-10"></a>

1.  [Delete](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage_delete.html#roles-managingrole-deleting-console) the `AWSCloudFormationStackSetAdministrationRole` IAM role from the Management account.

1. Delete the `AWSCloudFormationStackSetExecutionRole` IAM role from all accounts.

1. Retry the failed pipeline stage.
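IAM refuses to delete a role that still has policies attached, so a scripted version of the deletion steps needs to detach them first. The following sketch assumes you run it once per account with credentials for that account, and that the roles have no instance profiles; the function name is hypothetical.

```bash
# Delete an IAM role after detaching its managed policies and removing its
# inline policies. $1: role name.
delete_role_completely() {
  role="$1"
  for arn in $(aws iam list-attached-role-policies --role-name "$role" \
      --query 'AttachedPolicies[].PolicyArn' --output text); do
    aws iam detach-role-policy --role-name "$role" --policy-arn "$arn"
  done
  for name in $(aws iam list-role-policies --role-name "$role" \
      --query 'PolicyNames[]' --output text); do
    aws iam delete-role-policy --role-name "$role" --policy-name "$name"
  done
  aws iam delete-role --role-name "$role"
}

# Example usage:
#   delete_role_completely AWSCloudFormationStackSetAdministrationRole  # Management account
#   delete_role_completely AWSCloudFormationStackSetExecutionRole       # each account
```

Before deleting either role, confirm that no existing StackSets outside the solution depend on it.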

# Problem: AWS::CodeStarNotifications::NotificationRule fails to create during deployment
<a name="problem-notification-rule-fails-to-exist-error"></a>

When deploying a solution that uses the `AWS::CodeStarNotifications::NotificationRule` resource, the resource creation may fail with the error "Invalid request provided: AWS::CodeStarNotifications::NotificationRule". The `AWS::CodeStarNotifications::NotificationRule` resource is attempting to create a notification rule, but the necessary service role has not been fully created yet. This is due to a timing issue, where the resource is trying to create the rule before the service role is ready.

## Resolution
<a name="resolution-notification-rule"></a>

If you encounter this issue, wait approximately 15 minutes for the service role to be fully created, then manually retry the failed resource creation. The `AWS::CodeStarNotifications::NotificationRule` resource should then be able to create the notification rule successfully.