

# Migrate from non-MSK Apache Kafka clusters to Amazon MSK Express brokers
<a name="msk-replicator-migrate-external"></a>

You can use MSK Replicator to migrate Apache Kafka workloads from self-managed environments to Amazon MSK Provisioned clusters with Express brokers. MSK Replicator supports data migration from Kafka deployments (Kafka version 2.8.1 or later) that have SASL/SCRAM authentication enabled.

**Note**  
SASL/SCRAM authentication is required only for MSK Replicator to connect to your self-managed Kafka cluster. Your client applications can continue using their existing authentication mechanisms.

**Prerequisites**  
Before you begin, ensure you have the following:

1. A source Apache Kafka cluster running version 2.8.1 or later

1. SASL/SCRAM authentication enabled on the source cluster

1. SSL encryption configured on the source cluster

1. Network connectivity between the source cluster and AWS through AWS Site-to-Site VPN or AWS Direct Connect

1. VPC subnets configured for Secrets Manager access

For detailed instructions, see [Set up prerequisites for MSK Replicator with self-managed Apache Kafka clusters](msk-replicator-external-prereqs.md).

**Step 1: Create an Amazon MSK Express cluster**  
Create an MSK Provisioned cluster with Express brokers and IAM authentication enabled. The cluster needs a minimum of three brokers across three Availability Zones. See [Prepare the target cluster](msk-replicator-prepare-clusters.md#msk-replicator-prepare-target).

**Step 2: Create an IAM execution role**  
Create an IAM role and attach the `AWSMSKReplicatorExecutionRole` and `AWSSecretsManagerClientReadOnlyAccess` managed policies to it. Configure the role's trust policy so that the `kafka.amazonaws.com` service principal can assume the role. See [Set up prerequisites for MSK Replicator with self-managed Apache Kafka clusters](msk-replicator-external-prereqs.md).
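As a rough illustration of the trust policy described in this step, the following Python sketch builds a standard IAM trust policy document for the `kafka.amazonaws.com` service principal. The helper name `build_trust_policy` is hypothetical, and the prerequisites page may call for additional statements or conditions that are not shown here.

```python
import json


def build_trust_policy(service_principal: str = "kafka.amazonaws.com") -> dict:
    """Return an IAM trust policy that lets the given service assume the role.

    Assumption: a single Allow statement with sts:AssumeRole is sufficient.
    Check the prerequisites page for any extra conditions it recommends.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the role's trust relationship.
    print(json.dumps(build_trust_policy(), indent=2))
```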

**Step 3: Configure SASL/SCRAM and SSL on self-managed cluster**  
Create a dedicated SCRAM user with the required ACL permissions, and configure SSL certificates on the self-managed cluster. See [Set up prerequisites for MSK Replicator with self-managed Apache Kafka clusters](msk-replicator-external-prereqs.md).

**Step 4: Store credentials in AWS Secrets Manager**  
Create a secret that contains the `username`, `password`, and `certificate` key-value pairs. See [Set up prerequisites for MSK Replicator with self-managed Apache Kafka clusters](msk-replicator-external-prereqs.md).
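The secret value is a JSON object with the three keys named in this step. The following Python sketch shows one way to assemble that JSON string. The helper name `build_secret_string` and the sample values are hypothetical, and the expected certificate format is not specified here, so treat the `certificate` value as an assumption and confirm it against the prerequisites page.

```python
import json


def build_secret_string(username: str, password: str, certificate: str) -> str:
    """Return the JSON secret value with username, password, and certificate keys.

    Assumption: all three values are non-empty strings. The certificate
    format expected by MSK Replicator is not verified by this sketch.
    """
    fields = {
        "username": username,
        "password": password,
        "certificate": certificate,
    }
    # Fail early if a required key-value pair is empty.
    missing = [key for key, value in fields.items() if not value]
    if missing:
        raise ValueError(f"Missing secret fields: {', '.join(missing)}")
    return json.dumps(fields)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_secret_string("replicator-user", "example-password", "example-certificate"))
```

The resulting string could then be supplied as the secret value when you create the secret, for example through the Secrets Manager console or the `SecretString` parameter of the `CreateSecret` API.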

**Step 5: Create the Replicator**  
Use the `CreateReplicator` API with the `EARLIEST` starting position, identical topic name replication, and `synchroniseConsumerGroupOffsets` set to `true`. If you plan to set up bidirectional replication for rollback capability (Step 6), also set `consumerGroupOffsetSyncMode` to `ENHANCED` on both the forward and reverse Replicators. Allow approximately 30 minutes for the Replicator to reach the `RUNNING` status. See [CreateReplicator API examples for self-managed Kafka clusters](msk-replicator-external-api-examples.md).
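To make the settings in this step concrete, the following Python sketch assembles only the replication-settings portion of a request. The field paths are an assumption modeled on the `CreateReplicator` JSON shape for MSK-to-MSK replication, and the helper name and the `.*` default patterns are hypothetical. Cluster definitions, the execution role, and `consumerGroupOffsetSyncMode` are intentionally left out because this page does not describe where they belong in the request; the linked API examples are the authoritative reference.

```python
from typing import List, Optional


def build_replication_settings(
    topics: Optional[List[str]] = None,
    consumer_groups: Optional[List[str]] = None,
) -> dict:
    """Return a partial request fragment with the settings named in Step 5.

    Assumption: field names follow the MSK-to-MSK CreateReplicator JSON shape.
    This is not a complete request body.
    """
    return {
        "topicReplication": {
            # Regex patterns; ".*" (all topics) is an illustrative default.
            "topicsToReplicate": topics or [".*"],
            # Start from the earliest available offset on the source.
            "startingPosition": {"type": "EARLIEST"},
            # Keep topic names identical on the target cluster.
            "topicNameConfiguration": {"type": "IDENTICAL"},
        },
        "consumerGroupReplication": {
            "consumerGroupsToReplicate": consumer_groups or [".*"],
            # Note the British spelling used by the API field name.
            "synchroniseConsumerGroupOffsets": True,
        },
    }
```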

**Step 6: (Optional) Set up bidirectional replication**  
Create a reverse Replicator from the MSK Express cluster back to the self-managed cluster for rollback capabilities. See [CreateReplicator API examples for self-managed Kafka clusters](msk-replicator-external-api-examples.md).

**Step 7: Monitor replication progress**  
Monitor the following metrics:
+ `MessageLag` (should reach 0)
+ `ReplicationLatency`
+ `ConsumerGroupOffsetSyncFailure` (should be 0)
+ `ConsumerGroupCount`
+ `OffsetLag (MSK Cluster)` and `OffsetLag (Non-MSK Cluster)`

For more information, see [Monitor replication](msk-replicator-monitor.md).
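The two targets listed above (`MessageLag` at 0 and `ConsumerGroupOffsetSyncFailure` at 0) can be expressed as a simple readiness check. The following Python sketch assumes you have already collected the latest value of each metric into a dictionary, for example from Amazon CloudWatch. The function name `replication_caught_up` and the dictionary layout are hypothetical.

```python
from typing import Mapping


def replication_caught_up(metrics: Mapping[str, float]) -> bool:
    """Return True when the snapshot meets both targets from Step 7.

    Assumption: `metrics` maps metric names to their most recent values.
    A missing metric is treated as "not ready" rather than as zero.
    """
    required = ("MessageLag", "ConsumerGroupOffsetSyncFailure")
    if any(name not in metrics for name in required):
        return False
    return all(metrics[name] == 0 for name in required)


if __name__ == "__main__":
    snapshot = {"MessageLag": 0, "ConsumerGroupOffsetSyncFailure": 0}
    print(replication_caught_up(snapshot))
```

Treating a missing metric as not ready is a deliberately conservative choice, so that a gap in monitoring data does not look like a completed replication.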

**Step 8: Migrate applications**  
Follow these steps to migrate your applications:

1. Stop the producers that write to the self-managed cluster

1. Reconfigure the producers to connect to the MSK Express cluster with IAM authentication

1. Monitor `MessageLag` until it reaches 0

1. Stop the consumers on the self-managed cluster

1. Reconfigure the consumers to connect to the MSK Express cluster
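The wait on `MessageLag` in this procedure can be automated with a small polling loop. The following Python sketch is one possible approach: `get_message_lag` is a hypothetical callable that you supply to return the current `MessageLag` value, and the poll interval and timeout defaults are arbitrary assumptions rather than recommended values.

```python
import time
from typing import Callable


def wait_for_zero_lag(
    get_message_lag: Callable[[], float],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until MessageLag reaches 0. Return False if the timeout expires first."""
    deadline = clock() + timeout_seconds
    while True:
        if get_message_lag() == 0:
            return True  # Safe to move on to stopping the consumers.
        if clock() >= deadline:
            return False  # Lag did not drain in time; investigate before cutover.
        sleep(poll_seconds)


if __name__ == "__main__":
    # Simulated readings for illustration; replace with a real metric lookup.
    readings = iter([120, 15, 0])
    print(wait_for_zero_lag(lambda: next(readings), poll_seconds=0))
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without waiting in real time.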

**Step 9: (Optional) Roll back to self-managed cluster**  
If you configured bidirectional replication, you can reverse the migration steps to roll back to the self-managed cluster. Because the reverse Replicator (MSK Express → self-managed) keeps the self-managed cluster in sync, consumers can be redirected back without data loss.