

# MirrorMaker 2.0 (MM2)
<a name="mirrormaker-2.0-mm2"></a>

**Note**  
This whitepaper is only meant to be used as a reference for deploying MirrorMaker 2.0 (MM2) on Kafka Connect to migrate data between multiple Amazon MSK clusters. AWS does not offer any support for it.

 MirrorMaker 2.0 (MM2) is a multi-cluster data replication engine based on the Kafka Connect framework. MM2 is a combination of an Apache Kafka source connector and a sink connector. You can use a single MM2 cluster to migrate data between multiple clusters. MM2 automatically detects new topics and partitions, while also ensuring the topic configurations are synced between clusters. 

 MM2 is available as part of the Apache Kafka 2.4.0 release and supersedes all capabilities of MM1. AWS recommends using MM2. 

## MM2 remote topics
<a name="mm2-remote-topics"></a>

 Replication policies help consumers perform migrations between self-managed Apache Kafka clusters and Amazon MSK clusters with no downtime. MM2 comes with a default replication policy that can be customized by providing a custom replication policy class. 

 Remote topics in MM2 are replicated topics on the target cluster. These topics reference the source cluster using a naming convention as shown in the following figure. 

![\[Diagram showing Remote topics\]](http://docs.aws.amazon.com/whitepapers/latest/amazon-msk-migration-guide/images/remote-topics.png)


 For the default replication policy, *topicA* is the source topic. *Source* is a source cluster alias used in the configuration properties file and the remote topic name is *source.topicA*. This distinguishes between a replicated remote topic and a topic created on the destination or target cluster. 

 MM2 ensures ordering in partitions when it replicates the topic from source cluster to the remote topic in the destination cluster. The following diagram shows an example of remote topics created on source and destination cluster when bi-directional replication is configured in MM2. 

## MM2 internal topics
<a name="mm2-internal-topics"></a>

 MM2 uses the following internal topics for replication purposes: 

 **Heartbeat topic**: Emitted from the source cluster and replicated to demonstrate connectivity through connectors. This can be used by downstream consumers to verify that the connector is running and the corresponding source cluster is available. Messages in this topic contain information on the source cluster, target cluster, and timestamp when the heartbeat was created. 

 **Checkpoint topic**: Emitted in the target cluster by the connector and contains consumer offsets for each consumer group in the source cluster. The connector will periodically query the source cluster for all committed offsets of consumer groups (except for replicated topics) and emit a message to this topic. Information in this message includes the consumer group id, topic, partition, upstream offset, downstream offset, metadata, and timestamp. Consumers use the checkpoint topic via [https://amazonmsk-labs.workshop.aws/en/migration/mirrormaker2/migrationsteps.html](https://amazonmsk-labs.workshop.aws/en/migration/mirrormaker2/migrationsteps.html) or [https://amazonmsk-labs.workshop.aws/en/migration/mirrormaker2/migrationsteps.html](https://amazonmsk-labs.workshop.aws/en/migration/mirrormaker2/migrationsteps.html) class to get the replicated offsets in the target Apache Kafka cluster. 

 **Offset sync topics:** Encodes cluster-to-cluster offset mapping for each replicated topic-partition. Messages in this topic contain topic, partition, upstream offset, and downstream offset. 

## MM2 Connectors
<a name="mm2-connectors"></a>

 MM2 has the following connectors for enabling complex flows between multiple Apache Kafka clusters and across data centers via existing Kafka Connect clusters. 
+  **`MirrorSourceConnector`** replicates a set of topics from a single source cluster into the primary cluster. 
+  **`MirrorCheckpointConnector`** emits consumer offset checkpoints and syncs the offset with `__consumer_offsets` (as of Apache Kafka 2.7.0). 
+  **`MirrorHeartbeatConnector`** emits heartbeats. 

## MM2 Deployment Methods
<a name="mm2-deployment-methods"></a>

 MirrorMaker 2.0 can be deployed in several modes: 
+  [As a dedicated MirrorMaker 2.0 cluster](#deploying-a-dedicated-mm2-cluster) 
+  [As connectors in a distributed Kafka Connect cluster](#deploying-mm2-on-a-kafka-connect-cluster) 
+ [In legacy mode](#in-legacy-mode) 

## Deploying a dedicated MM2 cluster
<a name="deploying-a-dedicated-mm2-cluster"></a>

The *connect-mirror-maker.sh* script sets up the `MirrorSourceConnector`, `MirrorCheckpointConnector`, and `MirrorHeartbeatConnector` connectors based on the provided MM2 properties file. See the following example file. 

```
#Apache Kafka clusters
clusters = #comma separated list of Apache Kafka cluster aliases
source.bootstrap.servers = apache-kafka-source:9092
target.bootstrap.servers = apache-kafka-target:9092

source -> target.enabled = true

#Source and target cluster configuration
source.config.storage.replication.factor = 1
target.config.storage.replication.factor = 1
source.offset.storage.replication.factor = 1
target.offset.storage.replication.factor = 1

#Mirror Maker configuration
offset-sync.topic.replication.factor = 1
heartbeat.topic.replication.factor = 1
checkpoint.topic.replication.factor = 1

topics = .*
groups = .*

tasks.max = 1
replication.factor = 1
refresh.topics.enabled = true
sync.topic.configs.enabled = true

#Enable heartbeats and checkpoints
source->target.emit.heartbeats.enabled = true
source->target.emit.checkpoints.enabled = true
```

 Ensure MM2 has successfully connected to the source and target clusters using the `kafka-topics.sh` script to list and compare the topics created on both clusters. See the following example. 

```
#Source Topics
./kafka-topics.sh --bootstrap-server <source bootstrap string> --list
__consumer_offsets
heartbeats
mm2-configs.target.internal
mm2-offset-syncs.target.internal
mm2-offsets.target.internal
mm2-status.target.internal

#Target Topics
./kafka-topics.sh --bootstrap-server <target bootstrap string> --list
__consumer_offsets
heartbeats
mm2-configs.source.internal
mm2-offsets.source.internal
mm2-status.source.internal
source.checkpoints.internal
source.heartbeats
```

## Deploying MM2 on a Kafka Connect cluster
<a name="deploying-mm2-on-a-kafka-connect-cluster"></a>

 Customers that already have an existing Kafka Connect cluster running on EC2 instances or containers can run MM2 on the same cluster by starting the `MirrorSourceConnector`, `MirrorCheckpointConnector`, and `MirrorHeartbeatConnector` connectors. 

 The AWS provided self-service [Amazon MSK workshop](https://amazonmsk-labs.workshop.aws/en/migration/mirrormaker2/setupmm2.html) details all of the necessary steps required to set up MM2 on a Kafka Connect cluster. See the following figure for an overview. 

![\[Diagram showing MM2 on Kafka Connect\]](http://docs.aws.amazon.com/whitepapers/latest/amazon-msk-migration-guide/images/mm2-on-kafka-connect.png)


## In legacy mode
<a name="in-legacy-mode"></a>

 After legacy MirrorMaker is deprecated, the existing `./bin/kafka-mirror-maker.sh` scripts will be updated to run MM2 in legacy mode: 

```
$ ./bin/kafka-mirror-maker.sh --consumer consumer.properties 
--producer producer.properties
```