

# Monitoring with Elastic Disaster Recovery

## Logging AWS Elastic Disaster Recovery API calls using AWS CloudTrail


AWS Elastic Disaster Recovery is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in AWS Elastic Disaster Recovery. CloudTrail captures all API calls for AWS Elastic Disaster Recovery as events. The calls captured include calls from the AWS Elastic Disaster Recovery console and code calls to the AWS Elastic Disaster Recovery API operations. If you create a trail, you can activate continuous delivery of CloudTrail events to an Amazon S3 bucket, including events for AWS Elastic Disaster Recovery. If you don't configure a trail, you can still view the most recent events in the CloudTrail console in **Event history**. Using the information collected by CloudTrail, you can determine the request that was made to AWS Elastic Disaster Recovery, the IP address from which the request was made, who made the request, when it was made, and additional details. 

To learn more about CloudTrail, see the [AWS CloudTrail User Guide](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html).

### AWS Elastic Disaster Recovery information in CloudTrail


CloudTrail is activated on your AWS account when you create the account. When activity occurs in AWS Elastic Disaster Recovery, that activity is recorded in a CloudTrail event along with other AWS service events in **Event history**. You can view, search, and download recent events in your AWS account. For more information, see [Viewing events with CloudTrail Event history](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events.html).
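As a rough sketch of searching **Event history** programmatically, the following Python function filters recent events by the `drs.amazonaws.com` event source. It assumes that a boto3 CloudTrail client is passed in by the caller; the function name `recent_drs_events` and its `max_results` parameter are illustrative and are not part of any AWS API.

```python
def recent_drs_events(cloudtrail_client, max_results=20):
    """Return (event name, event time, user name) tuples for recent DRS events.

    `cloudtrail_client` is assumed to be a boto3 CloudTrail client, for
    example boto3.client("cloudtrail"). The function name is hypothetical.
    """
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            # Only return events emitted by Elastic Disaster Recovery.
            {"AttributeKey": "EventSource", "AttributeValue": "drs.amazonaws.com"}
        ],
        MaxResults=max_results,
    )
    return [
        (event.get("EventName"), event.get("EventTime"), event.get("Username"))
        for event in response.get("Events", [])
    ]
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and leaves Region and credential selection to the caller.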

For an ongoing record of events in your AWS account, including events for AWS Elastic Disaster Recovery, create a trail. A *trail* enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a trail in the console, the trail applies to all AWS Regions. The trail logs events from all Regions in the AWS partition and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can configure other AWS services to further analyze and act upon the event data collected in CloudTrail logs. For more information, see the following: 
+  [Overview for creating a trail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-create-and-update-a-trail.html) 
+  [CloudTrail supported services and integrations](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-aws-service-specific-topics.html) 
+  [Configuring Amazon SNS notifications for CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/configure-sns-notifications-for-cloudtrail.html) 
+  [Receiving CloudTrail log files from multiple regions](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/receive-cloudtrail-log-files-from-multiple-regions.html) and [Receiving CloudTrail log files from multiple accounts](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-receive-logs-from-multiple-accounts.html) 

All AWS Elastic Disaster Recovery actions are logged by CloudTrail and are documented in the AWS Elastic Disaster Recovery API. For example, calls to the `DescribeSourceServers` action generate entries in the CloudTrail log files.

Every event or log entry contains information about who generated the request. The identity information helps you determine the following: 
+ Whether the request was made with root or AWS Identity and Access Management (IAM) user credentials.
+ Whether the request was made with temporary security credentials for a role or federated user. 
+ Whether the request was made by another AWS service.

For more information, see the [CloudTrail userIdentity element](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference-user-identity.html).
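The identity checks listed above can be sketched as a small lookup on the `type` field of the `userIdentity` element. This is a minimal illustration that covers only the common `type` values described in the CloudTrail userIdentity reference; the function name `describe_identity` and the description strings are hypothetical.

```python
def describe_identity(user_identity):
    """Map a CloudTrail userIdentity element to a short description."""
    identity_type = user_identity.get("type", "Unknown")
    descriptions = {
        "Root": "root credentials",
        "IAMUser": "IAM user credentials",
        "AssumedRole": "temporary credentials for an assumed role",
        "FederatedUser": "temporary credentials for a federated user",
        "AWSService": "another AWS service",
    }
    # Fall back to reporting the raw type for values not covered here.
    return descriptions.get(identity_type, "other identity type: " + identity_type)
```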

### Understanding AWS Elastic Disaster Recovery log file entries


A trail is a configuration that allows delivery of events as log files to an Amazon S3 bucket that you specify. CloudTrail log files contain one or more log entries. An event represents a single request from any source and includes information about the requested action, the date and time of the action, request parameters, and so on. CloudTrail log files aren't an ordered stack trace of the public API calls, so they don't appear in any specific order. 

The following example shows a CloudTrail log entry that demonstrates the `DescribeSourceServers` action.

```
{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AAAAAAAAAAAAAAAAAAA",
        "arn": "arn:aws:sts::1234567890:assumed-role/Admin/user-Isengard",
        "accountId": "1234567890",
        "accessKeyId": "BBBBBBBBBBBBBBBBBBBB",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "AAAAAAAAAAAAAAAAAAA",
                "arn": "arn:aws:iam::1234567890:role/Admin",
                "accountId": "1234567890",
                "userName": "Admin"
            },
            "webIdFederationData": {},
            "attributes": {
                "creationDate": "2021-10-20T14:19:17Z",
                "mfaAuthenticated": "false"
            }
        }
    },
    "eventTime": "2021-10-20T14:19:59Z",
    "eventSource": "drs.amazonaws.com",
    "eventName": "DescribeSourceServers",
    "awsRegion": "eu-west-1",
    "sourceIPAddress": "54.240.197.234",
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36",
    "requestParameters": {
        "maxResults": 1000,
        "filters": {}
    },
    "responseElements": null,
    "requestID": "d7618669-db08-4b53-bf6e-8a2cd57a677d",
    "eventID": "436c17a7-3a54-4f4e-815d-4d980339744e",
    "readOnly": true,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "1234567890",
    "eventCategory": "Management"
}
```
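To illustrate how such an entry might be consumed, the following sketch extracts the request, the caller, the time, and the source IP address from a single log entry. The function name `summarize_entry` and the output keys are illustrative choices, and the sample is a trimmed copy of the entry shown above.

```python
import json


def summarize_entry(entry):
    """Pull the who/what/when/where fields out of one CloudTrail log entry."""
    identity = entry.get("userIdentity", {})
    return {
        "action": entry.get("eventName"),
        "when": entry.get("eventTime"),
        "source_ip": entry.get("sourceIPAddress"),
        "caller_arn": identity.get("arn"),
        "read_only": entry.get("readOnly"),
    }


# Trimmed copy of the DescribeSourceServers log entry.
raw_entry = """
{
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::1234567890:assumed-role/Admin/user-Isengard"
    },
    "eventTime": "2021-10-20T14:19:59Z",
    "eventSource": "drs.amazonaws.com",
    "eventName": "DescribeSourceServers",
    "sourceIPAddress": "54.240.197.234",
    "readOnly": true
}
"""
summary = summarize_entry(json.loads(raw_entry))
print(summary["action"], summary["when"], summary["source_ip"])
```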

# Amazon CloudWatch Metrics for DRS

The following are CloudWatch metrics for DRS:
+  **TotalSourceServerCount** - The number of source servers. 
+  **LagDuration** - The age of the latest consistent snapshot, in seconds. 
+  **Backlog** - The amount of data yet to be synced, in bytes. 
+  **DurationSinceLastSuccessfulRecoveryLaunch** - The time that has passed since the last drill or recovery instance launch, in seconds. 
+  **ElapsedReplicationDuration** - The cumulative time that the server has been replicating, in seconds. 
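As a hedged sketch of reading one of these metrics, the following Python helpers build the arguments for the CloudWatch `get_metric_statistics` call and reduce the returned datapoints to a single value. Note the assumptions: this page does not state the metric namespace or dimension names, so the `AWS/DRS` namespace and the `SourceServerID` dimension used here are guesses to verify in the CloudWatch console. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def lag_duration_request(source_server_id, namespace="AWS/DRS", minutes=60):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    ASSUMPTIONS: the "AWS/DRS" namespace and the "SourceServerID" dimension
    name are not confirmed by this page; verify both before relying on them.
    """
    end = datetime.now(timezone.utc)
    return {
        "Namespace": namespace,
        "MetricName": "LagDuration",
        "Dimensions": [{"Name": "SourceServerID", "Value": source_server_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,  # one datapoint per five minutes
        "Statistics": ["Maximum"],
    }


def max_lag_seconds(datapoints):
    """Return the largest Maximum value among datapoints, or None if empty."""
    values = [point["Maximum"] for point in datapoints]
    return max(values) if values else None
```

With a boto3 CloudWatch client named `cloudwatch`, the datapoints could be obtained with `cloudwatch.get_metric_statistics(**lag_duration_request(server_id))["Datapoints"]` and passed to `max_lag_seconds`.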

# Alarm events and EventBridge



## Sample events for Elastic Disaster Recovery

The following are sample events for Elastic Disaster Recovery:

### Source server data replication status

These events are triggered when a source server's data replication state changes between stalled (replication is not functioning properly) and not stalled (replication is functioning as expected).

 **STALLED** 

```
{
	"version": "0",
	"id": "9da9af57-9253-4406-87cb-7cc400e43465",
	"detail-type": "DRS Source Server Data Replication Stalled Change",
	"source": "aws.drs",
	"account": "111122223333",
	"time": "2016-08-22T20:12:19Z",
	"region": "us-west-2",
	"resources": [
		"arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
	],
	"detail": {
		"state": "STALLED"
	}
}
```

 **NOT\_STALLED** 

```
{
	"version": "0",
	"id": "9da9af57-9253-4406-87cb-7cc400e43465",
	"detail-type": "DRS Source Server Data Replication Stalled Change",
	"source": "aws.drs",
	"account": "111122223333",
	"time": "2016-08-22T20:12:19Z",
	"region": "us-west-2",
	"resources": [
		"arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
	],
	"detail": {
		"state": "NOT_STALLED"
	}
}
```
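A consumer of these events, such as an AWS Lambda function targeted by an EventBridge rule, might reduce each one to a server ID and a stalled flag. The following is a minimal sketch; the function name `replication_status` is hypothetical, and it assumes that the source server ARN is the first entry in `resources`, as in the samples above.

```python
STALLED_CHANGE = "DRS Source Server Data Replication Stalled Change"


def replication_status(event):
    """Return (source server ID, is_stalled) for a stalled-change event."""
    if event.get("detail-type") != STALLED_CHANGE:
        raise ValueError("not a data replication stalled change event")
    # The source server ID is the last path segment of the resource ARN.
    server_id = event["resources"][0].rsplit("/", 1)[-1]
    return server_id, event["detail"]["state"] == "STALLED"
```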



### Source server launch result

These events are triggered when a drill or recovery instance is launched for a source server and indicate whether the launch succeeded or failed. 

 **RECOVERY\_LAUNCH\_SUCCEEDED** 

```
{
    "version": "0",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "detail-type": "DRS Source Server Launch Result",
    "source": "aws.drs",
    "account": "111122223333",
    "time": "2016-08-22T20:12:19Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
    ],
    "detail": {
        "state": "RECOVERY_LAUNCH_SUCCEEDED",
        "job-id": "drsjob-04ca7d0d3fb6afa3e",
        "is-drill": "FALSE"
    }
}
```

 **RECOVERY\_LAUNCH\_FAILED** 

```
{
    "version": "0",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "detail-type": "DRS Source Server Launch Result",
    "source": "aws.drs",
    "account": "111122223333",
    "time": "2016-08-22T20:12:19Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
    ],
    "detail": {
        "state": "RECOVERY_LAUNCH_FAILED",
        "job-id": "drsjob-04ca7d0d3fb6afa3e",
        "is-drill": "FALSE"
    }
}
```
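One detail worth noting in these samples is that `is-drill` is a string, not a JSON boolean. The following sketch normalizes the `detail` object into Python types. It assumes that drill launches carry the string `"TRUE"`, which the samples above do not show; the function name `launch_result` is hypothetical.

```python
def launch_result(event):
    """Normalize the detail object of a launch result event."""
    detail = event["detail"]
    return {
        "succeeded": detail["state"] == "RECOVERY_LAUNCH_SUCCEEDED",
        "job_id": detail["job-id"],
        # ASSUMPTION: drills are reported as the string "TRUE".
        "is_drill": detail["is-drill"].upper() == "TRUE",
    }
```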

### Recovery instance failback State Change

These events are triggered as part of the failback process and indicate whether failback is in progress, completed, or failed.

 **FAILBACK\_IN\_PROGRESS** 

```
{
    "version": "0",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "detail-type": "DRS Recovery Instance Failback State Change",
    "source": "aws.drs",
    "account": "111122223333",
    "time": "2016-08-22T20:12:19Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:recovery-instance/ri-12345678901234567"
    ],
    "detail": {
        "state": "FAILBACK_IN_PROGRESS"
    }
}
```

 **FAILBACK\_COMPLETED** 

```
{
    "version": "0",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "detail-type": "DRS Recovery Instance Failback State Change",
    "source": "aws.drs",
    "account": "111122223333",
    "time": "2016-08-22T20:12:19Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:recovery-instance/ri-12345678901234567"
    ],
    "detail": {
        "state": "FAILBACK_COMPLETED"
    }
}
```

 **FAILBACK\_ERROR** 

```
{
    "version": "0",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "detail-type": "DRS Recovery Instance Failback State Change",
    "source": "aws.drs",
    "account": "111122223333",
    "time": "2016-08-22T20:12:19Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:recovery-instance/ri-12345678901234567"
    ],
    "detail": {
        "state": "FAILBACK_ERROR"
    }
}
```
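Because a failback produces several events over time, a consumer may want to act only once the process has ended. The following sketch treats `FAILBACK_COMPLETED` and `FAILBACK_ERROR` as the end states, based on the three sample states above; the names used are illustrative.

```python
FAILBACK_END_STATES = {"FAILBACK_COMPLETED", "FAILBACK_ERROR"}


def failback_finished(event):
    """Return True once a failback event reports a completed or failed state."""
    return event["detail"]["state"] in FAILBACK_END_STATES
```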

### PIT Snapshot Taken

This event is triggered whenever a point-in-time (PIT) snapshot is taken and includes the snapshot identifiers.

 **PIT Snapshot Taken** 

```
{
    "account": "111122223333",
    "detail": {
        "DrsSnapshotID": "112233",
        "EbsSnapshotIDs": "445566,778899"
    },
    "detail-type": "DRS PIT Snapshot Taken",
    "id": "9da9af57-9253-4406-87cb-7cc400e43465",
    "region": "us-west-2",
    "resources": [
        "arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
    ],
    "source": "aws.drs",
    "time": "2016-08-22T20:12:19Z",
    "version": "0"
}
```
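In this sample, `EbsSnapshotIDs` is a single comma-separated string, not a JSON array. The following sketch splits it into a list; it assumes that the comma-separated format holds for real events, and the function name `snapshot_ids` is hypothetical.

```python
def snapshot_ids(event):
    """Return (DRS snapshot ID, list of EBS snapshot IDs) from a PIT event."""
    detail = event["detail"]
    ebs_ids = [
        part.strip()
        for part in detail["EbsSnapshotIDs"].split(",")
        if part.strip()  # ignore empty segments such as a trailing comma
    ]
    return detail["DrsSnapshotID"], ebs_ids
```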

## Registering event rules


You can create EventBridge rules that capture events coming from your Elastic Disaster Recovery resources.

**Note**  
When you use the AWS Management Console to create an event rule, the console automatically adds the IAM permissions that EventBridge needs to call your chosen target type. If you create an event rule using the AWS CLI, you must grant these permissions explicitly. For more information, see [Event Patterns](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns.html) in the *Amazon EventBridge User Guide*.

**To create Amazon EventBridge rules**

1. Open the Amazon EventBridge console at [https://console.aws.amazon.com/events/](https://console.aws.amazon.com/events/).

1. Using the following values, create an EventBridge rule that captures events coming from Elastic Disaster Recovery resources: 
   + For **Rule type**, choose **Rule with an event pattern**. 
   + For **Event source**, choose **Other**. 
   + For **Event pattern**, choose **Custom patterns (JSON editor)**, and paste one of the following event pattern examples into the text area: 
     + To catch all Elastic Disaster Recovery events:

       ```
       {
       	"source": [
       		"aws.drs"
       	]
       }
       ```
     + To catch all Recovery instance failback state changes:

       ```
       {
       	"detail-type": [
       		"DRS Recovery Instance Failback State Change"
       	],
       	"source": [
       		"aws.drs"
       	]
       }
       ```
     + To catch all events relating to a given Source server:

       ```
       {
       	"source": [
       		"aws.drs"
       	],
       	"resources": [
       		"arn:aws:drs:us-west-2:111122223333:source-server/s-12345678901234567"
       	]
       }
       ```
   + For **Target types**, choose **AWS service**, and for **Select a target** choose your desired target. 

   For details about creating rules, see [Creating Amazon EventBridge rules that react to events](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule.html) in the *Amazon EventBridge User Guide*.
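The console procedure above can also be approximated in code. The following sketch creates a rule from the failback event pattern and attaches one target. It assumes that a boto3 EventBridge client is passed in; the function name `create_failback_rule` and the parameters `rule_name` and `target_arn` are illustrative. As with the AWS CLI, any permissions that the target requires must be granted separately.

```python
import json


def create_failback_rule(events_client, rule_name, target_arn):
    """Create a rule for DRS failback state changes and attach one target.

    `events_client` is assumed to be a boto3 EventBridge client, for example
    boto3.client("events"). `rule_name` and `target_arn` are caller-supplied.
    """
    pattern = {
        "detail-type": ["DRS Recovery Instance Failback State Change"],
        "source": ["aws.drs"],
    }
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),  # the API expects a JSON string
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": target_arn}],
    )
    return rule["RuleArn"]
```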