

本文属于机器翻译版本。若本译文内容与英语原文存在差异，则一律以英文原文为准。

# 提醒和监控
<a name="sap-hana-pacemaker-sles-ops-alert-monitor"></a>

本节涵盖以下主题。

**Topics**
+ [使用 Amazon CloudWatch 应用程序见解](#_using_amazon_cloudwatch_application_insights)
+ [使用集群警报代理](#_using_the_cluster_alert_agents)

## 使用 Amazon CloudWatch 应用程序见解
<a name="_using_amazon_cloudwatch_application_insights"></a>

为了监控和查看集群状态及操作，Application Insights 提供了多个指标用于监控入队复制状态、集群指标以及 SAP 和高可用性检查。EFS 和 CPU 监控等其他指标也有助于进行根本原因分析。

有关更多信息，请参阅[亚马逊 CloudWatch 应用程序见解入门和亚马逊](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/appinsights-getting-started.html)[上的 SAP HANA 高可用性 EC2](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/component-configuration-examples-hana-ha.html)。

## 使用集群警报代理
<a name="_using_the_cluster_alert_agents"></a>

在集群配置中，您可以调用外部程序（警报代理）来处理警报。这是一条*推送*通知。它通过环境变量传递有关事件的信息。

然后，您可以配置代理来发送电子邮件、记录到文件、更新监控系统等。例如，以下脚本可用于访问 Amazon SNS。

```
#!/bin/sh

# alert_sns.sh
# modified from /usr/share/pacemaker/alerts/alert_smtp.sh.sample

##############################################################################
# SETUP
# * Create an SNS Topic and subscribe email or chatbot
# * Note down the ARN for the SNS topic
# * Give the IAM Role attached to both Instances permission to publish to the SNS Topic
# * Ensure the aws cli is installed
# * Copy this file to /usr/share/pacemaker/alerts/alert_sns.sh or other location on BOTH nodes
# * Ensure the permissions allow for hacluster and root to execute the script
# * Run the following as root (modify file location if necessary and replace SNS ARN):
#
# SLES:
# crm configure alert aws_sns_alert /usr/share/pacemaker/alerts/alert_sns.sh meta timeout=30s timestamp-format="%Y-%m-%d_%H:%M:%S" to <{ arn:aws:sns:region:account-id:myPacemakerAlerts  }>
#
# RHEL:
# pcs alert create id=aws_sns_alert path=/usr/share/pacemaker/alerts/alert_sns.sh meta timeout=30s timestamp-format="%Y-%m-%d_%H:%M:%S"
# pcs alert recipient add aws_sns_alert value=arn:aws:sns:region:account-id:myPacemakerAlerts
##############################################################################

# Additional information to send with the alerts
node_name=`uname -n`
sns_body=`env | grep CRM_alert_`

# Required for SNS
TOKEN=$(/usr/bin/curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Get metadata
REGION=$(/usr/bin/curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}')

sns_subscription_arn=${CRM_alert_recipient}

# Format depending on alert type
case ${CRM_alert_kind} in
   node)
     sns_subject="${CRM_alert_timestamp} ${cluster_name}: Node '${CRM_alert_node}' is now '${CRM_alert_desc}'"
   ;;
   fencing)
     sns_subject="${CRM_alert_timestamp} ${cluster_name}: Fencing ${CRM_alert_desc}"
   ;;
   resource)
     if [ ${CRM_alert_interval} = "0" ]; then
         CRM_alert_interval=""
     else
         CRM_alert_interval=" (${CRM_alert_interval})"
     fi
     if [ ${CRM_alert_target_rc} = "0" ]; then
         CRM_alert_target_rc=""
     else
         CRM_alert_target_rc=" (target: ${CRM_alert_target_rc})"
     fi
     case ${CRM_alert_desc} in
         Cancelled)
           ;;
         *)
           sns_subject="${CRM_alert_timestamp}: Resource operation '${CRM_alert_task}${CRM_alert_interval}' for '${CRM_alert_rsc}' on '${CRM_alert_node}': ${CRM_alert_desc}${CRM_alert_target_rc}"
           ;;
     esac
     ;;
   attribute)
     sns_subject="${CRM_alert_timestamp}: The '${CRM_alert_attribute_name}' attribute of the '${CRM_alert_node}' node was updated in '${CRM_alert_attribute_value}'"
     ;;
   *)
     sns_subject="${CRM_alert_timestamp}: Unhandled $CRM_alert_kind alert"
     ;;
esac

# Use this information to send the email
aws sns publish --topic-arn "${sns_subscription_arn}" --subject "${sns_subject}" --message "${sns_body}" --region ${REGION}
```