

# PERF02-BP03 Collect compute-related metrics
<a name="perf_compute_hardware_collect_compute_related_metrics"></a>

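 Record and track compute-related metrics to better understand how your compute resources are performing and to improve their performance and utilization.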

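 **Common anti-patterns:**
+  You only search log files manually to find metrics.
+  You only use the default metrics recorded by your monitoring software.
+  You only review metrics when there is a problem.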

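 **Benefits of establishing this best practice:** Collecting performance-related metrics helps you align application performance with business requirements so that workload needs are met. It also helps you continually improve resource performance and utilization in your workload.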

 **Level of risk exposed if this best practice is not established:** High

## Implementation guidance
<a name="implementation-guidance"></a>

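 Cloud workloads generate large volumes of data, such as metrics, logs, and events. In the AWS Cloud, collecting metrics is a crucial step toward improving security, cost efficiency, performance, and sustainability. AWS provides a wide range of performance-related metrics through monitoring services such as [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/), giving you valuable insights. Metrics such as CPU utilization, memory utilization, disk I/O, and inbound and outbound network traffic help you understand utilization levels and identify performance bottlenecks. Use these metrics as part of a data-driven approach to proactively tune and optimize your workload's resources. Ideally, collect all metrics related to your compute resources on a single platform, and implement retention policies that support your cost and operational goals.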
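
 To make the data-driven approach concrete, the following is a minimal Python sketch, not an AWS-provided tool. It summarizes a series of utilization samples (for example, CPU utilization percentages exported from your monitoring platform) and labels the resource as constrained, underutilized, or within range. The function names `summarize` and `classify` and the 20% and 80% thresholds are hypothetical and should be replaced with values that reflect your workload.

```python
from statistics import mean, quantiles


def summarize(samples):
    """Return the average, 95th percentile, and maximum of utilization samples (percent)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a percentile")
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(samples, n=20)[-1]
    return {"avg": mean(samples), "p95": p95, "max": max(samples)}


def classify(summary, low=20.0, high=80.0):
    """Label a resource using hypothetical thresholds (assumptions, not AWS guidance)."""
    if summary["p95"] >= high:
        return "constrained"      # sustained high utilization suggests a bottleneck
    if summary["max"] < low:
        return "underutilized"    # even the peak is low; the resource may be oversized
    return "ok"


if __name__ == "__main__":
    cpu_samples = [35.0, 42.5, 38.0, 55.0, 61.0, 47.5, 40.0, 44.0]
    stats = summarize(cpu_samples)
    print(stats, classify(stats))
```

 Using a high percentile rather than the average for the upper bound keeps short bursts from being hidden by long idle periods, while checking the maximum for the lower bound avoids downsizing a resource that still sees occasional peaks.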

## Implementation steps
<a name="implementation-steps"></a>
+  Identify which performance-related metrics are relevant to your workload. Collect metrics on resource utilization and on how your cloud workload is operating (for example, response time and throughput).
  +  [Amazon EC2 default metrics](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html) 
  +  [Amazon ECS default metrics](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html) 
  +  [Amazon EKS default metrics](https://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/kubernetes-eks-metrics.html) 
  +  [Lambda default metrics](https://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-access-metrics.html) 
  +  [Amazon EC2 memory and disk metrics](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/mon-scripts.html) 
+  Choose and set up a suitable logging and monitoring solution for your workload.
  +  [AWS native Observability](https://catalog.workshops.aws/observability/en-US/aws-native) 
  +  [AWS Distro for OpenTelemetry](https://aws.amazon.com/otel/) 
  +  [Amazon Managed Service for Prometheus](https://docs.aws.amazon.com/grafana/latest/userguide/prometheus-data-source.html) 
+  Define the filtering and aggregation your metrics need based on your workload requirements.
  +  [Quantify custom application metrics with Amazon CloudWatch Logs and metric filters](https://aws.amazon.com/blogs/mt/quantify-custom-application-metrics-with-amazon-cloudwatch-logs-and-metric-filters/) 
  +  [Collect custom metrics with Amazon CloudWatch strategic tagging](https://aws.amazon.com/blogs/infrastructure-and-automation/collect-custom-metrics-with-amazon-cloudwatch-strategic-tagging/) 
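+  Configure data retention policies for your metrics that match your security and operational goals.
  +  [Default data retention for CloudWatch metrics](https://aws.amazon.com/cloudwatch/faqs/#AWS_resource_.26_custom_metrics_monitoring) 
  +  [Default data retention for CloudWatch Logs](https://aws.amazon.com/cloudwatch/faqs/#Log_management) 
+  If needed, create alarms and notifications for your metrics to help you proactively respond to performance-related issues.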
  +  [Create alarms for custom metrics using Amazon CloudWatch anomaly detection](https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/create-alarms-for-custom-metrics-using-amazon-cloudwatch-anomaly-detection.html) 
  +  [Create metrics and alarms for specific web pages with Amazon CloudWatch RUM](https://aws.amazon.com/blogs/mt/create-metrics-and-alarms-for-specific-web-pages-amazon-cloudwatch-rum/) 
+  Use automation to deploy your metric and log aggregation agents.
  +  [AWS Systems Manager automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html?ref=wellarchitected) 
  +  [OpenTelemetry Collector](https://aws-otel.github.io/docs/getting-started/collector) 
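
 As a rough illustration of the steps above, the following Python sketch builds two artifacts as plain dictionaries: a CloudWatch agent configuration that collects memory and disk usage (metrics that Amazon EC2 does not publish by default), and a set of parameters shaped for the CloudWatch `PutMetricAlarm` API. The helper names `build_agent_config` and `build_memory_alarm`, the alarm name, the 80% threshold, and the five-minute period are assumptions made for illustration. The sketch does not call AWS. You would write the configuration to a file for your automation tooling to distribute, and pass the alarm parameters to an SDK client such as boto3's `put_metric_alarm`.

```python
import json


def build_agent_config(interval_seconds=60, namespace="CWAgent"):
    """Return a CloudWatch agent configuration that collects memory and disk usage."""
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            "namespace": namespace,
            # Tag every metric with the instance ID so it can be filtered per instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            },
        },
    }


def build_memory_alarm(instance_id, threshold_percent=80.0,
                       namespace="CWAgent", sns_topic_arn=None):
    """Return keyword arguments shaped for the CloudWatch PutMetricAlarm API.

    Assumes the memory metric is published with only the InstanceId dimension.
    The name, threshold, and periods are illustrative choices, not AWS defaults.
    """
    params = {
        "AlarmName": f"high-memory-{instance_id}",
        "Namespace": namespace,
        "MetricName": "mem_used_percent",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # evaluate five-minute averages
        "EvaluationPeriods": 3,     # require three consecutive breaches
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires (the ARN is supplied by the caller).
        params["AlarmActions"] = [sns_topic_arn]
    return params


if __name__ == "__main__":
    print(json.dumps(build_agent_config(), indent=2))
    print(json.dumps(build_memory_alarm("i-0123456789abcdef0"), indent=2))
```

 Keeping both definitions in code means they can be reviewed and versioned alongside the workload, and the same configuration can be rolled out to every instance by whichever automation mechanism you choose.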

## Resources
<a name="resources"></a>

 **Related documents:**
+  [Monitoring and observability](https://aws.amazon.com/cloudops/monitoring-and-observability/) 
+  [Best practices: implementing observability with AWS](https://aws.amazon.com/blogs/mt/best-practices-implementing-observability-with-aws/) 
+  [Amazon CloudWatch documentation](https://docs.aws.amazon.com/cloudwatch/index.html?ref=wellarchitected) 
+  [Collect metrics and logs from Amazon EC2 instances and on-premises servers with the CloudWatch agent](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html?ref=wellarchitected) 
+  [Accessing Amazon CloudWatch Logs for AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-logs.html?ref=wellarchitected) 
+  [Using CloudWatch Logs with container instances](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_cloudwatch_logs.html?ref=wellarchitected) 
+  [Publish custom metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html?ref=wellarchitected) 
+  [AWS Answers: Centralized Logging](https://aws.amazon.com/answers/logging/centralized-logging/?ref=wellarchitected) 
+  [AWS services that publish CloudWatch metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CW_Support_For_AWS.html?ref=wellarchitected) 
+  [Monitoring Amazon EKS on AWS Fargate](https://aws.amazon.com/blogs/containers/monitoring-amazon-eks-on-aws-fargate-using-prometheus-and-grafana/) 

 **Related videos:**
+  [AWS re:Invent 2023 – [LAUNCH] Application monitoring for modern workloads](https://www.youtube.com/watch?v=T2TovTLje8w) 
+  [AWS re:Invent 2023 – Implementing application observability](https://www.youtube.com/watch?v=IcTcwUSwIs4) 
+  [AWS re:Invent 2023 – Building an effective observability strategy](https://www.youtube.com/watch?v=7PQv9eYCJW8) 
+  [AWS re:Invent 2023 – Seamless observability with AWS Distro for OpenTelemetry](https://www.youtube.com/watch?v=S4GfA2R0N_A) 
+  [Application Performance Management on AWS](https://www.youtube.com/watch?v=5T4stR-HFas&ref=wellarchitected) 

 **Related examples:**
+  [AWS for Linux Workloads Immersion Day - Amazon CloudWatch](https://catalog.us-east-1.prod.workshops.aws/workshops/a8e9c6a6-0ba9-48a7-a90d-378a440ab8ba/en-US/300-cloudwatch) 
+  [Monitoring Amazon ECS clusters and containers](https://ecsworkshop.com/monitoring/) 
+  [Monitoring with Amazon CloudWatch dashboards](https://catalog.workshops.aws/well-architected-performance-efficiency/en-US/3-monitoring/monitoring-with-cloudwatch-dashboards) 
+  [Amazon EKS workshop](https://www.eksworkshop.com/) 