# Amazon EKS and Kubernetes Container Insights metrics
<a name="Container-Insights-metrics-EKS"></a>

The following table lists the metrics and dimensions that Container Insights collects for Amazon EKS and Kubernetes. These metrics are in the `ContainerInsights` namespace. For more information, see [Metrics](cloudwatch_concepts.md#Metric).

If you do not see any Container Insights metrics in the console, confirm that you have completed the setup of Container Insights. Metrics do not appear until Container Insights has been set up completely. For more information, see [Setting up Container Insights](deploy-container-insights.md).
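
One way to check whether metrics are arriving is to list what CloudWatch holds in the `ContainerInsights` namespace. The following is a minimal sketch, not an official procedure. It assumes `boto3` is installed and AWS credentials are configured; the helper names `summarize_metrics` and `list_container_insights_metrics` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

NAMESPACE = "ContainerInsights"


def summarize_metrics(pages):
    """Map each metric name to the set of dimension-name combinations seen.

    `pages` is an iterable of CloudWatch ListMetrics response pages.
    """
    summary = defaultdict(set)
    for page in pages:
        for metric in page.get("Metrics", []):
            dims = tuple(sorted(d["Name"] for d in metric.get("Dimensions", [])))
            summary[metric["MetricName"]].add(dims)
    return dict(summary)


def list_container_insights_metrics(cluster_name):
    """Query CloudWatch for metrics that carry the given ClusterName value.

    Requires boto3 and AWS credentials; boto3 is imported lazily so that
    summarize_metrics can be used without it.
    """
    import boto3

    paginator = boto3.client("cloudwatch").get_paginator("list_metrics")
    pages = paginator.paginate(
        Namespace=NAMESPACE,
        Dimensions=[{"Name": "ClusterName", "Value": cluster_name}],
    )
    return summarize_metrics(pages)
```

If `list_container_insights_metrics("my-cluster")` returns an empty dictionary (the cluster name here is a placeholder), no metrics have been published for that cluster yet, which may indicate that setup is incomplete.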


| Metric name | Dimensions | Description | 
| --- | --- | --- | 
| `cluster_failed_node_count` | `ClusterName` | The number of failed worker nodes in the cluster. A node is considered failed if it has any problem with its *node conditions*. For more information, see [Conditions](https://kubernetes.io/docs/concepts/architecture/nodes/#condition) in the Kubernetes documentation. | 
| `cluster_node_count` | `ClusterName` | The total number of worker nodes in the cluster. | 
| `namespace_number_of_running_pods` | `Namespace`, `ClusterName`<br />`ClusterName` | The number of pods running per namespace in the resource that is specified by the dimensions that you are using. | 
| `node_cpu_limit` | `ClusterName`  | The maximum number of CPU units that can be assigned to a single node in this cluster. | 
| `node_cpu_reserved_capacity` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The percentage of CPU units that are reserved for node components, such as kubelet, kube-proxy, and Docker.<br />Formula: `node_cpu_request / node_cpu_limit`<br />`node_cpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `node_cpu_usage_total` | `ClusterName` | The number of CPU units being used on the nodes in the cluster. | 
| `node_cpu_utilization` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The total percentage of CPU units being used on the nodes in the cluster.<br />Formula: `node_cpu_usage_total / node_cpu_limit` | 
| `node_gpu_limit` | `ClusterName`<br />`ClusterName`, `InstanceId`, `NodeName` | The total number of GPUs available on the node. | 
| `node_gpu_usage_total` | `ClusterName`<br />`ClusterName`, `InstanceId`, `NodeName` | The number of GPUs being used by the running pods on the node. | 
| `node_gpu_reserved_capacity` | `ClusterName`<br />`ClusterName`, `InstanceId`, `NodeName` | The percentage of GPUs currently reserved on the node.<br />Formula: `node_gpu_request / node_gpu_limit`<br />`node_gpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `node_filesystem_utilization` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The total percentage of file system capacity being used on the nodes in the cluster.<br />Formula: `node_filesystem_usage / node_filesystem_capacity`<br />`node_filesystem_usage` and `node_filesystem_capacity` are not reported directly as metrics, but are fields in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `node_memory_limit` | `ClusterName` | The maximum amount of memory, in bytes, that can be assigned to a single node in this cluster. | 
| `node_memory_reserved_capacity` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The percentage of memory that is currently reserved on the nodes in the cluster.<br />Formula: `node_memory_request / node_memory_limit`<br />`node_memory_request` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `node_memory_utilization` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The percentage of memory currently being used by the node or nodes. It is the node memory usage divided by the node memory limit.<br />Formula: `node_memory_working_set / node_memory_limit`  | 
| `node_memory_working_set` | `ClusterName`  | The amount of memory, in bytes, being used in the working set of the nodes in the cluster. | 
| `node_network_total_bytes` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The total number of bytes transmitted and received over the network per node in the cluster.<br />Formula: `node_network_rx_bytes + node_network_tx_bytes`<br />`node_network_rx_bytes` and `node_network_tx_bytes` are not reported directly as metrics, but are fields in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `node_number_of_running_containers` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The number of running containers per node in the cluster. | 
| `node_number_of_running_pods` | `NodeName`, `ClusterName`, `InstanceId`<br />`ClusterName` | The number of running pods per node in the cluster. | 
| `pod_cpu_reserved_capacity` | `PodName`, `Namespace`, `ClusterName`<br />`ClusterName` | The CPU capacity that is reserved per pod in the cluster.<br />Formula: `pod_cpu_request / node_cpu_limit`<br />`pod_cpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `pod_cpu_utilization` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The percentage of CPU units being used by pods.<br />Formula: `pod_cpu_usage_total / node_cpu_limit` | 
| `pod_cpu_utilization_over_pod_limit` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The percentage of CPU units being used by pods relative to the pod limit.<br />Formula: `pod_cpu_usage_total / pod_cpu_limit` | 
| `pod_gpu_request` | `ClusterName`<br />`ClusterName`, `Namespace`, `PodName`<br />`ClusterName`, `FullPodName`, `Namespace`, `PodName` | The GPU requests for the pod. This value must always equal `pod_gpu_limit`. | 
| `pod_gpu_limit` | `ClusterName`<br />`ClusterName`, `Namespace`, `PodName`<br />`ClusterName`, `FullPodName`, `Namespace`, `PodName` | The maximum number of GPUs that can be assigned to a pod in a node. | 
| `pod_gpu_usage_total` | `ClusterName`<br />`ClusterName`, `Namespace`, `PodName`<br />`ClusterName`, `FullPodName`, `Namespace`, `PodName` | The number of GPUs allocated to the pod. | 
| `pod_gpu_reserved_capacity` | `ClusterName`<br />`ClusterName`, `Namespace`, `PodName`<br />`ClusterName`, `FullPodName`, `Namespace`, `PodName` | The percentage of GPUs currently reserved for the pod.<br />Formula: `pod_gpu_request / node_gpu_reserved_capacity` | 
| `pod_memory_reserved_capacity` | `PodName`, `Namespace`, `ClusterName`<br />`ClusterName` | The percentage of memory that is reserved for pods.<br />Formula: `pod_memory_request / node_memory_limit`<br />`pod_memory_request` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `pod_memory_utilization` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The percentage of memory currently being used by the pod or pods.<br />Formula: `pod_memory_working_set / node_memory_limit` | 
| `pod_memory_utilization_over_pod_limit` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The percentage of memory being used by pods relative to the pod limit. If any container in the pod does not have a memory limit defined, this metric does not appear.<br />Formula: `pod_memory_working_set / pod_memory_limit` | 
| `pod_network_rx_bytes` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The number of bytes being received over the network by the pod.<br />Formula: `sum(pod_interface_network_rx_bytes)`<br />`pod_interface_network_rx_bytes` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `pod_network_tx_bytes` | `PodName`, `Namespace`, `ClusterName`<br />`Namespace`, `ClusterName`<br />`Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The number of bytes being transmitted over the network by the pod.<br />Formula: `sum(pod_interface_network_tx_bytes)`<br />`pod_interface_network_tx_bytes` is not reported directly as a metric, but is a field in performance log events. For more information, see [Relevant fields in performance log events for Amazon EKS and Kubernetes](Container-Insights-reference-performance-entries-EKS.md). | 
| `pod_number_of_container_restarts` | `PodName`, `Namespace`, `ClusterName` | The total number of container restarts in a pod. | 
| `service_number_of_running_pods` | `Service`, `Namespace`, `ClusterName`<br />`ClusterName` | The number of pods running the service or services in the cluster. | 
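
Several metrics in the table are derived from fields in performance log events. The following sketch applies a few of the node-level formulas to a single event represented as a dictionary. The field names come from the table; the sample values are invented for illustration, the helper names are hypothetical, and multiplying by 100 assumes that percentage metrics are expressed on a 0 to 100 scale.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if not denominator:
        return None
    return 100.0 * numerator / denominator


def derive_node_metrics(event):
    """Apply the node-level formulas from the table to one log event (a dict)."""
    return {
        # node_cpu_usage_total / node_cpu_limit
        "node_cpu_utilization": percent(
            event["node_cpu_usage_total"], event["node_cpu_limit"]
        ),
        # node_cpu_request / node_cpu_limit
        "node_cpu_reserved_capacity": percent(
            event["node_cpu_request"], event["node_cpu_limit"]
        ),
        # node_memory_working_set / node_memory_limit
        "node_memory_utilization": percent(
            event["node_memory_working_set"], event["node_memory_limit"]
        ),
        # node_network_rx_bytes + node_network_tx_bytes
        "node_network_total_bytes": (
            event["node_network_rx_bytes"] + event["node_network_tx_bytes"]
        ),
    }


# Invented sample values, for illustration only.
sample_event = {
    "node_cpu_usage_total": 500,
    "node_cpu_request": 1000,
    "node_cpu_limit": 2000,
    "node_memory_working_set": 1_000_000_000,
    "node_memory_limit": 4_000_000_000,
    "node_network_rx_bytes": 100,
    "node_network_tx_bytes": 200,
}

derived = derive_node_metrics(sample_event)
```

Returning `None` for a zero or missing limit avoids a division error; how the agent itself handles that case is not specified here.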

## Kueue metrics
<a name="Container-Insights-metrics-Kueue"></a>

Beginning with version `v2.4.0-eksbuild.1` of the CloudWatch Observability EKS add-on, Container Insights for Amazon EKS supports collecting Kueue metrics from Amazon EKS clusters. For more information about the add-on, see [Install the CloudWatch agent with the Amazon CloudWatch Observability EKS add-on or the Helm chart](install-CloudWatch-Observability-EKS-addon.md).

For information about how to enable these metrics, see [Enable Kueue metrics](install-CloudWatch-Observability-EKS-addon.md#enable-Kueue-metrics).

The Kueue metrics that are collected are listed in the following table. These metrics are published in the `ContainerInsights/Prometheus` namespace in CloudWatch. Some of these metrics use the following dimensions:
+ `ClusterQueue` is the name of the ClusterQueue.
+ `Status` has the possible values `active` and `inadmissible`.
+ `Reason` has the possible values `Preempted`, `PodsReadyTimeout`, `AdmissionCheck`, `ClusterQueueStopped`, and `InactiveWorkload`.
+ `Flavor` is the referenced flavor.
+ `Resource` refers to cluster compute resources, such as `cpu`, `memory`, and `gpu`.


| Metric name | Dimensions | Description | 
| --- | --- | --- | 
| `kueue_pending_workloads` | `ClusterName`, `ClusterQueue`, `Status`<br />`ClusterName`, `ClusterQueue`<br />`ClusterName`, `Status`<br />`ClusterName` | The number of pending workloads. | 
| `kueue_evicted_workloads_total` | `ClusterName`, `ClusterQueue`, `Reason`<br />`ClusterName`, `ClusterQueue`<br />`ClusterName`, `Reason`<br />`ClusterName` | The total number of evicted workloads. | 
| `kueue_admitted_active_workloads` | `ClusterName`, `ClusterQueue`<br />`ClusterName` | The number of admitted workloads that are active (not suspended and not finished). | 
| `kueue_cluster_queue_resource_usage` | `ClusterName`, `ClusterQueue`, `Resource`, `Flavor`<br />`ClusterName`, `ClusterQueue`, `Resource`<br />`ClusterName`, `ClusterQueue`, `Flavor`<br />`ClusterName`, `ClusterQueue`<br />`ClusterName` | Reports the total resource usage of the ClusterQueue. | 
| `kueue_cluster_queue_nominal_quota` | `ClusterName`, `ClusterQueue`, `Resource`, `Flavor`<br />`ClusterName`, `ClusterQueue`, `Resource`<br />`ClusterName`, `ClusterQueue`, `Flavor`<br />`ClusterName`, `ClusterQueue`<br />`ClusterName` | Reports the resource quota of the ClusterQueue. |
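
To retrieve one of these metrics, the namespace and the exact dimension combination must both match what was published. The following sketch builds the arguments for a CloudWatch `GetMetricStatistics` request for `kueue_pending_workloads` with the `ClusterName`, `ClusterQueue`, `Status` combination. The function name, the 5-minute period, the `Average` statistic, and the example cluster and queue names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

KUEUE_NAMESPACE = "ContainerInsights/Prometheus"
STATUS_VALUES = {"active", "inadmissible"}  # values listed for the Status dimension


def pending_workloads_query(cluster_name, cluster_queue, status, end_time, minutes=60):
    """Build keyword arguments for a GetMetricStatistics request."""
    if status not in STATUS_VALUES:
        raise ValueError(f"unexpected Status value: {status!r}")
    return {
        "Namespace": KUEUE_NAMESPACE,
        "MetricName": "kueue_pending_workloads",
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster_name},
            {"Name": "ClusterQueue", "Value": cluster_queue},
            {"Name": "Status", "Value": status},
        ],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": 300,  # seconds; an assumed aggregation period
        "Statistics": ["Average"],
    }


# Placeholder cluster and queue names.
query = pending_workloads_query(
    "my-cluster", "team-a-queue", "active",
    end_time=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

With `boto3` installed and credentials configured, the dictionary can be passed as `boto3.client("cloudwatch").get_metric_statistics(**query)`.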