

# Logging and monitoring
<a name="logging-monitoring"></a>

Monitoring is an important part of maintaining the reliability, availability, and performance of EMR Serverless applications and jobs. You should collect monitoring data from all of the parts of your EMR Serverless solutions so that multipoint failures can be debugged more easily if they occur.

**Topics**
+ [Storing logs](logging.md)
+ [Rotating logs](rotating-logs.md)
+ [Encrypting logs](jobs-log-encryption.md)
+ [Configure Apache Log4j2 properties for Amazon EMR Serverless](log4j2.md)
+ [Monitoring EMR Serverless](metrics.md)
+ [Automating EMR Serverless with Amazon EventBridge](using-eventbridge.md)

# Storing logs
<a name="logging"></a>

To monitor your job progress on EMR Serverless and troubleshoot job failures, choose how EMR Serverless stores and serves application logs. When you submit a job run, you can specify managed storage, Amazon S3, Amazon CloudWatch, or a combination of these as your logging options.

With CloudWatch, specify the log types and log locations that you want to use, or accept the default types and locations. For more information on CloudWatch logs, refer to [Logging for EMR Serverless with Amazon CloudWatch](#jobs-log-storage-cw). For managed storage and S3 logging, the following table lists the log locations and UI availability that you can expect if you choose [managed storage](#jobs-log-storage-managed-storage), [Amazon S3 buckets](#jobs-log-storage-s3-buckets), or both.


| Option | Event logs | Container logs | Application UI | 
| --- | --- | --- | --- | 
|  Managed storage  |  Stored in managed storage  |  Stored in managed storage  |  Supported  | 
|  Both managed storage and S3 bucket  |  Stored in both places  |  Stored in S3 bucket  |  Supported  | 
|  Amazon S3 bucket  |  Stored in S3 bucket  |  Stored in S3 bucket  |  Not supported¹  | 

¹ We suggest that you keep the **Managed storage** option selected. Otherwise, you can't use the built-in application UIs.

## Logging for EMR Serverless with managed storage
<a name="jobs-log-storage-managed-storage"></a>

By default, EMR Serverless stores application logs securely in Amazon EMR managed storage for a maximum of 30 days.

**Note**  
If you turn off the default option, Amazon EMR can't troubleshoot your jobs on your behalf. For example, you can't access the Spark UI from the EMR Serverless console.

To turn off this option from EMR Studio, deselect the **Allow AWS to retain logs for 30 days** check box in the **Additional settings** section of the **Submit job** page. 

To turn off this option from the AWS CLI, use the `managedPersistenceMonitoringConfiguration` configuration when you submit a job run.

```
{
    "monitoringConfiguration": {
        "managedPersistenceMonitoringConfiguration": {
            "enabled": false
        }
    }
}
```
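The override above is plain JSON that you pass to `start-job-run` through `--configuration-overrides`. As an illustrative sketch (the helper name is hypothetical, not part of the EMR Serverless API), you might assemble and serialize the same payload in Python before calling the AWS CLI:

```python
import json

def managed_persistence_override(enabled: bool) -> dict:
    """Build the configurationOverrides payload that toggles
    EMR Serverless managed log storage."""
    return {
        "monitoringConfiguration": {
            "managedPersistenceMonitoringConfiguration": {
                "enabled": enabled
            }
        }
    }

# Serialize for use with:
#   aws emr-serverless start-job-run --configuration-overrides '<this JSON>'
overrides_json = json.dumps(managed_persistence_override(False))
print(overrides_json)
```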

If your EMR Serverless application is in a private subnet with VPC endpoints for Amazon S3 and you attach an endpoint policy to control access, add the following permissions for EMR Serverless to store and serve application logs. Replace `Resource` with the `AppInfo` buckets from the available regions table in [Sample policies for private subnets that access Amazon S3](https://docs.aws.amazon.com/emr/latest/ManagementGuide/private-subnet-iampolicy.html#private-subnet-iampolicy-regions).

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EMRServerlessManagedLogging",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "arn:aws:s3:::prod.us-east-1.appinfo.src",
        "arn:aws:s3:::prod.us-east-1.appinfo.src/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:PrincipalServiceName": "emr-serverless.amazonaws.com",
          "aws:SourceVpc": "vpc-12345678"
        }
      }
    }
  ]
}
```

------

Additionally, use the `aws:SourceVpc` condition key to ensure that the request travels through the VPC that the VPC endpoint is attached to.

## Logging for EMR Serverless with Amazon S3 buckets
<a name="jobs-log-storage-s3-buckets"></a>

Before your jobs can send log data to Amazon S3, include the following permissions in the permissions policy for the job runtime role. Replace `amzn-s3-demo-logging-bucket` with the name of your logging bucket.

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ],
      "Sid": "AllowS3Putobject"
    }
  ]
}
```

------

To set up an Amazon S3 bucket to store logs from the AWS CLI, use the `s3MonitoringConfiguration` configuration when you start a job run. To do this, provide the following configuration in the `--configuration-overrides` parameter. 

```
{
    "monitoringConfiguration": {
        "s3MonitoringConfiguration": {
            "logUri": "s3://amzn-s3-demo-logging-bucket/logs/"
        }
    }
}
```

For batch jobs that don't have retries enabled, EMR Serverless sends the logs to the following path:

```
'/applications/<applicationId>/jobs/<jobId>'
```

EMR Serverless stores Spark driver logs in the following path:

```
'/applications/<applicationId>/jobs/<jobId>/SPARK_DRIVER/'
```

EMR Serverless stores Spark executor logs in the following path:

```
'/applications/<applicationId>/jobs/<jobId>/SPARK_EXECUTOR/<EXECUTOR-ID>'
```

The `<EXECUTOR-ID>` is an integer.

EMR Serverless releases 7.1.0 and higher support retry attempts for streaming jobs and batch jobs. If you run a job with retries enabled, EMR Serverless automatically adds an attempt number to the log path prefix, so you can better distinguish and track logs.

```
'/applications/<applicationId>/jobs/<jobId>/attempts/<attemptNumber>/'
```
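The path layout above can be summarized in a small helper. This is an illustrative sketch of the naming scheme described in this section; `s3_log_prefix` and its parameters are hypothetical names, not an EMR Serverless API:

```python
def s3_log_prefix(application_id, job_id, worker_dir=None, attempt=None):
    """Compose the S3 key prefix under logUri where EMR Serverless
    writes job logs, following the layout described above."""
    prefix = f"/applications/{application_id}/jobs/{job_id}"
    if attempt is not None:     # jobs with retries enabled (release 7.1.0+)
        prefix += f"/attempts/{attempt}"
    if worker_dir is not None:  # e.g. "SPARK_DRIVER" or "SPARK_EXECUTOR/1"
        prefix += f"/{worker_dir}"
    return prefix + "/"
```

For example, `s3_log_prefix("00f1", "j1", worker_dir="SPARK_DRIVER")` yields `/applications/00f1/jobs/j1/SPARK_DRIVER/`.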

## Logging for EMR Serverless with Amazon CloudWatch
<a name="jobs-log-storage-cw"></a>

When you submit a job to an EMR Serverless application, choose Amazon CloudWatch as an option to store your application logs. This allows you to use CloudWatch log analysis features such as CloudWatch Logs Insights and Live Tail. You can also stream logs from CloudWatch to other systems such as OpenSearch for further analysis.

EMR Serverless provides real-time logging for driver logs. You can access the logs in real time with the CloudWatch live tail capability, or through CloudWatch CLI tail commands.

By default, CloudWatch logging is disabled for EMR Serverless. To enable it, use the configuration in [AWS CLI](#jobs-log-storage-cw-cli).

**Note**  
Amazon CloudWatch publishes logs in real time, so it consumes additional resources on your workers. If you choose a low worker capacity, your job run time might increase. If you enable CloudWatch logging, we suggest that you choose a greater worker capacity. Log publication might also be throttled if you exceed the transactions per second (TPS) quota for `PutLogEvents`; this CloudWatch quota is shared by all services that publish to CloudWatch Logs, including EMR Serverless. For more information, refer to [How do I determine throttling in my CloudWatch logs?](https://repost.aws/knowledge-center/cloudwatch-logs-throttling) on *AWS re:Post*.

### Required permissions for logging with CloudWatch
<a name="jobs-log-storage-cw-permissions"></a>

Before your jobs can send log data to Amazon CloudWatch, include the following permissions in the permissions policy for the job runtime role.

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups"
      ],
      "Resource": [
        "arn:aws:logs:*:123456789012:*"
      ],
      "Sid": "AllowLOGSDescribeloggroups"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:PutLogEvents",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "arn:aws:logs:*:123456789012:log-group:my-log-group-name:*"
      ],
      "Sid": "AllowLOGSPutlogevents"
    }
  ]
}
```

------

### AWS CLI
<a name="jobs-log-storage-cw-cli"></a>

To set up Amazon CloudWatch to store logs for EMR Serverless from the AWS CLI, use the `cloudWatchLoggingConfiguration` configuration when you start a job run. To do this, provide the following configuration overrides. Optionally, also provide a log group name, log stream prefix name, log types, and an encryption key ARN.

If you don't specify the optional values, CloudWatch publishes the logs to the default log group `/aws/emr-serverless`, with the default log stream `/applications/<applicationId>/jobs/<jobId>/<worker-type>`.

EMR Serverless releases 7.1.0 and higher support retry attempts for streaming jobs and batch jobs. If you enabled retries for a job, EMR Serverless automatically adds an attempt number to the log path prefix, so you can better distinguish and track logs. 

```
'/applications/<applicationId>/jobs/<jobId>/attempts/<attemptNumber>/worker-type'
```

The following demonstrates the minimum configuration that is required to turn on Amazon CloudWatch logging with the default settings for EMR Serverless:

```
{
    "monitoringConfiguration": {
        "cloudWatchLoggingConfiguration": {
            "enabled": true
         }
     }
}
```

The following example shows all of the required and optional configurations that you can specify when you turn on Amazon CloudWatch logging for EMR Serverless. The supported `logTypes` values are listed after this example.

```
{
    "monitoringConfiguration": {
        "cloudWatchLoggingConfiguration": {
            "enabled": true, // Required
            "logGroupName": "Example_logGroup", // Optional
            "logStreamNamePrefix": "Example_logStream", // Optional 
            "encryptionKeyArn": "key-arn", // Optional 
            "logTypes": { 
                "SPARK_DRIVER": ["stdout", "stderr"] //List of values
             }
         }
     }
}
```

By default, EMR Serverless publishes only the driver `stdout` and `stderr` logs to CloudWatch. To publish other logs, specify the worker type and the corresponding log types with the `logTypes` field.

The following list shows the log types that you can specify for each supported worker type in the `logTypes` configuration:

**Spark**  
+ `SPARK_DRIVER : ["STDERR", "STDOUT"]`
+ `SPARK_EXECUTOR : ["STDERR", "STDOUT"]`

**Hive**  
+ `HIVE_DRIVER : ["STDERR", "STDOUT", "HIVE_LOG", "TEZ_AM"]`
+ `TEZ_TASK : ["STDERR", "STDOUT", "SYSTEM_LOGS"]`
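The mapping above can be expressed as a lookup table for validating a `logTypes` request before you submit a job. The table contents come from the list above; the helper itself is an illustrative sketch, not part of the EMR Serverless API:

```python
# Supported log types per worker type, as listed above.
SUPPORTED_LOG_TYPES = {
    "SPARK_DRIVER": {"STDERR", "STDOUT"},
    "SPARK_EXECUTOR": {"STDERR", "STDOUT"},
    "HIVE_DRIVER": {"STDERR", "STDOUT", "HIVE_LOG", "TEZ_AM"},
    "TEZ_TASK": {"STDERR", "STDOUT", "SYSTEM_LOGS"},
}

def validate_log_types(log_types: dict) -> list:
    """Return a list of (worker, log_type) pairs that are not supported."""
    invalid = []
    for worker, requested in log_types.items():
        allowed = SUPPORTED_LOG_TYPES.get(worker.upper(), set())
        invalid.extend((worker, t) for t in requested if t.upper() not in allowed)
    return invalid
```

An empty result means every requested log type is valid for its worker type.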

# Rotating logs
<a name="rotating-logs"></a>

Amazon EMR Serverless can rotate Spark application logs and event logs. Long-running jobs can generate large log files that consume all of your disk space; log rotation saves disk storage and reduces job failures caused by running out of disk space. 

Log rotation is enabled by default and is available only for Spark jobs.

**Spark event logs**

**Note**  
Spark event log rotation is available across all Amazon EMR release labels.

Instead of generating a single event log file, EMR Serverless rotates the event log at a regular time interval and removes the older event log files. Rotating logs doesn't affect the logs uploaded to the S3 bucket.

**Spark application logs**

**Note**  
Spark application log rotation is available across all Amazon EMR release labels.

EMR Serverless also rotates the Spark application logs for drivers and executors, such as the `stdout` and `stderr` files. You can access the latest log files through the Spark History Server and Live UI links in EMR Studio. These files are truncated versions of the latest logs. To access the older rotated logs, specify an Amazon S3 location for log storage. For more information, refer to [Logging for EMR Serverless with Amazon S3 buckets](https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/logging.html#jobs-log-storage-s3-buckets).

You can find the latest log files at the following location. EMR Serverless refreshes the files every 15 seconds. These files can range from 0 MB to 128 MB.

```
<example-S3-logUri>/applications/<application-id>/jobs/<job-id>/SPARK_DRIVER/stderr.gz
```

The following location contains the older rotated files. Each file is 128 MB.

```
<example-S3-logUri>/applications/<application-id>/jobs/<job-id>/SPARK_DRIVER/archived/stderr_<index>.gz 
```

The same behavior applies to Spark executors as well. This change is only applicable to S3 logging. Log rotation doesn't introduce any changes to log streams uploaded to Amazon CloudWatch.

EMR Serverless releases 7.1.0 and higher support retry attempts for streaming and batch jobs. If you enable retry attempts for a job, EMR Serverless adds the attempt number to the log path so you can better track and distinguish the logs. This path contains all rotated logs.

```
'/applications/<applicationId>/jobs/<jobId>/attempts/<attemptNumber>/'
```

# Encrypting logs
<a name="jobs-log-encryption"></a>

## Encrypting EMR Serverless logs with managed storage
<a name="jobs-log-encryption-managed-storage"></a>

To encrypt logs in managed storage with your own KMS key, use the `managedPersistenceMonitoringConfiguration` configuration when you submit a job run.

```
{
    "monitoringConfiguration": {
        "managedPersistenceMonitoringConfiguration" : {
            "encryptionKeyArn": "key-arn"
        }
    }
}
```

## Encrypting EMR Serverless logs with Amazon S3 buckets
<a name="jobs-log-encryption-s3-buckets"></a>

To encrypt logs in your Amazon S3 bucket with your own KMS key, use the `s3MonitoringConfiguration` configuration when you submit a job run.

```
{
    "monitoringConfiguration": {
        "s3MonitoringConfiguration": {
            "logUri": "s3://amzn-s3-demo-logging-bucket/logs/",
            "encryptionKeyArn": "key-arn"
        }
    }
}
```

## Encrypting EMR Serverless logs with Amazon CloudWatch
<a name="jobs-log-encryption-cw"></a>

To encrypt logs in Amazon CloudWatch with your own KMS key, use the `cloudWatchLoggingConfiguration` configuration when you submit a job run.

```
{
    "monitoringConfiguration": {
        "cloudWatchLoggingConfiguration": {
            "enabled": true,
            "encryptionKeyArn": "key-arn"
         }
     }
}
```
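The three snippets above differ only in which monitoring block carries the key. As an illustrative sketch (the helper name is hypothetical and the key ARN is a placeholder), a small function can attach your key ARN to whichever log destinations a job run configures:

```python
import copy

def with_encryption(monitoring_config: dict, key_arn: str) -> dict:
    """Return a copy of a monitoringConfiguration with encryptionKeyArn
    added to each configured log destination."""
    config = copy.deepcopy(monitoring_config)
    for sink in ("managedPersistenceMonitoringConfiguration",
                 "s3MonitoringConfiguration",
                 "cloudWatchLoggingConfiguration"):
        if sink in config:
            config[sink]["encryptionKeyArn"] = key_arn
    return config
```

The deep copy keeps the original configuration untouched, so the same base configuration can be reused with different keys.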

## Required permissions for log encryption
<a name="jobs-log-encryption-permissions"></a>

**Topics**
+ [Required user permissions](#jobs-log-encryption-permissions-user)
+ [Encryption key permissions for Amazon S3 and managed storage](#jobs-log-encryption-permissions-s3)
+ [Encryption key permissions for Amazon CloudWatch](#jobs-log-encryption-permissions-cw)

### Required user permissions
<a name="jobs-log-encryption-permissions-user"></a>

The user who submits the job or views the logs or the application UIs must have permissions to use the key. You can specify the permissions in either the KMS key policy or the IAM policy for the user, group, or role. If the user who submits the job lacks the KMS key permissions, EMR Serverless rejects the job run submission.

**Example key policy**

The following key policy provides the permissions to `kms:GenerateDataKey` and `kms:Decrypt`:

```
{
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::111122223333:user/user-name"
    },
    "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
    ],
    "Resource": "*"
}
```

**Example IAM policy**

The following IAM policy provides the permissions to `kms:GenerateDataKey` and `kms:Decrypt`:

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:kms:*:123456789012:key/12345678-1234-1234-1234-123456789012"
      ],
      "Sid": "AllowKMSGeneratedatakey"
    }
  ]
}
```

------

To launch the Spark or Tez UI, give your users, groups, or roles permissions to access the `emr-serverless:GetDashboardForJobRun` API as follows:

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "emr-serverless:GetDashboardForJobRun"
      ],
      "Resource": [
        "*"
      ],
      "Sid": "AllowEMRSERVERLESSGetdashboardforjobrun"
    }
  ]
}
```

------

### Encryption key permissions for Amazon S3 and managed storage
<a name="jobs-log-encryption-permissions-s3"></a>

When you encrypt logs with your own encryption key either in managed storage or in your S3 buckets, configure KMS key permissions as follows.

The `emr-serverless.amazonaws.com` principal must have the following permissions in the policy for the KMS key:

```
{
    "Effect": "Allow",
    "Principal": {
        "Service": "emr-serverless.amazonaws.com"
    },
    "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
    ],
    "Resource": "*",
    "Condition": {
        "StringLike": {
            "aws:SourceArn": "arn:aws:emr-serverless:region:aws-account-id:/applications/application-id"
        }
    }
}
```

As a security best practice, we suggest that you add an `aws:SourceArn` condition key to the KMS key policy. The IAM global condition key `aws:SourceArn` helps ensure that EMR Serverless uses the KMS key only for an application ARN. 
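Because the condition scopes each key to a single application ARN, it can be convenient to generate the statement per application. This is an illustrative sketch; the function name is hypothetical, and the Region, account ID, and application ID are placeholders you supply:

```python
def emr_serverless_key_statement(region, account_id, application_id):
    """Build the KMS key policy statement that grants EMR Serverless
    decrypt/generate permissions, scoped with aws:SourceArn."""
    app_arn = (f"arn:aws:emr-serverless:{region}:{account_id}"
               f":/applications/{application_id}")
    return {
        "Effect": "Allow",
        "Principal": {"Service": "emr-serverless.amazonaws.com"},
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": "*",
        "Condition": {"StringLike": {"aws:SourceArn": app_arn}},
    }
```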

The job runtime role must have the following permissions in its IAM policy:

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:GenerateDataKey",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:kms:*:123456789012:key/12345678-1234-1234-1234-123456789012"
      ],
      "Sid": "AllowKMSGeneratedatakey"
    }
  ]
}
```

------

### Encryption key permissions for Amazon CloudWatch
<a name="jobs-log-encryption-permissions-cw"></a>

To associate the KMS key ARN to your log group, use the following IAM policy for the job runtime role.

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:AssociateKmsKey"
      ],
      "Resource": [
        "arn:aws:logs:*:123456789012:log-group:my-log-group-name:*"
      ],
      "Sid": "AllowLOGSAssociatekmskey"
    }
  ]
}
```

------

Configure the KMS key policy to grant KMS permissions to Amazon CloudWatch:

------
#### [ JSON ]

```
{
  "Version": "2012-10-17",
  "Id": "key-default-1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "ArnLike": {
          "kms:EncryptionContext:aws:logs:arn": "arn:aws:logs:*:123456789012:*"
        }
      },
      "Sid": "AllowKMSDecrypt"
    }
  ]
}
```

------

# Configure Apache Log4j2 properties for Amazon EMR Serverless
<a name="log4j2"></a>

This page describes how to configure custom [Apache Log4j 2.x](https://logging.apache.org/log4j/2.x/) properties for EMR Serverless jobs at `StartJobRun`. If you want to configure Log4j classifications at the application level, refer to [Default application configuration for EMR Serverless](default-configs.md).

## Configure Spark Log4j2 properties for Amazon EMR Serverless
<a name="log4j2-spark"></a>

With Amazon EMR releases 6.8.0 and higher, you can customize [Apache Log4j 2.x](https://logging.apache.org/log4j/2.x/) properties to specify fine-grained log configurations. This simplifies troubleshooting of your Spark jobs on EMR Serverless. To configure these properties, use the `spark-driver-log4j2` and `spark-executor-log4j2` classifications.

**Topics**
+ [Log4j2 classifications for Spark](#log4j2-spark-class)
+ [Log4j2 configuration example for Spark](#log4j2-spark-example)
+ [Log4j2 in sample Spark jobs](#log4j2-spark-jobs)
+ [Log4j2 considerations for Spark](#log4j2-spark-considerations)

### Log4j2 classifications for Spark
<a name="log4j2-spark-class"></a>

To customize the Spark log configurations, use the following classifications with the [`applicationConfiguration`](https://docs.aws.amazon.com/emr-serverless/latest/APIReference/API_ConfigurationOverrides.html#emrserverless-Type-ConfigurationOverrides-applicationConfiguration) parameter. To configure the Log4j 2.x properties, use the [`properties`](https://docs.aws.amazon.com/emr-serverless/latest/APIReference/API_Configuration.html#emrserverless-Type-Configuration-properties) field.

**`spark-driver-log4j2`**  
This classification sets the values in the `log4j2.properties` file for the driver.

**`spark-executor-log4j2`**  
This classification sets the values in the `log4j2.properties` file for the executor.
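Because both classifications take the same shape, a helper can assemble the `applicationConfiguration` entries for a job run. This is an illustrative sketch; `log4j2_classifications` is a hypothetical helper name, not part of the EMR Serverless API:

```python
def log4j2_classifications(driver_props: dict, executor_props: dict) -> list:
    """Build applicationConfiguration entries for the spark-driver-log4j2
    and spark-executor-log4j2 classifications."""
    return [
        {"classification": "spark-driver-log4j2", "properties": driver_props},
        {"classification": "spark-executor-log4j2", "properties": executor_props},
    ]
```

For example, `log4j2_classifications({"rootLogger.level": "error"}, {"rootLogger.level": "warn"})` yields the list that you would place under `applicationConfiguration` in `--configuration-overrides`.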

### Log4j2 configuration example for Spark
<a name="log4j2-spark-example"></a>

The following example shows how to submit a Spark job with `applicationConfiguration` to customize Log4j2 configurations for the Spark driver and executor.

To configure Log4j classifications at the application level instead of when you submit the job, refer to [Default application configuration for EMR Serverless](default-configs.md).

```
aws emr-serverless start-job-run \
    --application-id application-id \
    --execution-role-arn job-role-arn \
    --job-driver '{
        "sparkSubmit": {
            "entryPoint": "/usr/lib/spark/examples/jars/spark-examples.jar",
            "entryPointArguments": ["1"],
            "sparkSubmitParameters": "--class org.apache.spark.examples.SparkPi --conf spark.executor.cores=4 --conf spark.executor.memory=20g --conf spark.driver.cores=4 --conf spark.driver.memory=8g --conf spark.executor.instances=1"
        }
    }' \
    --configuration-overrides '{
        "applicationConfiguration": [
             {
                "classification": "spark-driver-log4j2",
                "properties": {
                    "rootLogger.level":"error", // will only display Spark error logs
                    "logger.IdentifierForClass.name": "classpath for setting logger",
                    "logger.IdentifierForClass.level": "info"
                   
                }
            },
            {
                "classification": "spark-executor-log4j2",
                "properties": {
                    "rootLogger.level":"error", // will only display Spark error logs
                    "logger.IdentifierForClass.name": "classpath for setting logger",
                    "logger.IdentifierForClass.level": "info"
                }
            }
       ]
    }'
```

### Log4j2 in sample Spark jobs
<a name="log4j2-spark-jobs"></a>

The following code samples demonstrate how to create a Spark application while you initialize a custom Log4j2 configuration for the application.

------
#### [ Python ]

**Example - Using Log4j2 for a Spark job with Python**  

```
import os
import sys

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

app_name = "PySparkApp"
if __name__ == "__main__":
    spark = SparkSession\
        .builder\
        .appName(app_name)\
        .getOrCreate()
    
    spark.sparkContext._conf.getAll()
    sc = spark.sparkContext
    log4jLogger = sc._jvm.org.apache.log4j
    LOGGER = log4jLogger.LogManager.getLogger(app_name)

    LOGGER.info("pyspark script logger info")
    LOGGER.warn("pyspark script logger warn")
    LOGGER.error("pyspark script logger error")
    
    # your code here
    
    spark.stop()
```
To customize Log4j2 for the driver when you execute a Spark job, use the following configuration:  

```
{
   "classification": "spark-driver-log4j2",
      "properties": {
          "rootLogger.level":"error", // only display Spark error logs
          "logger.PySparkApp.level": "info", 
          "logger.PySparkApp.name": "PySparkApp"
      }
}
```

------
#### [ Scala ]

**Example - Using Log4j2 for a Spark job with Scala**  

```
import org.apache.log4j.Logger
import org.apache.spark.sql.SparkSession

object ExampleClass {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
    .builder
    .appName(this.getClass.getName)
    .getOrCreate()

    val logger = Logger.getLogger(this.getClass);
    logger.info("script logging info logs")
    logger.warn("script logging warn logs")
    logger.error("script logging error logs")

// your code here
    spark.stop()
  }
}
```
To customize Log4j2 for the driver when you execute a Spark job, use the following configuration:  

```
{
   "classification": "spark-driver-log4j2",
      "properties": {
          "rootLogger.level":"error", // only display Spark error logs
          "logger.ExampleClass.level": "info", 
          "logger.ExampleClass.name": "ExampleClass"
      }
}
```

------

### Log4j2 considerations for Spark
<a name="log4j2-spark-considerations"></a>

The following Log4j 2.x properties are not configurable for Spark processes:
+ `rootLogger.appenderRef.stdout.ref`
+ `appender.console.type`
+ `appender.console.name`
+ `appender.console.target`
+ `appender.console.layout.type`
+ `appender.console.layout.pattern`

For detailed information about the Log4j 2.x properties that you can configure, refer to the [`log4j2.properties.template` file](https://github.com/apache/spark/blob/v3.3.0/conf/log4j2.properties.template) on GitHub.

# Monitoring EMR Serverless
<a name="metrics"></a>

This section covers the ways that you can monitor your Amazon EMR Serverless applications and jobs.

**Topics**
+ [Monitoring EMR Serverless applications and jobs](app-job-metrics.md)
+ [Monitor Spark metrics with Amazon Managed Service for Prometheus](monitor-with-prometheus.md)
+ [EMR Serverless usage metrics](monitoring-usage.md)

# Monitoring EMR Serverless applications and jobs
<a name="app-job-metrics"></a>

With Amazon CloudWatch metrics for EMR Serverless, you can receive 1-minute CloudWatch metrics and use CloudWatch dashboards to view near-real-time operational and performance data for your EMR Serverless applications.

EMR Serverless sends metrics to CloudWatch every minute. EMR Serverless emits these metrics at the application level as well as the job, worker-type, and capacity-allocation-type levels.

To get started, deploy the EMR Serverless CloudWatch dashboard template provided in the [EMR Serverless GitHub repository](https://github.com/aws-samples/emr-serverless-samples/tree/main/cloudformation/emr-serverless-cloudwatch-dashboard/).

**Note**  
[EMR Serverless interactive workloads](interactive-workloads.md) have only application-level monitoring enabled, and have a new worker type dimension, `Spark_Kernel`. To monitor and debug your interactive workloads, access the logs and Apache Spark UI from [within your EMR Studio Workspace](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-debug.html#emr-studio-debug-serverless).

## Monitoring metrics
<a name="app-job-metrics-versions"></a>

**Important**  
We are restructuring our metrics display to add `ApplicationName` and `JobName` as dimensions. For releases 7.10 and higher, the older metrics are no longer updated. For releases lower than 7.10, the older metrics are still available.

**Current dimensions**

The following table describes the EMR Serverless dimensions available within the `AWS/EMRServerless` namespace.


**Dimensions for EMR Serverless metrics**  

| Dimension | Description | 
| --- | --- | 
| ApplicationId | Filters for all metrics of an EMR Serverless application using the application ID. | 
| ApplicationName | Filters for all metrics of an EMR Serverless application using the name. If the name isn't provided, or contains non-ASCII characters, it is published as **[Unspecified]**. | 
| JobId | Filters for all metrics of an EMR Serverless job run using the job run ID. | 
| JobName | Filters for all metrics of an EMR Serverless job run using the name. If the name isn't provided, or contains non-ASCII characters, it is published as **[Unspecified]**. | 
| WorkerType | Filters for all metrics of a given worker type. For example, you can filter for `SPARK_DRIVER` and `SPARK_EXECUTORS` for Spark jobs. | 
| CapacityAllocationType | Filters for all metrics of a given capacity allocation type. For example, you can filter for `PreInitCapacity` for pre-initialized capacity and `OnDemandCapacity` for everything else. | 
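These dimensions can be combined into a CloudWatch `GetMetricData` query against the `AWS/EMRServerless` namespace. The sketch below only builds the request structure (the application ID is a placeholder); sending it requires a configured boto3 CloudWatch client:

```python
def worker_count_query(application_id, worker_type="SPARK_EXECUTOR"):
    """Build a CloudWatch GetMetricData query for RunningWorkerCount,
    filtered by application and worker type."""
    return {
        "Id": "running_workers",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/EMRServerless",
                "MetricName": "RunningWorkerCount",
                "Dimensions": [
                    {"Name": "ApplicationId", "Value": application_id},
                    {"Name": "WorkerType", "Value": worker_type},
                ],
            },
            "Period": 60,  # EMR Serverless emits metrics every minute
            "Stat": "Average",  # statistic choice depends on the metric
        },
    }
```

You could pass this query to `cloudwatch.get_metric_data(MetricDataQueries=[...], StartTime=..., EndTime=...)` on a boto3 CloudWatch client.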

## Application-level monitoring
<a name="app-level-metrics"></a>

You can monitor capacity usage at the EMR Serverless application level with Amazon CloudWatch metrics. You can also set up a single display to monitor application capacity usage in a CloudWatch dashboard.


**EMR Serverless application metrics**  

| Metric | Description | Unit | Dimension | 
| --- | --- | --- | --- | 
| MaxCPUAllowed |  The maximum CPU allowed for the application.  | vCPU | ApplicationId, ApplicationName | 
| MaxMemoryAllowed |  The maximum memory in GB allowed for the application.  | Gigabytes (GB) | ApplicationId, ApplicationName | 
| MaxStorageAllowed |  The maximum storage in GB allowed for the application.  | Gigabytes (GB) | ApplicationId, ApplicationName | 
| CPUAllocated |  The total number of vCPUs allocated.  | vCPU | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| IdleWorkerCount |  The total number of idle workers.  | Count | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| MemoryAllocated |  The total memory in GB allocated.  | Gigabytes (GB) | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| PendingCreationWorkerCount |  The total number of workers pending creation.  | Count | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| RunningWorkerCount |  The total number of workers in use by the application.  | Count | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| StorageAllocated |  The total disk storage in GB allocated.  | Gigabytes (GB) | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 
| TotalWorkerCount |  The total number of available workers.  | Count | ApplicationId, ApplicationName, WorkerType, CapacityAllocationType | 

## Job-level monitoring
<a name="job-level-metrics"></a>

Amazon EMR Serverless sends the following job-level metrics to Amazon CloudWatch every minute. You can access the metric values aggregated by job run state. The unit for each of these metrics is *count*.


**EMR Serverless job-level metrics**  

| Metric | Description | Dimension | 
| --- | --- | --- | 
| SubmittedJobs | The number of jobs in a Submitted state. | ApplicationId, ApplicationName | 
| PendingJobs | The number of jobs in a Pending state. | ApplicationId, ApplicationName | 
| ScheduledJobs | The number of jobs in a Scheduled state. | ApplicationId, ApplicationName | 
| RunningJobs | The number of jobs in a Running state. | ApplicationId, ApplicationName | 
| SuccessJobs | The number of jobs in a Success state. | ApplicationId, ApplicationName | 
| FailedJobs | The number of jobs in a Failed state. | ApplicationId, ApplicationName | 
| CancellingJobs | The number of jobs in a Cancelling state. | ApplicationId, ApplicationName | 
| CancelledJobs | The number of jobs in a Cancelled state. | ApplicationId, ApplicationName | 

You can monitor engine-specific metrics for running and completed EMR Serverless jobs with engine-specific application UIs. When you access the UI for a running job, the live application UI displays with real-time updates. When you access the UI for a completed job, the persistent app UI displays.

**Running jobs**

For your running EMR Serverless jobs, access a real-time interface that provides engine-specific metrics. You can use either the Apache Spark UI or the Hive Tez UI to monitor and debug your jobs. To access these UIs, use the EMR Studio console or request a secure URL endpoint with the AWS Command Line Interface.

**Completed jobs**

For your completed EMR Serverless jobs, use the Spark History Server or the Persistent Hive Tez UI to access job details, stages, tasks, and metrics for Spark or Hive job runs. To access these UIs, use the EMR Studio console, or request a secure URL endpoint with the AWS Command Line Interface.

## Job worker-level monitoring
<a name="job-worker-level-metrics"></a>

Amazon EMR Serverless sends the following job worker-level metrics to Amazon CloudWatch. These metrics are available in the `AWS/EMRServerless` namespace and the `Job Worker Metrics` metric group. EMR Serverless collects data points from individual workers during job runs at the job, worker-type, and capacity-allocation-type levels. You can use `ApplicationId` as a dimension to monitor multiple jobs that belong to the same application.

**Note**  
To view the total CPU and memory used by an EMR Serverless job in the Amazon CloudWatch console, set the **Statistic** to **Sum** and the **Period** to **1 minute**.


**EMR Serverless job worker-level metrics**  

| Metric | Description | Unit | Dimension | 
| --- | --- | --- | --- | 
| WorkerCpuAllocated | The total number of vCPU cores allocated for workers in a job run. | vCPU | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerCpuUsed | The total number of vCPU cores utilized by workers in a job run. | vCPU | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerMemoryAllocated | The total memory in GB allocated for workers in a job run. | Gigabytes (GB) | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerMemoryUsed | The total memory in GB utilized by workers in a job run. | Gigabytes (GB) | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerEphemeralStorageAllocated | The total ephemeral storage in GB allocated for workers in a job run. | Gigabytes (GB) | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerEphemeralStorageUsed | The total ephemeral storage in GB used by workers in a job run. | Gigabytes (GB) | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerStorageReadBytes | The number of bytes read from storage by workers in a job run. | Bytes | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
| WorkerStorageWriteBytes | The number of bytes written to storage from workers in a job run. | Bytes | JobId, JobName, ApplicationId, ApplicationName, WorkerType, and CapacityAllocationType | 
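Following the note above, a minimal sketch of a `GetMetricData` query that sums `WorkerCpuUsed` over 1-minute periods. The application and job IDs are placeholders; in practice you would pass the entry in `MetricDataQueries` to boto3's `cloudwatch.get_metric_data`, along with a start and end time.

```python
# Sketch: a GetMetricData query for total vCPU cores in use by a job run,
# using Sum over 1-minute periods as the note above recommends.
# Note: depending on how the metric is published, CloudWatch may require
# the full dimension set (JobName, ApplicationName, WorkerType, and
# CapacityAllocationType as well) for the query to match data points.

def worker_cpu_used_query(application_id, job_id):
    """Build a MetricDataQueries entry for the WorkerCpuUsed metric."""
    return {
        "Id": "workerCpuUsed",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/EMRServerless",
                "MetricName": "WorkerCpuUsed",
                "Dimensions": [
                    {"Name": "ApplicationId", "Value": application_id},
                    {"Name": "JobId", "Value": job_id},
                ],
            },
            "Period": 60,      # 1 minute
            "Stat": "Sum",     # total across workers
        },
    }

query = worker_cpu_used_query("00f1982r1uukb925", "00f1cbn5g4bb0c01")
```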

The steps below describe how to access the engine-specific application UIs and their metrics.

------
#### [ Console ]

**To access your application UI with the console**

1. Navigate to your EMR Serverless application on the EMR Studio with the instructions in [Getting started from the console](https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/getting-started.html#gs-console). 

1. To access engine-specific application UIs and logs for a running job: 

   1. Choose a job with a `RUNNING` status.

   1. Select the job on the **Application details** page, or navigate to the **Job details** page for your job.

   1. Under the **Display UI** dropdown menu, choose either **Spark UI** or **Hive Tez UI** to navigate to the application UI for your job type. 

   1. To access Spark engine logs, navigate to the **Executors** tab in the Spark UI, and choose the **Logs** link for the driver. To access Hive engine logs, choose the **Logs** link for the appropriate DAG in the Hive Tez UI.

1. To access engine-specific application UIs and logs for a completed job: 

   1. Choose a job with a `SUCCESS` status.

   1. Select the job on your application's **Application details** page or navigate to the job's **Job details** page.

   1. Under the **Display UI** dropdown menu, choose either **Spark History Server** or **Persistent Hive Tez UI** to navigate to the application UI for your job type. 

   1. To access Spark engine logs, navigate to the **Executors** tab in the Spark UI, and choose the **Logs** link for the driver. To access Hive engine logs, choose the **Logs** link for the appropriate DAG in the Hive Tez UI.

------
#### [ AWS CLI ]

**To access your application UI with the AWS CLI**
+ To generate a URL that you can use to access your application UI for running and completed jobs, call the `GetDashboardForJobRun` API. 

  ```
  aws emr-serverless get-dashboard-for-job-run \
  --application-id <application-id> \
  --job-run-id <job-id>
  ```

  The URL that you generate is valid for one hour.

------

# Monitor Spark metrics with Amazon Managed Service for Prometheus
<a name="monitor-with-prometheus"></a>

With Amazon EMR releases 7.1.0 and higher, you can integrate EMR Serverless with Amazon Managed Service for Prometheus to collect Apache Spark metrics for EMR Serverless jobs and applications. This integration is available when you submit a job or create an application with the AWS console, the EMR Serverless API, or the AWS CLI.

## Prerequisites
<a name="monitoring-with-prometheus-prereqs"></a>

Before you can deliver your Spark metrics to Amazon Managed Service for Prometheus, complete the following prerequisites.
+ [Create an Amazon Managed Service for Prometheus workspace.](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-onboard-create-workspace.html) This workspace serves as an ingestion endpoint. Make a note of the URL displayed for **Endpoint - remote write URL**. You'll need to specify the URL when you create your EMR Serverless application.
+ To grant your jobs access to Amazon Managed Service for Prometheus for monitoring purposes, add the following policy statement to your job execution role.

  ```
  {
      "Sid": "AccessToPrometheus",
      "Effect": "Allow",
      "Action": ["aps:RemoteWrite"],
      "Resource": "arn:aws:aps:<AWS_REGION>:<AWS_ACCOUNT_ID>:workspace/<WORKSPACE_ID>"
  }
  ```
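If you're creating a new IAM policy rather than adding the statement to an existing one, the statement goes inside a standard policy document, for example:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AccessToPrometheus",
            "Effect": "Allow",
            "Action": ["aps:RemoteWrite"],
            "Resource": "arn:aws:aps:<AWS_REGION>:<AWS_ACCOUNT_ID>:workspace/<WORKSPACE_ID>"
        }
    ]
}
```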

## Setup
<a name="monitoring-with-prometheus-setup"></a>

**To use the AWS console to create an application that's integrated with Amazon Managed Service for Prometheus**

1. See [Getting started with Amazon EMR Serverless](https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/getting-started.html) to create an application.

1. While you're creating an application, choose **Use custom settings**, and then enter values in the fields that you want to configure.

1. Under **Application logs and metrics**, choose **Deliver engine metrics to Amazon Managed Service for Prometheus**, and then specify your remote write URL.

1. Specify any other configuration settings you want, and then choose **Create and start application**.

**Use the AWS CLI or EMR Serverless API**

You can also use the AWS CLI or EMR Serverless API to integrate your EMR Serverless application with Amazon Managed Service for Prometheus when you're running the `create-application` or the `start-job-run` commands.

------
#### [ create-application ]

```
aws emr-serverless create-application \
--release-label emr-7.1.0 \
--type "SPARK" \
--monitoring-configuration '{ 
    "prometheusMonitoringConfiguration": {
        "remoteWriteUrl": "https://aps-workspaces.<AWS_REGION>.amazonaws.com/workspaces/<WORKSPACE_ID>/api/v1/remote_write"
    }
}'
```

------
#### [ start-job-run ]

```
aws emr-serverless start-job-run \
--application-id <APPLICATION_ID> \
--execution-role-arn <JOB_EXECUTION_ROLE> \
--job-driver '{
    "sparkSubmit": {
        "entryPoint": "local:///usr/lib/spark/examples/src/main/python/pi.py",
        "entryPointArguments": ["10000"],
        "sparkSubmitParameters": "--conf spark.dynamicAllocation.maxExecutors=10"
    }
}' \
--configuration-overrides '{
     "monitoringConfiguration": {
        "prometheusMonitoringConfiguration": {
            "remoteWriteUrl": "https://aps-workspaces.<AWS_REGION>.amazonaws.com/workspaces/<WORKSPACE_ID>/api/v1/remote_write"
        }
    }
}'
```

------

Including `prometheusMonitoringConfiguration` in your command directs EMR Serverless to run the Spark job with an agent that collects the Spark metrics and writes them to your `remoteWriteUrl` endpoint for Amazon Managed Service for Prometheus. You can then use the Spark metrics in Amazon Managed Service for Prometheus for visualization, alerts, and analysis.

## Advanced configuration properties
<a name="monitoring-with-prometheus-config-options"></a>

EMR Serverless uses a Spark component named `PrometheusServlet` to collect Spark metrics and translate performance data into a format that's compatible with Amazon Managed Service for Prometheus. By default, when you submit a job with `PrometheusMonitoringConfiguration`, EMR Serverless sets default values in Spark and parses driver and executor metrics. 

The following table describes all of the properties that you can configure when you submit a Spark job that sends metrics to Amazon Managed Service for Prometheus.


| Spark property | Default value | Description | 
| --- | --- | --- | 
| spark.metrics.conf.\$1.sink.prometheusServlet.class | org.apache.spark.metrics.sink.PrometheusServlet | The class that Spark uses to send metrics to Amazon Managed Service for Prometheus. To override the default behavior, specify your own custom class. | 
| spark.metrics.conf.\$1.source.jvm.class | org.apache.spark.metrics.source.JvmSource | The class Spark uses to collect and send crucial metrics from the underlying Java virtual machine. To stop collecting JVM metrics, disable this property by setting it to an empty string, such as `""`. To override the default behavior, specify your own custom class.  | 
| spark.metrics.conf.driver.sink.prometheusServlet.path | /metrics/prometheus | The distinct URL that Amazon Managed Service for Prometheus uses to collect metrics from the driver. To override the default behavior, specify your own path. To stop collecting driver metrics, disable this property by setting it to an empty string, such as `""`. | 
| spark.metrics.conf.executor.sink.prometheusServlet.path | /metrics/executor/prometheus | The distinct URL that Amazon Managed Service for Prometheus uses to collect metrics from the executor. To override the default behavior, specify your own path. To stop collecting executor metrics, disable this property by setting it to an empty string, such as `""`. | 

For more information about the Spark metrics, refer to [Apache Spark metrics](https://spark.apache.org/docs/3.5.0/monitoring.html#metrics).
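For example, one way to stop collecting executor metrics for a single job is to set the executor sink path to an empty string, per the table above. The following sketch shows the relevant `sparkSubmitParameters` fragment of a `start-job-run` job driver:

```
"sparkSubmitParameters": "--conf spark.metrics.conf.executor.sink.prometheusServlet.path="
```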

## Considerations and limitations
<a name="monitoring-with-prometheus-limitations"></a>

When you use Amazon Managed Service for Prometheus to collect metrics from EMR Serverless, keep in mind the following considerations and limitations.
+ Support for using Amazon Managed Service for Prometheus with EMR Serverless is available only in the [AWS Regions where Amazon Managed Service for Prometheus is generally available.](https://docs.aws.amazon.com/general/latest/gr/prometheus-service.html)
+ Running the agent to collect Spark metrics on Amazon Managed Service for Prometheus requires more resources from workers. If you choose a smaller worker size, such as a one-vCPU worker, your job run time might increase.
+ Support for using Amazon Managed Service for Prometheus with EMR Serverless is available only for Amazon EMR releases 7.1.0 and higher.
+ Amazon Managed Service for Prometheus must be deployed in the same account where you run EMR Serverless to collect metrics.

# EMR Serverless usage metrics
<a name="monitoring-usage"></a>

You can use Amazon CloudWatch usage metrics to provide visibility into the resources that your account uses. Use these metrics to visualize your service usage on CloudWatch graphs and dashboards.

EMR Serverless usage metrics correspond to Service Quotas. You can configure alarms that alert you when your usage approaches a service quota. For more information, refer to [Service Quotas and Amazon CloudWatch alarms](https://docs.aws.amazon.com/servicequotas/latest/userguide/configure-cloudwatch.html) in the *Service Quotas User Guide*.

For more information about EMR Serverless service quotas, refer to [Endpoints and quotas for EMR Serverless](endpoints-quotas.md).

## Service quota usage metrics for EMR Serverless
<a name="usage-metrics"></a>

EMR Serverless publishes the following service quota usage metrics in the `AWS/Usage` namespace.


****  

| Metric | Description | 
| --- | --- | 
| `ResourceCount`  | The total number of the specified resource that is running on your account. The resource is defined by the [dimensions](#usage-metrics-dimensions) that are associated with the metric. | 

## Dimensions for EMR Serverless service quota usage metrics
<a name="usage-metrics-dimensions"></a>

You can use the following dimensions to refine the usage metrics that EMR Serverless publishes.


****  

| Dimension | Value | Description | 
| --- | --- | --- | 
|  `Service`  |  EMR Serverless  | The name of the AWS service that contains the resource. | 
|  `Type`  |  Resource  | The type of entity that EMR Serverless is reporting. | 
|  `Resource`  |  vCPU  | The type of resource that EMR Serverless is tracking. | 
|  `Class`  |  None  | The class of resource that EMR Serverless is tracking. | 

# Automating EMR Serverless with Amazon EventBridge
<a name="using-eventbridge"></a>

You can use Amazon EventBridge to automate your AWS services and respond automatically to system events, such as application availability issues or resource changes. EventBridge delivers a near real-time stream of system events that describe changes in your AWS resources. You can write simple rules to indicate which events are of interest to you, and what automated actions to take when an event matches a rule. With EventBridge, you can automatically:
+ Invoke an AWS Lambda function
+ Relay an event to Amazon Kinesis Data Streams
+ Activate an AWS Step Functions state machine
+ Notify an Amazon SNS topic or an Amazon SQS queue

For example, when you use EventBridge with EMR Serverless, you can activate an AWS Lambda function when an ETL job succeeds, or notify an Amazon SNS topic when an ETL job fails.

EMR Serverless emits four kinds of events:
+ Application state change events – Emitted on every state change of an application. For more information about application states, refer to [Application states](applications.md#application-states).
+ Job run state change events – Emitted on every state change of a job run. For more information, refer to [Job run states](job-states.md).
+ Job run retry events – Emitted on every retry of a job run. Available with Amazon EMR Serverless releases 7.1.0 and higher.
+ Job resource utilization update events – Emitted with resource utilization updates for a job run at approximately 30-minute intervals.
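For example, a rule's event pattern that matches job runs entering a failed state might look like the following sketch. The `source` and `detail-type` values match the sample events in the following section; the `FAILED` state value is an assumption based on the job run states that these events report.

```
{
    "source": ["aws.emr-serverless"],
    "detail-type": ["EMR Serverless Job Run State Change"],
    "detail": {
        "state": ["FAILED"]
    }
}
```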

## Sample EMR Serverless EventBridge events
<a name="using-eventbridge-examples"></a>

Events reported by EMR Serverless have a value of `aws.emr-serverless` assigned to `source`, as in the following examples.

**Application state change event**

The following example event shows an application in the `CREATING` state.

```
{
    "version": "0",
    "id": "9fd3cf79-1ff1-b633-4dd9-34508dc1e660",
    "detail-type": "EMR Serverless Application State Change",
    "source": "aws.emr-serverless",
    "account": "123456789012",
    "time": "2022-05-31T21:16:31Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "applicationId": "00f1cbsc6anuij25",
        "applicationName": "3965ad00-8fba-4932-a6c8-ded32786fd42",
        "arn": "arn:aws:emr-serverless:us-east-1:111122223333:/applications/00f1cbsc6anuij25",
        "releaseLabel": "emr-6.6.0",
        "state": "CREATING",
        "type": "HIVE",
        "createdAt": "2022-05-31T21:16:31.547953Z",
        "updatedAt": "2022-05-31T21:16:31.547970Z",
        "autoStopConfig": {
            "enabled": true,
            "idleTimeout": 15
        },
        "autoStartConfig": {
            "enabled": true
        }
    }
}
```

**Job run state change event**

The following example event shows a job run that moves from the `SCHEDULED` state to the `RUNNING` state.

```
{
    "version": "0",
    "id": "00df3ec6-5da1-36e6-ab71-20f0de68f8a0",
    "detail-type": "EMR Serverless Job Run State Change",
    "source": "aws.emr-serverless",
    "account": "123456789012",
    "time": "2022-05-31T21:07:42Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "jobRunId": "00f1cbn5g4bb0c01",
        "applicationId": "00f1982r1uukb925",
        "arn": "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00f1982r1uukb925/jobruns/00f1cbn5g4bb0c01",
        "releaseLabel": "emr-6.6.0",
        "state": "RUNNING",
        "previousState": "SCHEDULED",
        "createdBy": "arn:aws:sts::123456789012:assumed-role/TestRole-402dcef3ad14993c15d28263f64381e4cda34775/6622b6233b6d42f59c25dd2637346242",
        "updatedAt": "2022-05-31T21:07:42.299487Z",
        "createdAt": "2022-05-31T21:07:25.325900Z"
    }
}
```

**Job run retry event**

The following is an example of a job run retry event.

```
{
    "version": "0",
    "id": "00df3ec6-5da1-36e6-ab71-20f0de68f8a0",
    "detail-type": "EMR Serverless Job Run Retry",
    "source": "aws.emr-serverless",
    "account": "123456789012",
    "time": "2022-05-31T21:07:42Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "jobRunId": "00f1cbn5g4bb0c01",
        "applicationId": "00f1982r1uukb925",
        "arn": "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00f1982r1uukb925/jobruns/00f1cbn5g4bb0c01",
        "releaseLabel": "emr-6.6.0",
        "createdBy": "arn:aws:sts::123456789012:assumed-role/TestRole-402dcef3ad14993c15d28263f64381e4cda34775/6622b6233b6d42f59c25dd2637346242",
        "updatedAt": "2022-05-31T21:07:42.299487Z",
        "createdAt": "2022-05-31T21:07:25.325900Z",
        "previousAttempt": 1,
        "previousAttemptState": "FAILED",
        "previousAttemptCreatedAt": "2022-05-31T21:07:25.325900Z",
        "previousAttemptEndedAt": "2022-05-31T21:07:30.325900Z",
        "newAttempt": 2,
        "newAttemptCreatedAt": "2022-05-31T21:07:30.325900Z"
    }
}
```

**Job resource utilization update event**

The following example event shows the final resource utilization update for a job that moved to a terminal state after running.

```
{
    "version": "0",
    "id": "00df3ec6-5da1-36e6-ab71-20f0de68f8a0",
    "detail-type": "EMR Serverless Job Resource Utilization Update",
    "source": "aws.emr-serverless",
    "account": "123456789012",
    "time": "2022-05-31T21:07:42Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00f1982r1uukb925/jobruns/00f1cbn5g4bb0c01"
    ],
    "detail": {
        "applicationId": "00f1982r1uukb925",
        "jobRunId": "00f1cbn5g4bb0c01",
        "attempt": 1,
        "mode": "BATCH",
        "createdAt": "2022-05-31T21:07:25.325900Z",
        "startedAt": "2022-05-31T21:07:26.123Z",
        "calculatedFrom": "2022-05-31T21:07:42.299487Z",
        "calculatedTo": "2022-05-31T21:07:30.325900Z",
        "resourceUtilizationFinal": true,
        "resourceUtilizationForInterval": {
            "vCPUHour": 0.023,
            "memoryGBHour": 0.114,
            "storageGBHour": 0.228
        },
        "billedResourceUtilizationForInterval": {
            "vCPUHour": 0.067,
            "memoryGBHour": 0.333,
            "storageGBHour": 0
        },
        "totalResourceUtilization": {
            "vCPUHour": 0.023,
            "memoryGBHour": 0.114,
            "storageGBHour": 0.228
        },
        "totalBilledResourceUtilization": {
            "vCPUHour": 0.067,
            "memoryGBHour": 0.333,
            "storageGBHour": 0
        }
    }
}
```

The **startedAt** field is only present in the event if the job has moved to a running state.
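To consume these events programmatically, for example in a Lambda function target, a minimal sketch that reads the billed totals from the final utilization update follows. The abbreviated event reuses values from the sample above; the `billed_totals` helper is illustrative, not part of any AWS SDK.

```python
import json

# Sketch: extract the billed totals from a "Job Resource Utilization Update"
# event once the final update arrives. The event is abbreviated from the
# sample above.
sample_event = json.loads("""
{
    "detail-type": "EMR Serverless Job Resource Utilization Update",
    "detail": {
        "jobRunId": "00f1cbn5g4bb0c01",
        "resourceUtilizationFinal": true,
        "totalBilledResourceUtilization": {
            "vCPUHour": 0.067,
            "memoryGBHour": 0.333,
            "storageGBHour": 0
        }
    }
}
""")

def billed_totals(event):
    """Return (jobRunId, billed totals) for a final update, else None."""
    detail = event["detail"]
    if not detail.get("resourceUtilizationFinal"):
        return None  # more updates will follow at ~30-minute intervals
    return detail["jobRunId"], detail["totalBilledResourceUtilization"]

job_run_id, billed = billed_totals(sample_event)
```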