# Working with Ray jobs in AWS Glue
<a name="ray-jobs-section"></a>

**Important**  
AWS Glue for Ray will no longer be open to new customers starting April 30, 2026. If you would like to use AWS Glue for Ray, sign up prior to that date. Existing customers can continue to use the service as normal. For capabilities similar to AWS Glue for Ray, explore Amazon EKS. For more information, see [AWS Glue for Ray end of support](https://docs.aws.amazon.com/glue/latest/dg/awsglue-ray-jobs-availability-change.html).

This section provides information about using AWS Glue for Ray jobs. For more information about writing AWS Glue for Ray scripts, consult the [Programming Ray scripts](aws-glue-programming-ray.md) section.

**Topics**
+ [Getting started with AWS Glue for Ray](#author-job-ray-using)
+ [Supported Ray runtime environments](#author-job-ray-runtimes)
+ [Accounting for workers in Ray jobs](#author-job-ray-worker-accounting)
+ [Using job parameters in Ray jobs](author-job-ray-job-parameters.md)
+ [Monitoring Ray jobs with metrics](author-job-ray-monitor.md)

## Getting started with AWS Glue for Ray
<a name="author-job-ray-using"></a>

To work with AWS Glue for Ray, you use the same AWS Glue jobs and interactive sessions that you use with AWS Glue for Spark. AWS Glue jobs are designed for running the same script on a recurring cadence, while interactive sessions are designed to let you run snippets of code sequentially against the same provisioned resources. 

AWS Glue ETL and Ray are different underneath, so in your script, you have access to different tools, features, and configuration. As a new computation framework managed by AWS Glue, Ray has a different architecture and uses different vocabulary to describe what it does. For more information, see [Architecture Whitepapers](https://docs.ray.io/en/latest/ray-contribute/whitepaper.html) in the Ray documentation. 

**Note**  
AWS Glue for Ray is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).

### Ray jobs in the AWS Glue Studio console
<a name="author-job-ray-using-console"></a>

When you create a job on the **Jobs** page in the AWS Glue Studio console, choose the **Ray script editor** option to create a Ray job. For more information about jobs and how they're used, see [Building visual ETL jobs](author-job-glue.md).

![\[The Jobs page in AWS Glue Studio with the Ray script editor option selected.\]](http://docs.aws.amazon.com/glue/latest/dg/images/ray_job_setup.png)


### Ray jobs in the AWS CLI and SDK
<a name="author-job-ray-using-cli"></a>

Ray jobs in the AWS CLI use the same SDK actions and parameters as other jobs. AWS Glue for Ray introduces new values for certain parameters. For more information about the Jobs API, see [Jobs](aws-glue-api-jobs-job.md).

## Supported Ray runtime environments
<a name="author-job-ray-runtimes"></a>

In AWS Glue for Spark jobs, `GlueVersion` determines the versions of Apache Spark and Python that are available to the job. The Python version indicates the version that is supported for jobs of type Spark. Ray runtime environments are not configured this way.

For Ray jobs, you should set `GlueVersion` to `4.0` or greater. However, the versions of Ray, Python, and additional libraries that are available in your Ray job are determined by the `Runtime` field in the job definition.

The `Ray2.4` runtime environment will be available for a minimum of 6 months after release. As Ray rapidly evolves, you will be able to incorporate Ray updates and improvements through future runtime environment releases.

Valid values: `Ray2.4`


| Runtime value | Ray and Python versions | 
| --- | --- | 
| Ray2.4 (for AWS Glue 4.0+) |  Ray 2.4.0, Python 3.9  | 

**Additional information**
+ For release notes that accompany AWS Glue on Ray releases, see [AWS Glue versions](release-notes.md#release-notes-versions).
+ For Python libraries that are provided in a runtime environment, see [Modules provided with Ray jobs](edit-script-ray-env-dependencies.md#edit-script-ray-modules-provided).
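
As an illustration of how `GlueVersion` and `Runtime` fit together, the following is a minimal sketch that builds the keyword arguments for a `CreateJob` request. The job name, role ARN, and script location are hypothetical placeholders, and the `glueray` command name is an assumption that is not stated in this section; confirm it against the Jobs API reference before relying on it.

```python
def build_ray_job_definition(name, role_arn, script_location, number_of_workers=5):
    """Return keyword arguments for a Glue CreateJob request that defines a Ray job."""
    return {
        "Name": name,
        "Role": role_arn,
        # Ray jobs should set GlueVersion to 4.0 or greater.
        "GlueVersion": "4.0",
        "WorkerType": "Z.2X",
        "NumberOfWorkers": number_of_workers,
        "Command": {
            # Assumption: "glueray" is the command name used for Ray jobs.
            "Name": "glueray",
            "PythonVersion": "3.9",
            # Runtime selects the Ray, Python, and library versions.
            "Runtime": "Ray2.4",
            "ScriptLocation": script_location,
        },
    }


# Hypothetical values, for illustration only.
job_definition = build_ray_job_definition(
    name="example-ray-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRayRole",
    script_location="s3://amzn-s3-demo-bucket/scripts/example_ray_script.py",
)
print(job_definition["Command"]["Runtime"])
```

With credentials configured, a dictionary like this could be passed to the SDK, for example `boto3.client("glue").create_job(**job_definition)`.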

## Accounting for workers in Ray jobs
<a name="author-job-ray-worker-accounting"></a>

AWS Glue runs Ray jobs on new Graviton-based EC2 worker types, which are only available for Ray jobs. To appropriately provision these workers for the workloads Ray is designed for, we provide a different ratio of compute resources to memory resources from most workers. In order to account for these resources, we use the memory-optimized data processing unit (M-DPU) rather than the standard data processing unit (DPU).
+ One M-DPU corresponds to 4 vCPUs and 32 GB of memory.
+ One DPU corresponds to 4 vCPUs and 16 GB of memory. DPUs are used to account for resources in AWS Glue with Spark jobs and corresponding workers.

Ray jobs currently have access to one worker type, `Z.2X`. The `Z.2X` worker maps to 2 M-DPUs (8 vCPUs, 64 GB of memory) and has 128 GB of disk space. A `Z.2X` machine provides 8 Ray workers (one per vCPU).
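
The arithmetic above can be sketched as a small helper. This is an illustrative calculation only: the function and class names are hypothetical, and it treats every node identically without modeling any overhead on the head node.

```python
from dataclasses import dataclass

# Figures taken from this section.
M_DPU_VCPUS = 4        # vCPUs per M-DPU
M_DPU_MEMORY_GB = 32   # GB of memory per M-DPU
Z2X_M_DPUS = 2         # M-DPUs per Z.2X worker node


@dataclass(frozen=True)
class RayCapacity:
    m_dpus: int
    vcpus: int
    memory_gb: int
    ray_workers: int


def z2x_capacity(node_count: int) -> RayCapacity:
    """Total resources provided by node_count Z.2X worker nodes."""
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    m_dpus = node_count * Z2X_M_DPUS
    vcpus = m_dpus * M_DPU_VCPUS
    memory_gb = m_dpus * M_DPU_MEMORY_GB
    # A Z.2X machine provides one Ray worker per vCPU.
    return RayCapacity(m_dpus, vcpus, memory_gb, ray_workers=vcpus)


capacity = z2x_capacity(5)
print(capacity)
```

For five `Z.2X` nodes, this yields 10 M-DPUs, 40 vCPUs, 320 GB of memory, and 40 Ray workers, which is the figure to compare against your M-DPU service quota.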

The number of M-DPUs that you can use concurrently in an account is subject to a service quota. For more information about your AWS Glue account limits, see [AWS Glue endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/glue.html).

You specify the number of worker nodes that are available to a Ray job with `--number-of-workers (NumberOfWorkers)` in the job definition. For more information about Ray values in the Jobs API, see [Jobs](aws-glue-api-jobs-job.md).

You can further specify a minimum number of workers that a Ray job must allocate with the `--min-workers` job parameter. For more information about job parameters, see [Reference](author-job-ray-job-parameters.md#author-job-ray-parameters-reference). 

# Using job parameters in Ray jobs
<a name="author-job-ray-job-parameters"></a>

You set arguments for AWS Glue Ray jobs the same way you set arguments for AWS Glue for Spark jobs. For more information about the AWS Glue API, see [Jobs](aws-glue-api-jobs-job.md). You can configure AWS Glue Ray jobs with different arguments, which are listed in this reference. You can also provide your own arguments. 

You can configure a job through the console, on the **Job details** tab, under the **Job Parameters** heading. You can also configure a job through the AWS CLI by setting `DefaultArguments` on a job, or setting `Arguments` on a job run. Default arguments and job parameters stay with the job through multiple runs. 

For example, the following is the syntax for running a job using `--arguments` to set a special parameter.

```
$ aws glue start-job-run --job-name "CSV to CSV" --arguments='--scriptLocation="s3://my_glue/libraries/test_lib.py",--test-environment="true"'
```

After you set the arguments, you can access job parameters from within your Ray job through environment variables. This gives you a way to configure your job for each run. The name of the environment variable will be the job argument name without the `--` prefix. 

For instance, in the previous example, the variable names would be `scriptLocation` and `test-environment`. You would then retrieve the argument through methods available in the standard library: `test_environment = os.environ.get('test-environment')`. For more information about accessing environment variables with Python, see [os module](https://docs.python.org/3/library/os.html) in the Python documentation.
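
Because environment variables are always strings, a script usually needs to convert them before use. The following is a minimal sketch of that pattern; the helper names are hypothetical, and the line that sets `test-environment` only simulates what a job run would provide.

```python
import os


def get_job_argument(name, default=None):
    """Read a job argument by its name without the leading '--'."""
    return os.environ.get(name, default)


def get_bool_job_argument(name, default=False):
    """Interpret a job argument such as 'true' or 'false' as a boolean."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("true", "1", "yes")


# Simulate the value a job run would supply; omit this line in a real job.
os.environ["test-environment"] = "true"

test_environment = get_bool_job_argument("test-environment")
print(test_environment)
```

Supplying a default keeps the script usable when an argument is not set for a particular run.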

## Configure how Ray jobs generate logs
<a name="author-job-ray-logging-configuration"></a>

By default, Ray jobs generate logs and metrics that are sent to CloudWatch and Amazon S3. You can use the `--logging_configuration` parameter to alter how logs are generated. Currently, you can use it to stop Ray jobs from generating various types of logs. This parameter takes a JSON object whose keys correspond to the logs or behaviors that you would like to alter. It supports the following keys:
+ `CLOUDWATCH_METRICS` – Configures CloudWatch metrics series that can be used to visualize job health. For more information about metrics, see [Monitoring Ray jobs with metrics](author-job-ray-monitor.md).
+ `CLOUDWATCH_LOGS` – Configures CloudWatch logs that provide Ray application level details about the status of the job run. For more information about logs, see [Troubleshooting AWS Glue for Ray errors from logs](troubleshooting-ray.md).
+ `S3` – Configures what AWS Glue writes to Amazon S3, primarily similar information to CloudWatch logs but as files rather than log streams.

To disable a Ray logging behavior, provide the value `{\"IS_ENABLED\": \"False\"}`. For example, to disable CloudWatch metrics and CloudWatch logs, provide the following configuration:

```
"--logging_configuration": "{\"CLOUDWATCH_METRICS\": {\"IS_ENABLED\": \"False\"}, \"CLOUDWATCH_LOGS\": {\"IS_ENABLED\": \"False\"}}"
```
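
Escaping the JSON by hand is error-prone, so one option is to generate the string in code. The following sketch uses the standard `json` module to produce the same value as the example above; the helper name is hypothetical.

```python
import json

SUPPORTED_KEYS = ("CLOUDWATCH_METRICS", "CLOUDWATCH_LOGS", "S3")


def build_logging_configuration(disabled):
    """Return the string value for --logging_configuration that disables the given keys."""
    unknown = [key for key in disabled if key not in SUPPORTED_KEYS]
    if unknown:
        raise ValueError(f"Unsupported logging keys: {unknown}")
    config = {key: {"IS_ENABLED": "False"} for key in disabled}
    # json.dumps produces the JSON text; the SDK or CLI layer handles any
    # further string escaping when this is embedded in a request.
    return json.dumps(config)


default_arguments = {
    "--logging_configuration": build_logging_configuration(
        ["CLOUDWATCH_METRICS", "CLOUDWATCH_LOGS"]
    )
}
print(default_arguments["--logging_configuration"])
```

The resulting dictionary has the shape expected for `DefaultArguments` on a job or `Arguments` on a job run.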

## Reference
<a name="author-job-ray-parameters-reference"></a>

 Ray jobs recognize the following argument names that you can use to set up the script environment for your Ray jobs and job runs:
+ `--logging_configuration` – Used to stop the generation of various logs created by Ray jobs. These logs are generated by default on all Ray jobs. Format: String-escaped JSON object. For more information, see [Configure how Ray jobs generate logs](#author-job-ray-logging-configuration).
+ `--min-workers` – The minimum number of worker nodes that are allocated to a Ray job. A worker node can run multiple replicas, one per virtual CPU. Format: integer. Minimum: 0. Maximum: value specified in `--number-of-workers (NumberOfWorkers)` on the job definition. For more information about accounting for worker nodes, see [Accounting for workers in Ray jobs](ray-jobs-section.md#author-job-ray-worker-accounting).
+ `--object_spilling_config` – AWS Glue for Ray supports using Amazon S3 as a way of extending the space available to Ray's object store. To enable this behavior, you can provide Ray an *object spilling* JSON config object with this parameter. For more information about Ray object spilling configuration, see [Object Spilling](https://docs.ray.io/en/latest/ray-core/objects/object-spilling.html) in the Ray documentation. Format: JSON object.

  AWS Glue for Ray supports spilling either to disk or to Amazon S3, but not to both at once. You can provide multiple locations for spilling, as long as they respect this limitation. When spilling to Amazon S3, you also need to add IAM permissions for this bucket to your job.

  When providing a JSON object as configuration with the CLI, you must provide it as a string, with the JSON object string-escaped. For example, a string value for spilling to one Amazon S3 path would look like: `"{\"type\": \"smart_open\", \"params\": {\"uri\":\"s3path\"}}"`. In AWS Glue Studio, provide this parameter as a JSON object with no extra formatting. 
+ `--object_store_memory_head` – The memory allocated to the Plasma object store on the Ray head node. This instance runs cluster management services, as well as worker replicas. The value represents a percentage of free memory on the instance after a warm start. You use this parameter to tune memory-intensive workloads; the defaults are acceptable for most use cases. Format: positive integer. Minimum: 1. Maximum: 100.

  For more information about Plasma, see [The Plasma In-Memory Object Store](https://ray-project.github.io/2017/08/08/plasma-in-memory-object-store.html) in the Ray documentation.
+ `--object_store_memory_worker` – The memory allocated to the Plasma object store on the Ray worker nodes. These instances only run worker replicas. The value represents a percentage of free memory on the instance after a warm start. You use this parameter to tune memory-intensive workloads; the defaults are acceptable for most use cases. Format: positive integer. Minimum: 1. Maximum: 100.

  For more information about Plasma, see [The Plasma In-Memory Object Store](https://ray-project.github.io/2017/08/08/plasma-in-memory-object-store.html) in the Ray documentation.
+ `--pip-install` – A set of Python packages to be installed. You can install packages from PyPI using this argument. Format: comma-delimited list.

  A PyPI package entry is in the format `package==version`, with the PyPI name and version of your target package. Entries use Python version-matching syntax, so an exact match is written with the double equals `==`, not a single equals `=`. Other version-matching operators are also available. For more information, see [PEP 440](https://peps.python.org/pep-0440/#version-matching) on the Python website. You can also provide custom modules with `--s3-py-modules`. 
+ `--s3-py-modules` – A set of Amazon S3 paths that host Python module distributions. Format: comma-delimited list.

  You can use this to distribute your own modules to your Ray job. You can also provide modules from PyPI with `--pip-install`. Unlike with AWS Glue ETL, custom modules are not set up through pip, but are passed to Ray for distribution. For more information, see [Additional Python modules for Ray jobs](edit-script-ray-env-dependencies.md#edit-script-ray-python-libraries-additional).
+ `--working-dir` – A path to a .zip file hosted in Amazon S3 that contains files to be distributed to all nodes running your Ray job. Format: string. For more information, see [Providing files to your Ray job](edit-script-ray-env-dependencies.md#edit-script-ray-working-directory).
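
To show how several of these arguments might be assembled together, the following sketch builds an arguments dictionary in code. The helper name, Amazon S3 URI, and package names are hypothetical; the spilling configuration mirrors the `smart_open` example given for `--object_spilling_config`.

```python
import json


def build_ray_arguments(spill_uri, pip_packages, min_workers):
    """Assemble string-valued job arguments for a Ray job."""
    spilling_config = {"type": "smart_open", "params": {"uri": spill_uri}}
    return {
        # JSON object serialized to a string.
        "--object_spilling_config": json.dumps(spilling_config),
        # Comma-delimited list of package==version entries.
        "--pip-install": ",".join(
            f"{name}=={version}" for name, version in pip_packages.items()
        ),
        # Argument values are passed as strings, including integers.
        "--min-workers": str(min_workers),
    }


# Hypothetical bucket and package names, for illustration only.
arguments = build_ray_arguments(
    spill_uri="s3://amzn-s3-demo-bucket/ray-spill/",
    pip_packages={"example-package": "1.2.0", "another-example": "0.4.1"},
    min_workers=2,
)
for key, value in arguments.items():
    print(key, value)
```

Keeping the values as plain strings matches how the job receives them, and serializing with `json.dumps` avoids hand-written escaping mistakes.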

# Monitoring Ray jobs with metrics
<a name="author-job-ray-monitor"></a>

You can monitor Ray jobs using AWS Glue Studio and Amazon CloudWatch. CloudWatch collects and processes raw metrics from AWS Glue with Ray, which makes them available for analysis. These metrics are visualized in the AWS Glue Studio console, so you can monitor your job as it runs.

For a general overview of how to monitor AWS Glue, see [Monitoring AWS Glue using Amazon CloudWatch metrics](monitoring-awsglue-with-cloudwatch-metrics.md). For a general overview of how to use CloudWatch metrics that are published by AWS Glue, see [Monitoring with Amazon CloudWatch](monitor-cloudwatch.md).

## Monitoring Ray jobs in the AWS Glue console
<a name="author-job-ray-monitor-console"></a>

On the details page for a job run, below the **Run details** section, you can view pre-built aggregated graphs that visualize your available job metrics. AWS Glue Studio sends job metrics to CloudWatch for every job run. With these, you can build a profile of your cluster and tasks, as well as access detailed information about each node.

For more information about available metrics graphs, see [Viewing Amazon CloudWatch metrics for a Ray job run](view-job-runs.md#monitoring-job-run-metrics-ray).

## Overview of Ray jobs metrics in CloudWatch
<a name="author-job-ray-monitor-cw"></a>

We publish Ray metrics when detailed monitoring is enabled in CloudWatch. Metrics are published to the `Glue/Ray` CloudWatch namespace.
+ **Instance metrics**

  We publish metrics about the CPU, memory, and disk utilization of instances assigned to a job. These metrics are identified by features such as `ExecutorId`, `ExecutorType`, and `host`. These metrics are a subset of the standard Linux CloudWatch agent metrics. You can find information about metric names and features in the CloudWatch documentation. For more information, see [Metrics collected by the CloudWatch agent](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html).
+ **Ray cluster metrics**

  We forward metrics from the Ray processes that run your script to this namespace, and surface the ones that are most critical for monitoring your job. The metrics that are available might differ by Ray version. For more information about which Ray version your job is running, see [AWS Glue versions](release-notes.md). 

  Ray collects metrics at the instance level. It also provides metrics for tasks and the cluster. For more information about Ray's underlying metric strategy, see [Metrics](https://docs.ray.io/en/latest/ray-observability/ray-metrics.html#system-metrics) in the Ray documentation.

**Note**  
 We don't publish Ray metrics to the `Glue/Job Metrics/` namespace, which is only used for AWS Glue ETL jobs.
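
One way to see which metrics are available for your jobs is to list what has been published to the `Glue/Ray` namespace. The following is a minimal sketch that assumes a boto3 CloudWatch client is passed in; the function name is hypothetical, and the metric names that come back depend on your Ray version and job runs.

```python
def list_ray_metric_names(cloudwatch_client, namespace="Glue/Ray"):
    """Return the sorted, de-duplicated metric names published in a namespace."""
    names = set()
    # list_metrics is paginated, so walk every page of results.
    paginator = cloudwatch_client.get_paginator("list_metrics")
    for page in paginator.paginate(Namespace=namespace):
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
    return sorted(names)


# Example usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   print(list_ray_metric_names(boto3.client("cloudwatch")))
```

Passing the client in as a parameter keeps the function easy to exercise with a stub and lets you control the Region and credentials at the call site.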