

# Use job attachments to share files
<a name="build-job-attachments"></a>

Use *job attachments* to make files not in shared directories available for your jobs, and to capture the output files if they are not written to shared directories. Job attachments uses Amazon S3 to shuttle files between hosts. Files are stored in S3 buckets, and you don't need to upload a file if its content hasn't changed.

You must use job attachments when running jobs on [service-managed fleets](https://docs.aws.amazon.com/deadline-cloud/latest/userguide/smf-manage.html) because hosts don't share file system locations. Job attachments are also useful with [customer-managed fleets](https://docs.aws.amazon.com/deadline-cloud/latest/userguide/manage-cmf.html) when a job's input or output files are not stored on a shared network file system, such as when your [job bundle](https://docs.aws.amazon.com/deadline-cloud/latest/userguide/submit-job-bundle.html) contains shell or Python scripts. 

 When you submit a job bundle with either the [Deadline Cloud CLI](https://pypi.org/project/deadline/) or a Deadline Cloud submitter, job attachments use the job’s storage profile and the queue’s required file system locations to identify the input files that are not on a worker host and should be uploaded to Amazon S3 as part of job submission. These storage profiles also help Deadline Cloud identify the output files in worker host locations that must be uploaded to Amazon S3 so that they are available to your workstation. 

 The job attachments examples use the farm, fleet, queues, and storage profiles configurations from [Sample project infrastructure](sample-project-infrastructure.md) and [Storage profiles and path mapping](storage-profiles-and-path-mapping.md). You should go through those sections before this one. 

In the following examples, you use a sample job bundle as a starting point, then modify it to explore job attachments' functionality. Job bundles are the best way for your jobs to use job attachments. They combine an [Open Job Description](https://github.com/OpenJobDescription/openjd-specifications/wiki) job template in a directory with additional files that list the files and directories the job requires. For more information about job bundles, see [Open Job Description (OpenJD) templates for Deadline Cloud](build-job-bundle.md).

# Submitting files with a job
<a name="submitting-files-with-a-job"></a>

With Deadline Cloud, you can enable job workflows to access input files that are unavailable in shared file system locations on worker hosts. Job attachments allow rendering jobs to access files residing only on a local workstation drive or a service-managed fleet environment. When submitting a job bundle, you can include lists of input files and directories required by the job. Deadline Cloud identifies these non-shared files, uploads them from the local machine to Amazon S3, and downloads them to the worker host. It streamlines the process of transferring input assets to render nodes, ensuring all required files are accessible for distributed job execution.

You can specify the files for jobs directly in the job bundle, with job template parameters whose values you provide through environment variables or a script, or with the job bundle's `asset_references` file. You can use one of these methods or a combination of all three. You can also specify a storage profile for the job so that only the files that have changed on the local workstation are uploaded.

This section uses an example job bundle from GitHub to demonstrate how Deadline Cloud identifies the files in your job to upload, how those files are organized in Amazon S3, and how they are made available to the worker hosts processing your jobs. 

**Topics**
+ [How Deadline Cloud uploads files to Amazon S3](what-job-attachments-uploads-to-amazon-s3.md)
+ [How Deadline Cloud chooses the files to upload](how-job-attachments-decides-what-to-upload-to-amazon-s3.md)
+ [How jobs find job attachment input files](how-jobs-find-job-attachments-input-files.md)

# How Deadline Cloud uploads files to Amazon S3
<a name="what-job-attachments-uploads-to-amazon-s3"></a>

This example shows how Deadline Cloud uploads files from your workstation or worker host to Amazon S3 so that they can be shared. It uses a sample job bundle from GitHub and the Deadline Cloud CLI to submit jobs.

 Start by cloning the [Deadline Cloud samples GitHub repository](https://github.com/aws-deadline/deadline-cloud-samples) into your [AWS CloudShell](https://docs.aws.amazon.com/cloudshell/latest/userguide/welcome.html) environment, then copy the `job_attachments_devguide` job bundle into your home directory: 

```
git clone https://github.com/aws-deadline/deadline-cloud-samples.git
cp -r deadline-cloud-samples/job_bundles/job_attachments_devguide ~/
```

 Install the [Deadline Cloud CLI](https://pypi.org/project/deadline/) to submit job bundles: 

```
pip install deadline --upgrade
```

 The `job_attachments_devguide` job bundle has a single step with a task that runs a bash shell script whose file system location is passed as a job parameter. The job parameter’s definition is: 

```
...
- name: ScriptFile
  type: PATH
  default: script.sh
  dataFlow: IN
  objectType: FILE
...
```

 The `dataFlow` property's `IN` value tells job attachments that the value of the `ScriptFile` parameter is an input to the job. The value of the `default` property is a location relative to the job bundle's directory, but it can also be an absolute path. This parameter definition declares the `script.sh` file in the job bundle's directory as an input file required for the job to run. 

 Next, make sure that the Deadline Cloud CLI does not have a storage profile configured, and then submit the job to queue `Q1`: 

```
# Change the value of FARM_ID to your farm's identifier
FARM_ID=farm-00112233445566778899aabbccddeeff
# Change the value of QUEUE1_ID to queue Q1's identifier
QUEUE1_ID=queue-00112233445566778899aabbccddeeff

deadline config set settings.storage_profile_id ''

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID job_attachments_devguide/
```

 The output from the Deadline Cloud CLI after this command is run looks like: 

```
Submitting to Queue: Q1
...
Hashing Attachments  [####################################]  100%
Hashing Summary:
    Processed 1 file totaling 39.0 B.
    Skipped re-processing 0 files totaling 0.0 B.
    Total processing time of 0.0327 seconds at 1.19 KB/s.

Uploading Attachments  [####################################]  100%
Upload Summary:
    Processed 1 file totaling 39.0 B.
    Skipped re-processing 0 files totaling 0.0 B.
    Total processing time of 0.25639 seconds at 152.0 B/s.

Waiting for Job to be created...
Submitted job bundle:
   job_attachments_devguide/
Job creation completed successfully
job-74148c13342e4514b63c7a7518657005
```

When you submit the job, Deadline Cloud first hashes the `script.sh` file and then uploads it to Amazon S3. 

Deadline Cloud treats the S3 bucket as content-addressable storage. Files are uploaded to S3 objects whose names are derived from a hash of the file's contents. If two files have identical contents, they have the same hash value regardless of where the files are located or what they are named. This content-addressable storage enables Deadline Cloud to skip uploading a file whose contents are already in the bucket.
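To make the naming scheme concrete, the following sketch builds content-addressed object keys for two files. It uses `md5sum` purely as a stand-in hash so that it runs anywhere (Deadline Cloud itself uses xxh128); the key layout imitates the `DeadlineCloud/Data/<hash>.<alg>` pattern:

```shell
# Sketch only: Deadline Cloud hashes with xxh128; md5sum stands in here
# so the example runs anywhere. The key layout imitates DeadlineCloud/Data/.
demo_dir=$(mktemp -d)
echo "hello" > "$demo_dir/a.txt"
mkdir -p "$demo_dir/sub"
echo "hello" > "$demo_dir/sub/b.txt"    # same contents, different name and location

key_a="DeadlineCloud/Data/$(md5sum "$demo_dir/a.txt" | cut -d' ' -f1).md5"
key_b="DeadlineCloud/Data/$(md5sum "$demo_dir/sub/b.txt" | cut -d' ' -f1).md5"

echo "$key_a"
echo "$key_b"

# The keys are identical, so the second file's contents are never uploaded twice.
[ "$key_a" = "$key_b" ] && echo "same contents -> same object key"
```

Because the object key depends only on file contents, renaming or moving a file on your workstation does not by itself trigger a re-upload.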

 You can use the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) to see the objects that were uploaded to Amazon S3: 

```
# The name of queue `Q1`'s job attachments S3 bucket
Q1_S3_BUCKET=$(
  aws deadline get-queue --farm-id $FARM_ID --queue-id $QUEUE1_ID \
    --query 'jobAttachmentSettings.s3BucketName' | tr -d '"'
)

aws s3 ls s3://$Q1_S3_BUCKET --recursive
```

 Two objects were uploaded to S3: 
+  `DeadlineCloud/Data/87cb19095dd5d78fcaf56384ef0e6241.xxh128` – The contents of `script.sh`. The value `87cb19095dd5d78fcaf56384ef0e6241` in the object key is the hash of the file's contents, and the extension `xxh128` indicates that the hash value was calculated as a 128-bit [xxhash](https://xxhash.com/). 
+  `DeadlineCloud/Manifests/<farm-id>/<queue-id>/Inputs/<guid>/a1d221c7fd97b08175b3872a37428e8c_input` – The manifest object for the job submission. The values `<farm-id>`, `<queue-id>`, and `<guid>` are your farm identifier, queue identifier, and a random hexadecimal value. The value `a1d221c7fd97b08175b3872a37428e8c` in this example is a hash value calculated from the string `/home/cloudshell-user/job_attachments_devguide`, the directory where `script.sh` is located. 

 The manifest object describes the input files on a specific root path that were uploaded to S3 as part of the job's submission. Download this manifest file (for example, `aws s3 cp s3://$Q1_S3_BUCKET/<objectname> .`). Its contents are similar to: 

```
{
    "hashAlg": "xxh128",
    "manifestVersion": "2023-03-03",
    "paths": [
        {
            "hash": "87cb19095dd5d78fcaf56384ef0e6241",
            "mtime": 1721147454416085,
            "path": "script.sh",
            "size": 39
        }
    ],
    "totalSize": 39
}
```

This indicates that the file `script.sh` was uploaded, and that the hash of the file's contents is `87cb19095dd5d78fcaf56384ef0e6241`. This hash value matches the value in the object name `DeadlineCloud/Data/87cb19095dd5d78fcaf56384ef0e6241.xxh128`. Deadline Cloud uses it to determine which object to download for this file's contents.
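You can reproduce this lookup locally with `jq`. The following sketch writes a copy of the manifest shown above to a temporary file, then derives each entry's `DeadlineCloud/Data` object key from its `hash` and the manifest's `hashAlg`:

```shell
# Write a local copy of the manifest shown above (the file name is arbitrary).
cat > /tmp/manifest.json << 'EOF'
{
    "hashAlg": "xxh128",
    "manifestVersion": "2023-03-03",
    "paths": [
        {
            "hash": "87cb19095dd5d78fcaf56384ef0e6241",
            "mtime": 1721147454416085,
            "path": "script.sh",
            "size": 39
        }
    ],
    "totalSize": 39
}
EOF

# For each entry, build "DeadlineCloud/Data/<hash>.<hashAlg>" and show its path.
jq -r '.hashAlg as $alg
       | .paths[]
       | "DeadlineCloud/Data/\(.hash).\($alg)  \(.path)"' /tmp/manifest.json
```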

 The full schema for this file is [available in GitHub](https://github.com/aws-deadline/deadline-cloud/blob/mainline/src/deadline/job_attachments/asset_manifests/v2023_03_03/validate.py). 

When you use the [CreateJob operation](https://docs.aws.amazon.com/deadline-cloud/latest/APIReference/API_CreateJob.html), you set the location of the manifest objects. You can use the [GetJob operation](https://docs.aws.amazon.com/deadline-cloud/latest/APIReference/API_GetJob.html) to see the location: 

```
{
    "attachments": {
        "fileSystem": "COPIED",
        "manifests": [
            {
                "inputManifestHash": "5b0db3d311805ea8de7787b64cbbe8b3",
                "inputManifestPath": "<farm-id>/<queue-id>/Inputs/<guid>/a1d221c7fd97b08175b3872a37428e8c_input",
                "rootPath": "/home/cloudshell-user/job_attachments_devguide",
                "rootPathFormat": "posix"
            }
        ]
    },
    ...
}
```

# How Deadline Cloud chooses the files to upload
<a name="how-job-attachments-decides-what-to-upload-to-amazon-s3"></a>

 The files and directories that job attachments considers for upload to Amazon S3 as inputs to your job are: 
+  The values of all `PATH`-type job parameters defined in the job bundle’s job template with a `dataFlow` value of `IN` or `INOUT`.
+  The files and directories listed as inputs in the job bundle’s asset references file. 

 If you submit a job with no storage profile, all of the files considered for uploading are uploaded. If you submit a job with a storage profile, files are not uploaded to Amazon S3 if they are located in the storage profile’s `SHARED`-type file system locations that are also required file system locations for the queue. These locations are expected to be available on the worker hosts that run the job, so there is no need to upload them to S3. 
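As a rough sketch of that selection rule, the following shell function skips any candidate file under a location that is both `SHARED` in the storage profile and required by the queue. The two roots listed match this guide's example setup; the function itself is illustrative, not part of the Deadline Cloud CLI:

```shell
# Roots that are SHARED in the storage profile AND required by the queue.
# Files under these roots are expected to be reachable by the workers.
SHARED_AND_REQUIRED="/shared/common /shared/projects/project1"

needs_upload() {
  local f="$1" root
  for root in $SHARED_AND_REQUIRED; do
    case "$f" in
      "$root"/*) return 1 ;;   # shared and required: no upload needed
    esac
  done
  return 0                     # anywhere else: upload to Amazon S3
}

needs_upload /shared/common/file.txt            || echo "skip /shared/common/file.txt"
needs_upload /shared/projects/project2/file.txt && echo "upload /shared/projects/project2/file.txt"
```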

 In this example, you create the directories for the `WSAll` storage profile's `SHARED` file system locations in your AWS CloudShell environment, and then add files to those locations. Use the following command: 

```
sudo mkdir -p /shared/common /shared/projects/project1 /shared/projects/project2
sudo chown -R cloudshell-user:cloudshell-user /shared

for d in /shared/common /shared/projects/project1 /shared/projects/project2; do
  echo "File contents for $d" > ${d}/file.txt
done
```

 Next, add an asset references file to the job bundle that includes all the files that you created as inputs for the job. Use the following command: 

```
cat > ${HOME}/job_attachments_devguide/asset_references.yaml << EOF
assetReferences:
  inputs:
    filenames:
    - /shared/common/file.txt
    directories:
    - /shared/projects/project1
    - /shared/projects/project2
EOF
```

 Next, configure the Deadline Cloud CLI to submit jobs with the `WSAll` storage profile, and then submit the job bundle: 

```
# Change the value of FARM_ID to your farm's identifier
FARM_ID=farm-00112233445566778899aabbccddeeff
# Change the value of QUEUE1_ID to queue Q1's identifier
QUEUE1_ID=queue-00112233445566778899aabbccddeeff
# Change the value of WSALL_ID to the identifier of the WSAll storage profile
WSALL_ID=sp-00112233445566778899aabbccddeeff

deadline config set settings.storage_profile_id $WSALL_ID

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID job_attachments_devguide/
```

Deadline Cloud uploads two files to Amazon S3 when you submit the job. You can download the manifest objects for the job from S3 to see the uploaded files: 

```
# Change the value of JOB_ID to the job identifier printed by your job submission
JOB_ID=job-00112233445566778899aabbccddeeff
for manifest in $( \
  aws deadline get-job --farm-id $FARM_ID --queue-id $QUEUE1_ID --job-id $JOB_ID \
    --query 'attachments.manifests[].inputManifestPath' \
    | jq -r '.[]'
); do
  echo "Manifest object: $manifest"
  aws s3 cp --quiet s3://$Q1_S3_BUCKET/DeadlineCloud/Manifests/$manifest /dev/stdout | jq .
done
```

 In this example, there is a single manifest file with the following contents: 

```
{
    "hashAlg": "xxh128",
    "manifestVersion": "2023-03-03",
    "paths": [
        {
            "hash": "87cb19095dd5d78fcaf56384ef0e6241",
            "mtime": 1721147454416085,
            "path": "home/cloudshell-user/job_attachments_devguide/script.sh",
            "size": 39
        },
        {
            "hash": "af5a605a3a4e86ce7be7ac5237b51b79",
            "mtime": 1721163773582362,
            "path": "shared/projects/project2/file.txt",
            "size": 44
        }
    ],
    "totalSize": 83
}
```

 Use the [GetJob operation](https://docs.aws.amazon.com/deadline-cloud/latest/APIReference/API_GetJob.html) to see that the manifest's `rootPath` is `/`: 

```
aws deadline get-job --farm-id $FARM_ID --queue-id $QUEUE1_ID --job-id $JOB_ID --query 'attachments.manifests[*]'
```

 The root path for a set of input files is always the longest common subpath of those files. If the job is submitted from Windows and has input files with no common subpath because they are on different drives, each drive gets its own root path. The paths in a manifest are always relative to the manifest's root path, so the input files that were uploaded are: 
+  `/home/cloudshell-user/job_attachments_devguide/script.sh` – The script file in the job bundle. 
+  `/shared/projects/project2/file.txt` – The file in a `SHARED` file system location in the `WSAll` storage profile that is **not** in the list of required file system locations for queue `Q1`. 

The files in file system locations `FSCommon` (`/shared/common/file.txt`) and `FS1` (`/shared/projects/project1/file.txt`) are not in the list. This is because those file system locations are `SHARED` in the `WSAll` storage profile and they both are in the list of required file system locations in queue `Q1`. 
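The longest-common-subpath rule can be sketched as a small shell function. This helper is illustrative only (not part of the Deadline Cloud CLI); it reads newline-separated absolute file paths and reports their common root directory:

```shell
# Illustrative helper: compute the longest common directory of input files.
common_root() {
  local prefix p
  read -r p
  prefix=$(dirname "$p")
  while read -r p; do
    # Shorten the prefix until it is an ancestor of the next path.
    while [ "${p#"$prefix"/}" = "$p" ] && [ "$prefix" != "/" ]; do
      prefix=$(dirname "$prefix")
    done
  done
  echo "$prefix"
}

# The two uploaded files share no directory other than "/", so the root is "/".
printf '%s\n' \
  /home/cloudshell-user/job_attachments_devguide/script.sh \
  /shared/projects/project2/file.txt | common_root
```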

You can see the file system locations considered `SHARED` for a job submitted with a particular storage profile by using the [GetStorageProfileForQueue operation](https://docs.aws.amazon.com/deadline-cloud/latest/APIReference/API_GetStorageProfileForQueue.html). To query the storage profile `WSAll` and its configuration for queue `Q1`, use the following commands: 

```
aws deadline get-storage-profile --farm-id $FARM_ID --storage-profile-id $WSALL_ID

aws deadline get-storage-profile-for-queue --farm-id $FARM_ID --queue-id $QUEUE1_ID --storage-profile-id $WSALL_ID
```

# How jobs find job attachment input files
<a name="how-jobs-find-job-attachments-input-files"></a>

 For a job to use the files that Deadline Cloud uploads to Amazon S3 using job attachments, your job needs those files available through the file system on the worker hosts. When a [session](https://github.com/OpenJobDescription/openjd-specifications/wiki/How-Jobs-Are-Run#sessions) for your job runs on a worker host, Deadline Cloud downloads the input files for the job into a temporary directory on the worker host’s local drive and adds path mapping rules for each of the job’s root paths to its file system location on the local drive. 

 For this example, start the Deadline Cloud worker agent in an AWS CloudShell tab. Let any previously submitted jobs finish running, and then delete the job logs from the logs directory: 

```
rm -rf ~/demoenv-logs/queue-*
```

 The following script modifies the job bundle to show all files in the session’s temporary working directory and the contents of the path mapping rules file, and then submits a job with the modified bundle: 

```
# Change the value of FARM_ID to your farm's identifier
FARM_ID=farm-00112233445566778899aabbccddeeff
# Change the value of QUEUE1_ID to queue Q1's identifier
QUEUE1_ID=queue-00112233445566778899aabbccddeeff
# Change the value of WSALL_ID to the identifier of the WSAll storage profile
WSALL_ID=sp-00112233445566778899aabbccddeeff

deadline config set settings.storage_profile_id $WSALL_ID

cat > ~/job_attachments_devguide/script.sh << EOF
#!/bin/bash

echo "Session working directory is: \$(pwd)"
echo
echo "Contents:"
find . -type f
echo
echo "Path mapping rules file: \$1"
jq . \$1
EOF

cat > ~/job_attachments_devguide/template.yaml << EOF
specificationVersion: jobtemplate-2023-09
name: "Job Attachments Explorer"
parameterDefinitions:
- name: ScriptFile
  type: PATH
  default: script.sh
  dataFlow: IN
  objectType: FILE
steps:
- name: Step
  script:
    actions:
      onRun:
        command: /bin/bash
        args:
        - "{{Param.ScriptFile}}"
        - "{{Session.PathMappingRulesFile}}"
EOF

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID job_attachments_devguide/
```

 After the worker in your AWS CloudShell environment runs the job, look at the session log: 

```
cat demoenv-logs/queue-*/session*.log
```

The log shows that the first thing that occurs in the session is downloading the job's two input files to the worker: 

```
2024-07-17 01:26:37,824 INFO ==============================================
2024-07-17 01:26:37,825 INFO --------- Job Attachments Download for Job
2024-07-17 01:26:37,825 INFO ==============================================
2024-07-17 01:26:37,825 INFO Syncing inputs using Job Attachments
2024-07-17 01:26:38,116 INFO Downloaded 142.0 B / 186.0 B of 2 files (Transfer rate: 0.0 B/s)
2024-07-17 01:26:38,174 INFO Downloaded 186.0 B / 186.0 B of 2 files (Transfer rate: 733.0 B/s)
2024-07-17 01:26:38,176 INFO Summary Statistics for file downloads:
Processed 2 files totaling 186.0 B.
Skipped re-processing 0 files totaling 0.0 B.
Total processing time of 0.09752 seconds at 1.91 KB/s.
```

 Next is the output from `script.sh` run by the job: 
+  The input files uploaded when the job was submitted are located under a directory whose name begins with "assetroot" in the session’s temporary directory. 
+  The input files’ paths have been relocated relative to the "assetroot" directory instead of relative to the root path for the job’s input manifest (`"/"`).
+  The path mapping rules file contains an additional rule that remaps `"/"` to the absolute path of the "assetroot" directory. 

 For example: 

```
2024-07-17 01:26:38,264 INFO Output:
2024-07-17 01:26:38,267 INFO Session working directory is: /sessions/session-5b33f
2024-07-17 01:26:38,267 INFO 
2024-07-17 01:26:38,267 INFO Contents:
2024-07-17 01:26:38,269 INFO ./tmp_xdhbsdo.sh
2024-07-17 01:26:38,269 INFO ./tmpdi00052b.json
2024-07-17 01:26:38,269 INFO ./assetroot-assetroot-3751a/shared/projects/project2/file.txt
2024-07-17 01:26:38,269 INFO ./assetroot-assetroot-3751a/home/cloudshell-user/job_attachments_devguide/script.sh
2024-07-17 01:26:38,269 INFO 
2024-07-17 01:26:38,270 INFO Path mapping rules file: /sessions/session-5b33f/tmpdi00052b.json
2024-07-17 01:26:38,282 INFO {
2024-07-17 01:26:38,282 INFO   "version": "pathmapping-1.0",
2024-07-17 01:26:38,282 INFO   "path_mapping_rules": [
2024-07-17 01:26:38,282 INFO     {
2024-07-17 01:26:38,282 INFO       "source_path_format": "POSIX",
2024-07-17 01:26:38,282 INFO       "source_path": "/shared/projects/project1",
2024-07-17 01:26:38,283 INFO       "destination_path": "/mnt/projects/project1"
2024-07-17 01:26:38,283 INFO     },
2024-07-17 01:26:38,283 INFO     {
2024-07-17 01:26:38,283 INFO       "source_path_format": "POSIX",
2024-07-17 01:26:38,283 INFO       "source_path": "/shared/common",
2024-07-17 01:26:38,283 INFO       "destination_path": "/mnt/common"
2024-07-17 01:26:38,283 INFO     },
2024-07-17 01:26:38,283 INFO     {
2024-07-17 01:26:38,283 INFO       "source_path_format": "POSIX",
2024-07-17 01:26:38,283 INFO       "source_path": "/",
2024-07-17 01:26:38,283 INFO       "destination_path": "/sessions/session-5b33f/assetroot-assetroot-3751a"
2024-07-17 01:26:38,283 INFO     }
2024-07-17 01:26:38,283 INFO   ]
2024-07-17 01:26:38,283 INFO }
```

**Note**  
 If the job you submit has multiple manifests with different root paths, there is a different "assetroot"-named directory for each of the root paths. 

 If you need to reference the relocated file system location of one of your input files or directories, you can either process the path mapping rules file in your job and perform the remapping yourself, or add a `PATH`-type job parameter to the job template in your job bundle and pass the location that you need to remap as the value of that parameter. For example, the following modifies the job bundle to add such a job parameter, and then submits a job with the file system location `/shared/projects/project2` as its value: 

```
cat > ~/job_attachments_devguide/template.yaml << EOF
specificationVersion: jobtemplate-2023-09
name: "Job Attachments Explorer"
parameterDefinitions:
- name: LocationToRemap
  type: PATH
steps:
- name: Step
  script:
    actions:
      onRun:
        command: /bin/echo
        args:
        - "The location of {{RawParam.LocationToRemap}} in the session is {{Param.LocationToRemap}}"
EOF

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID job_attachments_devguide/ \
  -p LocationToRemap=/shared/projects/project2
```

 The log file for this job’s run contains its output: 

```
2024-07-17 01:40:35,283 INFO Output:
2024-07-17 01:40:35,284 INFO The location of /shared/projects/project2 in the session is /sessions/session-5b33f/assetroot-assetroot-3751a
```

# Getting output files from a job
<a name="getting-output-files-from-a-job"></a>

This example shows how Deadline Cloud identifies the output files that your jobs generate, how it decides whether to upload those files to Amazon S3, and how you can get the output files on your workstation. 

 Use the `job_attachments_devguide_output` job bundle instead of the `job_attachments_devguide` job bundle for this example. Start by making a copy of the bundle in your AWS CloudShell environment from your clone of the Deadline Cloud samples GitHub repository: 

```
cp -r deadline-cloud-samples/job_bundles/job_attachments_devguide_output ~/
```

 The important difference between this job bundle and the `job_attachments_devguide` job bundle is the addition of a new job parameter in the job template: 

```
...
parameterDefinitions:
...
- name: OutputDir
  type: PATH
  objectType: DIRECTORY
  dataFlow: OUT
  default: ./output_dir
  description: This directory contains the output for all steps.
...
```

 The `dataFlow` property of the parameter has the value `OUT`. Deadline Cloud treats the values of job parameters whose `dataFlow` is `OUT` or `INOUT` as outputs of your job. If a file system location passed as the value of one of these job parameters is remapped to a local file system location on the worker that runs the job, then Deadline Cloud looks for new files at that location and uploads them to Amazon S3 as job outputs. 
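The detection idea can be sketched as a directory snapshot: record the files that exist before the task runs, then treat anything new afterwards as a job output. This is a simplification of what job attachments does, shown only to illustrate the flow:

```shell
# Simplified sketch: new files in an OUT-mapped directory become job outputs.
OUT_DIR=$(mktemp -d)               # stands in for the remapped OutputDir
before=$(mktemp); after=$(mktemp)

find "$OUT_DIR" -type f | sort > "$before"    # state before the task runs

echo "rendered frame" > "$OUT_DIR/output.txt" # the task writes its output

find "$OUT_DIR" -type f | sort > "$after"     # state after the task runs

# Files only present afterwards are the upload candidates.
comm -13 "$before" "$after"
```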

 To see how this works, first start the Deadline Cloud worker agent in an AWS CloudShell tab. Let any previously submitted jobs finish running. Then delete the job logs from the logs directory: 

```
rm -rf ~/demoenv-logs/queue-*
```

 Next, submit a job with this job bundle. After the worker running in your CloudShell environment processes it, look at the logs: 

```
# Change the value of FARM_ID to your farm's identifier
FARM_ID=farm-00112233445566778899aabbccddeeff
# Change the value of QUEUE1_ID to queue Q1's identifier
QUEUE1_ID=queue-00112233445566778899aabbccddeeff
# Change the value of WSALL_ID to the identifier of the WSAll storage profile
WSALL_ID=sp-00112233445566778899aabbccddeeff

deadline config set settings.storage_profile_id $WSALL_ID

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID ./job_attachments_devguide_output
```

 The log shows that a file was detected as output and uploaded to Amazon S3: 

```
2024-07-17 02:13:10,873 INFO ----------------------------------------------
2024-07-17 02:13:10,873 INFO Uploading output files to Job Attachments
2024-07-17 02:13:10,873 INFO ----------------------------------------------
2024-07-17 02:13:10,873 INFO Started syncing outputs using Job Attachments
2024-07-17 02:13:10,955 INFO Found 1 file totaling 117.0 B in output directory: /sessions/session-7efa/assetroot-assetroot-3751a/output_dir
2024-07-17 02:13:10,956 INFO Uploading output manifest to DeadlineCloud/Manifests/farm-0011/queue-2233/job-4455/step-6677/task-6677-0/2024-07-17T02:13:10.835545Z_sessionaction-8899-1/c6808439dfc59f86763aff5b07b9a76c_output
2024-07-17 02:13:10,988 INFO Uploading 1 output file to S3: s3BucketName/DeadlineCloud/Data
2024-07-17 02:13:11,011 INFO Uploaded 117.0 B / 117.0 B of 1 file (Transfer rate: 0.0 B/s)
2024-07-17 02:13:11,011 INFO Summary Statistics for file uploads:
Processed 1 file totaling 117.0 B.
Skipped re-processing 0 files totaling 0.0 B.
Total processing time of 0.02281 seconds at 5.13 KB/s.
```

 The log also shows that Deadline Cloud created a new manifest object in the Amazon S3 bucket configured for use by job attachments on queue `Q1`. The name of the manifest object is derived from the farm, queue, job, step, task, timestamp, and `sessionaction` identifiers of the task that generated the output. Download this manifest file to see where Deadline Cloud placed the output files for this task: 

```
# The name of queue `Q1`'s job attachments S3 bucket
Q1_S3_BUCKET=$(
  aws deadline get-queue --farm-id $FARM_ID --queue-id $QUEUE1_ID \
    --query 'jobAttachmentSettings.s3BucketName' | tr -d '"'
)

# Fill this in with the object name from your log
OBJECT_KEY="DeadlineCloud/Manifests/..."

aws s3 cp --quiet s3://$Q1_S3_BUCKET/$OBJECT_KEY /dev/stdout | jq .
```

 The manifest looks like: 

```
{
  "hashAlg": "xxh128",
  "manifestVersion": "2023-03-03",
  "paths": [
    {
      "hash": "34178940e1ef9956db8ea7f7c97ed842",
      "mtime": 1721182390859777,
      "path": "output_dir/output.txt",
      "size": 117
    }
  ],
  "totalSize": 117
}
```

 This shows that the content of the output file is saved to Amazon S3 the same way that job input files are saved. Similar to input files, the output file is stored in S3 with an object name containing the hash of the file and the prefix `DeadlineCloud/Data`. 

```
$ aws s3 ls --recursive s3://$Q1_S3_BUCKET | grep 34178940e1ef9956db8ea7f7c97ed842
2024-07-17 02:13:11        117 DeadlineCloud/Data/34178940e1ef9956db8ea7f7c97ed842.xxh128
```

 You can download the output of a job to your workstation using the Deadline Cloud monitor or the Deadline Cloud CLI: 

```
# Change the value of JOB_ID to the identifier of your submitted job
JOB_ID=job-00112233445566778899aabbccddeeff
deadline job download-output --farm-id $FARM_ID --queue-id $QUEUE1_ID --job-id $JOB_ID
```

 The value of the `OutputDir` job parameter in the submitted job is `./output_dir`, so the outputs are downloaded to a directory called `output_dir` within the job bundle directory. If you specify an absolute path or a different relative location as the value of `OutputDir`, the output files are downloaded to that location instead. 

```
$ deadline job download-output --farm-id $FARM_ID --queue-id $QUEUE1_ID --job-id $JOB_ID
Downloading output from Job 'Job Attachments Explorer: Output'

Summary of files to download:
    /home/cloudshell-user/job_attachments_devguide_output/output_dir/output.txt (1 file)

You are about to download files which may come from multiple root directories. Here are a list of the current root directories:
[0] /home/cloudshell-user/job_attachments_devguide_output
> Please enter the index of root directory to edit, y to proceed without changes, or n to cancel the download (0, y, n) [y]: 

Downloading Outputs  [####################################]  100%
Download Summary:
    Downloaded 1 files totaling 117.0 B.
    Total download time of 0.14189 seconds at 824.0 B/s.
    Download locations (total file counts):
        /home/cloudshell-user/job_attachments_devguide_output (1 file)
```

# Using files from a step in a dependent step
<a name="using-files-output-from-a-step-in-a-dependent-step"></a>

This example shows how one step in a job can access the outputs from a step that it depends on in the same job. 

 To make the outputs of one step available to another, Deadline Cloud adds actions to a session that download those outputs before running the session's tasks. You tell Deadline Cloud which steps to download outputs from by declaring those steps as dependencies of the step that uses them. 

Use the `job_attachments_devguide_output` job bundle for this example. Start by making a copy in your AWS CloudShell environment from your clone of the Deadline Cloud samples GitHub repository. Modify it to add a dependent step that only runs after the existing step and uses that step’s output: 

```
cp -r deadline-cloud-samples/job_bundles/job_attachments_devguide_output ~/

cat >> job_attachments_devguide_output/template.yaml << EOF
- name: DependentStep
  dependencies:
  - dependsOn: Step
  script:
    actions:
      onRun:
        command: /bin/cat
        args:
        - "{{Param.OutputDir}}/output.txt"
EOF
```

 The job created with this modified job bundle runs as two separate sessions, one for the task in the step "Step" and then a second for the task in the step "DependentStep". 

First, start the Deadline Cloud worker agent in a CloudShell tab. Let any previously submitted jobs finish running, then delete the job logs from the logs directory: 

```
rm -rf ~/demoenv-logs/queue-*
```

 Next, submit a job using the modified `job_attachments_devguide_output` job bundle. Wait for it to finish running on the worker in your CloudShell environment. Look at the logs for the two sessions: 

```
# Change the value of FARM_ID to your farm's identifier
FARM_ID=farm-00112233445566778899aabbccddeeff
# Change the value of QUEUE1_ID to queue Q1's identifier
QUEUE1_ID=queue-00112233445566778899aabbccddeeff
# Change the value of WSALL_ID to the identifier of the WSAll storage profile
WSALL_ID=sp-00112233445566778899aabbccddeeff

deadline config set settings.storage_profile_id $WSALL_ID

deadline bundle submit --farm-id $FARM_ID --queue-id $QUEUE1_ID ./job_attachments_devguide_output

# Wait for the job to finish running, and then:

cat demoenv-logs/queue-*/session-*
```

 In the session log for the task in the step named `DependentStep`, two separate download actions are run: 

```
2024-07-17 02:52:05,666 INFO ==============================================
2024-07-17 02:52:05,666 INFO --------- Job Attachments Download for Job
2024-07-17 02:52:05,667 INFO ==============================================
2024-07-17 02:52:05,667 INFO Syncing inputs using Job Attachments
2024-07-17 02:52:05,928 INFO Downloaded 207.0 B / 207.0 B of 1 file (Transfer rate: 0.0 B/s)
2024-07-17 02:52:05,929 INFO Summary Statistics for file downloads:
Processed 1 file totaling 207.0 B.
Skipped re-processing 0 files totaling 0.0 B.
Total processing time of 0.03954 seconds at 5.23 KB/s.

2024-07-17 02:52:05,979 INFO 
2024-07-17 02:52:05,979 INFO ==============================================
2024-07-17 02:52:05,979 INFO --------- Job Attachments Download for Step
2024-07-17 02:52:05,979 INFO ==============================================
2024-07-17 02:52:05,980 INFO Syncing inputs using Job Attachments
2024-07-17 02:52:06,133 INFO Downloaded 117.0 B / 117.0 B of 1 file (Transfer rate: 0.0 B/s)
2024-07-17 02:52:06,134 INFO Summary Statistics for file downloads:
Processed 1 file totaling 117.0 B.
Skipped re-processing 0 files totaling 0.0 B.
Total processing time of 0.03227 seconds at 3.62 KB/s.
```

 The first action downloads the `script.sh` file used by the step named "Step." The second action downloads the outputs from that step. Deadline Cloud determines which files to download by using the output manifest generated by that step as an input manifest. 

 Later in the same log, you can see the output from the step named "DependentStep": 

```
2024-07-17 02:52:06,213 INFO Output:
2024-07-17 02:52:06,216 INFO Script location: /sessions/session-5b33f/assetroot-assetroot-3751a/script.sh
```