Schedule jobs in Deadline Cloud
After you create a job, AWS Deadline Cloud schedules it to be processed on one or more of the fleets associated with a queue. The fleet that processes a particular task is chosen based on the scheduling configuration, the capabilities configured for the fleet, and the host requirements of a specific step.
The following sections provide details of the process of scheduling a job.
Scheduling configurations
You can configure how Deadline Cloud schedules jobs in a queue by setting a scheduling configuration on the queue. The scheduling configuration controls how workers are distributed across jobs.
You can set the scheduling configuration using the Deadline Cloud console or by calling the CreateQueue or UpdateQueue APIs.
There are three available scheduling configurations:
- Priority, first-in-first-out (priorityFifo) – Schedules the highest priority, earliest submitted job first (default).
- Priority, balanced (priorityBalanced) – Distributes workers evenly across jobs at the highest priority.
- Weighted, balanced (weightedBalanced) – Uses a weighted formula to determine how workers are distributed across jobs.
In all scheduling configurations, in-progress tasks run to completion before a new scheduling decision is made. If you change the scheduling configuration while tasks are running, the change applies only when workers are assigned next. Running tasks are not interrupted or reassigned.
Priority, first-in-first-out
Priority, first-in-first-out (priorityFifo) is the default scheduling configuration
for new queues. Deadline Cloud assigns workers to the highest-priority job first. When multiple
jobs share the same priority, the oldest (earliest submitted) job receives all available
workers first.
Use priority FIFO when you want strict ordering of jobs. This configuration is appropriate when jobs should complete one at a time in the order they were submitted, such as sequential pipeline stages or batch processing where each job must finish before the next one starts.
This configuration has no additional parameters.
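The priority FIFO ordering can be sketched as a simple sort key. This is an illustration only; the Job fields below are hypothetical and not the service's data model:

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    priority: int
    submission_time: float  # epoch seconds

def priority_fifo_order(jobs):
    """Order jobs by descending priority, then earliest submission."""
    return sorted(jobs, key=lambda j: (-j.priority, j.submission_time))

jobs = [
    Job("render-b", priority=50, submission_time=100.0),
    Job("render-a", priority=50, submission_time=90.0),
    Job("urgent", priority=80, submission_time=120.0),
]
ordered = priority_fifo_order(jobs)
print([j.name for j in ordered])  # ['urgent', 'render-a', 'render-b']
```

Workers are assigned to jobs in this order, so each job drains before the next one starts.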
Priority, balanced
Priority, balanced (priorityBalanced) distributes workers evenly across
all jobs at the highest priority level. When only one job exists at the highest priority,
Deadline Cloud assigns all workers to that job. When multiple jobs share the highest priority,
workers are split evenly among them. If the workers cannot be evenly divided,
the extra workers are distributed among the highest priority jobs.
Use priority balanced when multiple artists or users submit jobs at the same priority and each user needs immediate feedback. This configuration ensures that no single job monopolizes all available workers, so that all users are allocated workers shortly after submission.
If a job has fewer remaining tasks than its share of workers, the surplus workers are redistributed to other jobs at the same priority level. If all jobs at the highest priority are fully allocated, surplus workers cascade to jobs at the next highest priority level.
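The even split with redistribution of surplus workers can be sketched as a round-robin allocation. This is an assumed illustration of the behavior described above, not the service's implementation:

```python
def balanced_allocation(num_workers, task_counts):
    """Evenly split workers across jobs at one priority level, capping
    each job at its remaining task count and redistributing any surplus
    to jobs that can still use workers."""
    alloc = {job: 0 for job in task_counts}
    remaining = num_workers
    while remaining > 0:
        progressed = False
        # Hand out one worker per job per pass to keep the split even.
        for job, tasks in task_counts.items():
            if remaining == 0:
                break
            if alloc[job] < tasks:
                alloc[job] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # every job is fully allocated
            break
    return alloc

# Job "a" only needs 3 workers, so its surplus flows to "b" and "c".
print(balanced_allocation(10, {"a": 3, "b": 8, "c": 8}))
```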
This configuration has the following parameter:
renderingTaskBuffer – Controls worker stickiness. A worker switches from its current job to another job at the same priority only if the difference in rendering tasks exceeds the renderingTaskBuffer value. A higher value keeps workers on their current jobs longer, reducing context switching. The default value is 1.
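One plausible reading of the stickiness rule, sketched in Python. The exact comparison the service uses isn't specified here, so treat the direction of the difference as an assumption:

```python
def should_switch(current_job_tasks, candidate_job_tasks, rendering_task_buffer=1):
    """Stay on the current job unless the candidate job is behind by
    more than the buffer (assumed interpretation of the rule above)."""
    return (current_job_tasks - candidate_job_tasks) > rendering_task_buffer

print(should_switch(3, 1))  # True: difference of 2 exceeds the buffer of 1
print(should_switch(2, 1))  # False: difference of 1 does not exceed the buffer
```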
Weighted, balanced
Weighted, balanced (weightedBalanced) uses a formula to calculate a
weight for each job. Deadline Cloud assigns workers to the highest-weight job first. If multiple
jobs have the same weight, workers are distributed among them.
Use weighted balanced when you need fine-grained control over how workers are distributed across jobs with varying priorities, error rates, and submission times. This configuration is appropriate for complex render farm environments where you want to tune the balance between job priority, job age, error handling, and worker stickiness.
The weight for each job is calculated as follows:
weight = (job.Priority * priorityWeight) +
(job.Errors * errorWeight) +
((currentTimeInSeconds - job.SubmissionTime) * submissionTimeWeight) +
((job.RenderingTasks - renderingTaskBuffer) * renderingTaskWeight)
The renderingTaskBuffer component is applied only if the worker is
currently working on the job. The renderingTaskWeight is usually set
to a negative value so that jobs with assigned workers receive a lower weight,
bringing other jobs to the front of the queue. The
errorWeight is also usually negative so that jobs with errors are
deprioritized. You can use scheduling overrides for minimum and maximum priority
jobs.
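The formula can be written out as a small function using the documented default parameter values. The job dictionary's field names are hypothetical, chosen only to mirror the terms of the formula:

```python
import time

def job_weight(job, now=None, worker_on_job=False,
               priority_weight=100.0, error_weight=-10.0,
               submission_time_weight=3.0, rendering_task_weight=-100.0,
               rendering_task_buffer=1):
    """Weight for one job under weightedBalanced, using the documented
    formula and the default parameter values."""
    now = time.time() if now is None else now
    # The buffer is subtracted only when this worker is already on the job.
    buffer = rendering_task_buffer if worker_on_job else 0
    return (job["priority"] * priority_weight
            + job["errors"] * error_weight
            + (now - job["submission_time"]) * submission_time_weight
            + (job["rendering_tasks"] - buffer) * rendering_task_weight)

job = {"priority": 50, "errors": 2, "submission_time": 0.0, "rendering_tasks": 3}
print(job_weight(job, now=60.0, worker_on_job=True))  # 4960.0
```

Note how the negative renderingTaskWeight lowers the weight of jobs that already have workers, and the buffer softens that penalty for the worker's current job.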
This configuration has the following parameters:
priorityWeight – The weight applied to a job's priority. A positive value means higher-priority jobs are scheduled first. The default value is 100.0. Range: 0 to 10000.
errorWeight – The weight applied to a job's error count. A negative value means jobs without errors are scheduled first. The default value is -10.0. Range: -10000 to 10000.
submissionTimeWeight – The weight applied to a job's submission time (in seconds). A positive value means earlier submitted jobs are scheduled first. The default value is 3.0. Range: 0 to 10000.
renderingTaskWeight – The weight applied to the number of tasks currently rendering for a job. A negative value means jobs with fewer workers are scheduled next. The default value is -100.0. Range: -10000 to 10000.
renderingTaskBuffer – The number of rendering tasks before the rendering task weight takes effect. A positive value keeps workers on their current jobs. The default value is 1. Range: 0 to 1000.
maxPriorityOverride – Optional. When set to alwaysScheduleFirst, jobs at the maximum priority (100) are always scheduled before other jobs, regardless of the weighted formula. When multiple jobs have the maximum priority, ties are broken using the standard weighted formula. When the override is absent, maximum-priority jobs use the standard weighted formula with no special treatment.
minPriorityOverride – Optional. When set to alwaysScheduleLast, jobs at the minimum priority (0) are always scheduled after other jobs, regardless of the weighted formula. When multiple jobs have the minimum priority, ties are broken using the standard weighted formula. When the override is absent, minimum-priority jobs use the standard weighted formula with no special treatment.
Determine fleet compatibility
After you create a job, Deadline Cloud checks the host requirements for each step in the job
against the capabilities of the fleets associated with the queue the job was submitted to.
If a fleet meets the host requirements, the job is put into the READY
state.
If any step in the job has requirements that can't be met by a fleet associated with the
queue, the step's status is set to NOT_COMPATIBLE. In addition, the rest of the
steps in the job are canceled.
Capabilities for a fleet are set at the fleet level. Even if a worker in a fleet meets the job's requirements, it won't be assigned tasks from the job if its fleet doesn't meet the job's requirements.
The following job template has a step that specifies host requirements for the step:
name: Sample Job With Host Requirements
specificationVersion: jobtemplate-2023-09
steps:
- name: Step 1
  script:
    actions:
      onRun:
        args:
        - '1'
        command: /usr/bin/sleep
  hostRequirements:
    amounts:
    # Capabilities starting with "amount." are amount capabilities. If they start with "amount.worker.",
    # they are defined by the OpenJD specification. Other names are free for custom usage.
    - name: amount.worker.vcpu
      min: 4
      max: 8
    attributes:
    - name: attr.worker.os.family
      anyOf:
      - linux
This job can be scheduled to a fleet with the following capabilities:
{
"vCpuCount": {"min": 4, "max": 8},
"memoryMiB": {"min": 1024},
"osFamily": "linux",
"cpuArchitectureType": "x86_64"
}
This job can't be scheduled to a fleet with any of the following capabilities:
{
"vCpuCount": {"min": 4},
"memoryMiB": {"min": 1024},
"osFamily": "linux",
"cpuArchitectureType": "x86_64"
}
The fleet's vCpuCount has no maximum, so its workers can exceed the step's maximum vCPU requirement of 8.
{
"vCpuCount": {"max": 8},
"memoryMiB": {"min": 1024},
"osFamily": "linux",
"cpuArchitectureType": "x86_64"
}
The fleet's vCpuCount has no minimum, so its workers aren't guaranteed to satisfy the step's minimum vCPU requirement of 4.
{
"vCpuCount": {"min": 4, "max": 8},
"memoryMiB": {"min": 1024},
"osFamily": "windows",
"cpuArchitectureType": "x86_64"
}
The osFamily doesn't match.
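The range logic these examples imply can be sketched as follows. This is an illustration of the min/max and anyOf checks only; the service's actual matching covers more capability types:

```python
def amount_compatible(requirement, capability):
    """An amount capability satisfies a step's requirement only if the
    fleet's advertised range fits inside the step's range; a missing
    fleet bound can't guarantee the corresponding step bound."""
    if capability is None:
        return False
    if "min" in requirement and ("min" not in capability or capability["min"] < requirement["min"]):
        return False
    if "max" in requirement and ("max" not in capability or capability["max"] > requirement["max"]):
        return False
    return True

def attribute_compatible(any_of, capability_value):
    """An attribute capability must be one of the step's accepted values."""
    return capability_value in any_of

# The step above requires amount.worker.vcpu min 4, max 8 and a linux os family.
vcpu_req = {"min": 4, "max": 8}
print(amount_compatible(vcpu_req, {"min": 4, "max": 8}))  # True
print(amount_compatible(vcpu_req, {"min": 4}))            # False: no maximum
print(amount_compatible(vcpu_req, {"max": 8}))            # False: no minimum
print(attribute_compatible(["linux"], "windows"))         # False: osFamily mismatch
```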
Fleet scaling
When a job is assigned to a compatible service-managed fleet, the fleet is auto scaled. The number of workers in the fleet changes based on the number of tasks available for the fleet to run.
When a job is assigned to a customer-managed fleet, workers might already exist or can be created using event-based auto scaling. For more information, see Use EventBridge to handle auto scaling events in the Amazon EC2 Auto Scaling User Guide.
Sessions
The tasks in a job are divided into one or more sessions. Workers run the sessions to set up the environment, run the tasks, and then tear down the environment. Each session is composed of one or more actions that a worker must take.
As a worker completes session actions, additional session actions can be sent to the worker. The worker reuses existing environments and job attachments in the session to complete tasks more efficiently.
On service-managed fleet workers, session directories are deleted after the session ends,
but other directories are retained between sessions. This behavior allows you to implement caching strategies
for data that can be reused across multiple sessions. To cache data between sessions, store it
under the home directory of the user running the job. For example, conda packages are cached under
the job user's home directory at C:\Users\job-user\.conda-pkgs on Windows workers and
/home/job-user/.conda-pkgs on Linux workers. This data remains available until the
worker shuts down.
Job attachments are created by the submitter that you use as part of your Deadline Cloud CLI job
bundle. You can also create job attachments using the --attachments option for
the create-job AWS CLI command. Environments are defined in two places: queue
environments attached to a specific queue, and job and step environments defined in the job
template.
There are four session action types:
- syncInputJobAttachments – Downloads the input job attachments to the worker.
- envEnter – Performs the onEnter actions for an environment.
- taskRun – Performs the onRun actions for a task.
- envExit – Performs the onExit actions for an environment.
The following job template has a step environment. It has an onEnter
definition to set up the step environment, an onRun definition that defines the
task to run, and an onExit definition to tear down the step environment. The
sessions created for this job will include an envEnter action, one or more
taskRun actions, and then an envExit action.
name: Sample Job with Maya Environment
specificationVersion: jobtemplate-2023-09
steps:
- name: Maya Step
  stepEnvironments:
  - name: Maya
    description: Runs Maya in the background.
    script:
      embeddedFiles:
      - name: initData
        filename: init-data.yaml
        type: TEXT
        data: |
          scene_file: MyAwesomeSceneFile
          renderer: arnold
          camera: persp
      actions:
        onEnter:
          command: MayaAdaptor
          args:
          - daemon
          - start
          - --init-data
          - file://{{Env.File.initData}}
        onExit:
          command: MayaAdaptor
          args:
          - daemon
          - stop
  parameterSpace:
    taskParameterDefinitions:
    - name: Frame
      range: 1-5
      type: INT
  script:
    embeddedFiles:
    - name: runData
      filename: run-data.yaml
      type: TEXT
      data: |
        frame: {{Task.Param.Frame}}
    actions:
      onRun:
        command: MayaAdaptor
        args:
        - daemon
        - run
        - --run-data
        - file://{{ Task.File.runData }}
Session actions pipelining
Session actions pipelining lets a scheduler pre-assign multiple session actions to a worker. The worker can then run these actions sequentially, reducing or eliminating idle time between tasks.
To create an initial assignment, the scheduler creates a session with one task, the worker completes the task, and then the scheduler analyzes the task duration to determine future assignments.
The scheduler sizes future assignments based on task duration. For tasks under one minute, it uses a power-of-2 growth pattern. For example, for a 1-second task, the scheduler assigns 2 new tasks, then 4, then 8. For tasks over one minute, the scheduler assigns only one new task and pipelining remains disabled.
To calculate pipeline size, the scheduler does the following:
- Uses the average task duration from completed tasks
- Aims to keep the worker busy for one minute
- Considers only tasks within the same session
- Does not share duration data across workers
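The sizing rules above can be sketched as follows. The growth and cap logic here is an assumption based on the stated rules, not the service's actual implementation:

```python
def next_assignment_size(avg_task_seconds, current_batch_size):
    """How many new tasks to pre-assign to a worker next, given the
    average duration of its completed tasks in this session."""
    if avg_task_seconds >= 60:
        return 1  # long tasks: one at a time, pipelining stays disabled
    grown = max(1, current_batch_size * 2)     # power-of-2 growth
    cap = max(1, int(60 // avg_task_seconds))  # keep the worker busy ~1 minute
    return min(grown, cap)

print(next_assignment_size(1, 1))   # 2
print(next_assignment_size(1, 4))   # 8
print(next_assignment_size(90, 4))  # 1
```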
With session actions pipelining, workers start new tasks immediately and there's no waiting time between scheduler requests. It also provides improved worker efficiency and better task distribution for long-running processes.
Additionally, if a higher priority job becomes available, the worker finishes all of its previously assigned work before its current session ends and a session from the higher priority job is assigned.
Step dependencies
Deadline Cloud supports defining dependencies between steps so that one step waits until another step is complete before starting. You can define more than one dependency for a step. A step with a dependency isn't scheduled until all of its dependencies are complete.
If the job template defines a circular dependency, the job is rejected and the job
status is set to CREATE_FAILED.
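A circular dependency can be detected with a standard depth-first search over the step graph. This sketch is illustrative; it assumes every step appears as a key in the dependency map:

```python
def has_circular_dependency(dependencies):
    """dependencies maps each step name to the list of steps it depends
    on. Returns True if the graph contains a cycle, in which case the
    job would be rejected with CREATE_FAILED."""
    visiting, done = set(), set()

    def visit(step):
        if step in done:
            return False
        if step in visiting:
            return True  # back edge found: the dependencies form a cycle
        visiting.add(step)
        for dep in dependencies[step]:
            if visit(dep):
                return True
        visiting.remove(step)
        done.add(step)
        return False

    return any(visit(step) for step in dependencies)

print(has_circular_dependency({"StepA": [], "StepB": ["StepA"]}))      # False
print(has_circular_dependency({"StepA": ["StepB"], "StepB": ["StepA"]}))  # True
```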
The following job template creates a job with two steps. StepB depends on
StepA. StepB only runs after StepA completes
successfully.
After the job is created, StepA is in the READY state and
StepB is in the PENDING state. After StepA
finishes, StepB moves to the READY state. If StepA
fails, or if StepA is canceled, StepB moves to the
CANCELED state.
You can set a dependency on multiple steps. For example, if StepC depends
on both StepA and StepB, StepC won't start until the
other two steps finish.
Step dependencies have the following restrictions:
- Dependencies per step – A step can depend on a maximum of 128 other steps.
- Consumers per step – A maximum of 32 other steps can depend on a single step.
name: Step-Step Dependency Test
specificationVersion: 'jobtemplate-2023-09'
steps:
- name: A
  script:
    actions:
      onRun:
        command: bash
        args: ['{{ Task.File.run }}']
    embeddedFiles:
    - name: run
      type: TEXT
      data: |
        #!/bin/env bash
        set -euo pipefail
        sleep 1
        echo Task A Done!
- name: B
  dependencies:
  - dependsOn: A  # This means Step B depends on Step A
  script:
    actions:
      onRun:
        command: bash
        args: ['{{ Task.File.run }}']
    embeddedFiles:
    - name: run
      type: TEXT
      data: |
        #!/bin/env bash
        set -euo pipefail
        sleep 1
        echo Task B Done!