YAML file reference¶
The batchbeagle service config file is a YAML file defining Batch Queues, Compute Environments, and Job Descriptions. The default path for a batchbeagle configuration file is ./batchbeagle.yml
.
There are currently three main sections in the batchbeagle.yml
file, Queues
, Compute Environments
and Job Definitions
.
Queues¶
Queues are specified in a YAML list under the top level queues:
key like
so:
queues:
- name: foobar-prod
...
- name: foobar-test
...
name¶
(String, Required) The name of the Batch queue in AWS. name
is required. There are restrictions on characters: Up to 255 letters (uppercase and
lowercase), numbers, hyphens, and underscores are allowed.
Once your queue has been created, this is not changable without deleting and re-creating the queue.
queues:
- name: foobar-prod
state¶
(Enum, Required) The state of the job queue. If the job queue state is enabled, it is able to accept jobs. Valid values are enabled
or disabled
.
queues:
- name: foobar-prod
state: enabled
priority¶
(Integer, Required) The priority of the job queue. Job queues with a higher priority (or a higher integer value for the priority parameter) are evaluated first when associated with same compute environment. Priority is determined in descending order, for example, a job queue with a priority value of 10 is given scheduling preference over a job queue with a priority value of 1.
queues:
- name: foobar-prod
state: enabled
priority: 1
compute_environments¶
(List, Required) The set of compute environments mapped to a job queue and their order relative to each other. The job scheduler uses this parameter to determine which compute environment should execute a given job. Compute environments must be in the VALID state before you can associate them with a job queue. You can associate up to 3 compute environments with a job queue. You must specify both the compute environment name and the order.
queues:
- name: foobar-prod
state: enabled
priority: 1
compute_environments:
- name: foobar-env
order: 1
name¶
(String, Required) The name of the compute environment as named in the compute_environments
section.
order¶
(Integer, Required) The order of the compute environment.
Compute Environments¶
Compute Environments are specified in a YAML list under the top level compute_environments:
key like so:
compute_environments:
- name: foobarenv-prod
...
- name: foobarenv-test
...
name¶
(String, Required) The name of the Batch compute environment in AWS. name
is required. There are restrictions on characters: Up to 255 letters (uppercase and
lowercase), numbers, hyphens, and underscores are allowed.
Once your compute environment has been created, this is not changable without deleting and re-creating the compute environment.
compute_environments:
- name: foobarenv-prod
state¶
(Enum, Required) The state of the job queue. If the job queue state is enabled, it is able to accept jobs. Valid values are enabled
or disabled
.
compute_environments:
- name: foobarenv-prod
state: enabled
type¶
(Enum, Required) The type of the compute environment. Valid values are managed
or unmanaged
.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole¶
(String, Required) The full Amazon Resource Name (ARN) of the IAM role that allows AWS Batch to make calls to other AWS services on your behalf.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources¶
Details of the compute resources managed by the compute environment. This parameter is required for managed compute environments.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources:
type: ec2
instanceRole: arn:aws:iam::12345678901:instance-profile/prodbatchrole
instanceTypes:
- optimal
maxvCpus: 48
minvCpus: 0
securityGroupIds:
- sg-fe1ff599
subnets:
- subnet-9f03a2c7
When using Spot instances, you might have something like this:
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources:
type: spot
instanceRole: arn:aws:iam::12345678901:instance-profile/prodbatchrole
instanceTypes:
- optimal
maxvCpus: 48
minvCpus: 0
desiredvCpus: 0
imageId: foobar
ec2KeyPair: mykey.pem
securityGroupIds:
- sg-fefefefe
subnets:
- subnet-9f9f9f9f
bidPercentage: 50
spotIamFleetRole: arn:aws:iam::12345678901:role/aws-ec2-spot-fleet-role
type¶
(Enum, Required) The type of compute environment. Valid values are ec2
or spot
.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources:
type: ec2
instanceRole¶
(String, Required) The Amazon ECS instance profile applied to Amazon EC2 instances in a compute environment. You can specify the short name or full Amazon Resource Name (ARN) of an instance profile. For example, ecsInstanceRole or arn:aws:iam::<aws_account_id>:instance-profile/ecsInstanceRole. For more information, see Amazon ECS Instance Role in the AWS Batch User Guide.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources:
type: ec2
instanceRole: arn:aws:iam::12345678901:instance-profile/prodbatchrole
instanceTypes¶
(List, Required) The instances types that may launched.
compute_environments:
- name: foobarenv-prod
state: enabled
type: managed
serviceRole: arn:aws:iam::12345678901:role/service-role/AWSBatchServiceRole
compute_resources:
type: ec2
instanceRole: arn:aws:iam::12345678901:instance-profile/prodbatchrole
instanceTypes:
- optimal
maxvCpus¶
(Integer, Required) The maximum number of EC2 vCPUs that an environment can reach.
minvCpus¶
(Integer, Required) The minimum number of EC2 vCPUs that an environment should maintain.
desiredvCpus¶
(Integer, Optional) The desired number of EC2 vCPUS in the compute environment.
securityGroupIds¶
(List, Required) The EC2 security groups that are associated with instances launched in the compute environment.
subnets¶
(List, Required) The VPC subnets into which the compute resources are launched.
tags¶
(Dict, Optional) Key-value pair tags to be applied to resources that are launched in the compute environment.
ec2KeyPair¶
(String, Optional) The EC2 key pair that is used for instances launched in the compute environment.
imageId¶
(String, Optional) The Amazon Machine Image (AMI) ID used for instances launched in the compute environment.
spotIamFleetRole¶
(String, Optional) The Amazon Resource Name (ARN) of the Amazon EC2 Spot Fleet IAM role applied to a SPOT compute environment.
bidPercentage¶
(Integer, Optional) The minimum percentage that a Spot Instance price must be when compared with the On-Demand price for that instance type before instances are launched. For example, if your bid percentage is 20%, then the Spot price must be below 20% of the current On-Demand price for that EC2 instance.
Job Definitions¶
Job Definitions are specified in a YAML list under the top level job_definitions:
key like so:
job_definitions:
- name: job1
...
- name: job2
...
name¶
(String, Required) The name of the Batch job definition in AWS. name
is required. There are restrictions on characters: Up to 255 letters (uppercase and lowercase), numbers, hyphens, and underscores are allowed.
job_definitions:
- name: job1
parameters¶
(Dict, Optional) Default parameter substitution placeholders to set in the job definition. Parameters are specified as a key-value pair mapping. Parameters defined when submitting a job override any corresponding parameter defaults from the job definition.
job_definitions:
- name: job1
parameters:
greeting: hello
greetee: world
retryStrategy¶
The retry strategy to use for failed jobs that are submitted with this job definition.
job_definitions:
- name: job1
retryStrategy:
attempts: 1
attempts¶
(Integer, Optional) The number of times to move a job to the RUNNABLE status. You may specify between 1 and 10 attempts. If attempts is greater than one, the job is retried if it fails until it has moved to RUNNABLE that many times.
timeout¶
You can configure a timeout duration for your jobs so that if a job runs longer than that, AWS Batch terminates the job.
job_definitions:
- name: job1
timeout:
attemptDurationSeconds: 300
attemptDurationSeconds¶
(Integer, Optional) The time duration in seconds after which AWS Batch terminates your jobs if they have not finished. The minimum value for the timeout is 60 seconds.
container¶
Container properties are used in job definitions to describe the container that is launched as part of a job.
job_definitions:
- name: job1
container:
image: centos
memory: 128
vcpus: 1
command: echo nope
jobRoleArn: arn:aws:iam::12345678901:...
user: glenn
privileged: True
volumes:
- name: foo
host:
sourcePath: bar
- name: bar
environment:
- name: X
value: 1
- name: Y
value: 2
mountPoints:
- containerPath: foo1
readOnly: False
sourceVolume: bar1
- containerPath: foo2
readOnly: True
sourceVolume: bar2
ulimits:
- name: foo
hardLimit: 15
softLimit: 7
- name: bar
hardLimit: 25
softLimit: 17
command¶
(String, Optional) The command that is passed to the container. This parameter maps to Cmd in the Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For more information, see the Docker Reference
environment¶
(Dict, Optional )The environment variables to pass to a container. This parameter maps to Env in the Create a container section of the Docker Remote API and the –env option to docker run.
Important - We do not recommend using plain text environment variables for sensitive information, such as credential data.
Note - Environment variables must not start with AWS_BATCH; this naming convention is reserved for variables that are set by the AWS Batch service.
image¶
(String, Required) The image used to start a container. This string is passed directly to the Docker daemon. Images in the Docker Hub registry are available by default. Other repositories are specified with repository-url/image:tag . Up to 255 letters (uppercase and lowercase), numbers, hyphens, underscores, colons, periods, forward slashes, and number signs are allowed. This parameter maps to Image in the Create a container section of the Docker Remote API and the IMAGE parameter of docker run. Images in Amazon ECR repositories use the full registry and repository URI (for example, 012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>).
jobRoleArn¶
(String, Optional) The Amazon Resource Name (ARN) of the IAM role that the container can assume for AWS permissions.
memory¶
(Integer, Required) The hard limit (in MiB) of memory to present to the container. If your container attempts to exceed the memory specified here, the container is killed. This parameter maps to Memory in the Create a container section of the Docker Remote API and the –memory option to docker run. You must specify at least 4 MiB of memory for a job.
privileged¶
(Boolean, Optional) When this parameter is True, the container is given elevated privileges on the host container instance (similar to the root user). This parameter maps to Privileged in the Create a container section of the Docker Remote API and the –privileged option to docker run.
readonlyRootFilesystem¶
(Boolean, Optional) When this parameter is true, the container is given read-only access to its root file system. This parameter maps to ReadonlyRootfs in the Create a container section of the Docker Remote API and the –read-only option to docker run.
user¶
(String, Optional) The user name to use inside the container. This parameter maps to User in the Create a container section of the Docker Remote API and the –user option to docker run.
vcpus¶
(Integer, Required) The number of vCPUs reserved for the container. This parameter maps to CpuShares in the Create a container section of the Docker Remote API and the –cpu-shares option to docker run. Each vCPU is equivalent to 1,024 CPU shares. You must specify at least 1 vCPU.
mountPoints¶
(List, Optional) The mount points for data volumes in your container. This parameter maps to Volumes in the Create a container section of the Docker Remote API and the –volume option to docker run.
containerPath¶
(String, Optional) The path on the container at which to mount the host volume.
readOnly¶
(Boolean, Optional) If this value is True, the container has read-only access to the volume; otherwise, the container can write to the volume. The default value is False.
sourceVolume¶
(String, Optional) The name of the volume to mount.
ulimits¶
(List, Optional) A list of ulimits to set in the container. This parameter maps to Ulimits in the Create a container section of the Docker Remote API and the –ulimit option to docker run.
name¶
(String, Required) The type of the ulimit.
hardLimit¶
(Integer, Required) The hard limit for the ulimit type.
softLimit¶
(Integer, Required) The soft limit for the ulimit type.
volumes¶
(List, Optional) A list of data volumes used in a job.
name¶
(String, Optional) The name of the volume. Up to 255 letters (uppercase and lowercase), numbers, hyphens, and underscores are allowed. This name is referenced in the sourceVolume parameter of container definition mountPoints.
host¶
(Dict, Optional) The contents of the host parameter determine whether your data volume persists on the host container instance and where it is stored. If the host parameter is empty, then the Docker daemon assigns a host path for your data volume, but the data is not guaranteed to persist after the containers associated with it stop running.
sourcePath¶
(String, Optional) The path on the host container instance that is presented to the container. If this parameter is empty, then the Docker daemon has assigned a host path for you. If the host parameter contains a sourcePath file location, then the data volume persists at the specified location on the host container instance until you delete it manually. If the sourcePath value does not exist on the host container instance, the Docker daemon creates it. If the location does exist, the contents of the source path folder are exported.
Variable interpolation in batchbeagle.yml¶
You can use variable replacement in your job definitions to dynamically replace values from your local shell environment.
You can add ${env.<environment var>}
to your service definition anywhere you want the value of the shell environment variable <environment var>
. For example, for the following batchbeagle.yml
snippet:
job_definitions:
- name: test-job-001
container:
image: ${env.IMG_NAME}:${env.IMG_VERSION}
memory: 10
vcpus: 1
batchbeagle –import_env command line option¶
If you run batchbeagle
with the --import_env
option, it will import your shell environment into the batchbeagle environment. Then anything you’ve defined in your shell environment will be available for ${env.VAR}
replacements.
Example:
batchbeagle --import_env <subcommand> [options]