Creating an EC2 Image Builder Pipeline with Boto3

EC2 Image Builder makes it easier to automate and manage your image creation process within AWS. Unfortunately, since it’s a new feature, it doesn’t yet have an option to create a pipeline using CloudFormation. The AWS documentation does a good job of walking through creating a pipeline with the console and the CLI, but it doesn’t get into examples using the API. The following is a simple example of creating a pipeline using the Boto3 Python library.

Prerequisites

Install the Boto3 Library

I like to use pip with a virtual environment to manage packages within a project.

python3 -m venv .env
source .env/bin/activate
pip install boto3

Base Image

You’ll need to either look up or provide the ARN of the base image to build your custom image from. AWS provides some ready-made images to use with Image Builder, and I’ll use one of those in this example. If you do want to create your own base image, make sure the AWS Systems Manager Agent (SSM Agent) is installed, as EC2 Image Builder uses it to run commands against the instance being used to build the image.
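
If you’d rather look up the ARN at runtime, the imagebuilder client can list the images AWS provides. A minimal sketch; the 'platform' filter name is an assumption on my part, so verify the supported filters in the Boto3 documentation:

import boto3

client = boto3.client('imagebuilder')

# List the Amazon-owned images available to Image Builder and print a
# name/ARN pair for each one.
response = client.list_images(
    owner='Amazon',
    filters=[
        {
            'name': 'platform',
            'values': ['Linux']
        }
    ]
)

for image in response['imageVersionList']:
    print(image['name'], image['arn'])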

EC2 IAM Instance Profile

An instance profile is required; it could also be created as part of your Boto3 code, but in this example I’m using one created previously. The primary permission the role needs is to allow Systems Manager communication. Beyond that, it depends entirely on what your image building process needs access to. My instance profile role has the AWS-managed AmazonSSMAutomationRole policy attached, but you may only need what’s in the AmazonSSMManagedInstanceCore policy, which would also be a good starting point for building up your own custom policy. A sketch of creating the profile in code follows.
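
If you’d rather create the profile in code, the standard IAM calls cover it. A minimal sketch, assuming the hypothetical names ImageBuilderInstanceRole and ImageBuilderInstanceProfile and the AmazonSSMManagedInstanceCore policy mentioned above:

import json

import boto3

iam = boto3.client('iam')

# Trust policy that lets EC2 instances assume the role.
trust_policy = {
    'Version': '2012-10-17',
    'Statement': [
        {
            'Effect': 'Allow',
            'Principal': {'Service': 'ec2.amazonaws.com'},
            'Action': 'sts:AssumeRole'
        }
    ]
}

# Create the role and attach the SSM policy; the names are placeholders.
iam.create_role(
    RoleName='ImageBuilderInstanceRole',
    AssumeRolePolicyDocument=json.dumps(trust_policy)
)
iam.attach_role_policy(
    RoleName='ImageBuilderInstanceRole',
    PolicyArn='arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore'
)

# The instance profile wraps the role so EC2 instances can use it.
iam.create_instance_profile(InstanceProfileName='ImageBuilderInstanceProfile')
iam.add_role_to_instance_profile(
    InstanceProfileName='ImageBuilderInstanceProfile',
    RoleName='ImageBuilderInstanceRole'
)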

The Python Code

The first piece that’s needed is at least one Component. A Component is a document, written in YAML, that describes actions for the pipeline to take. These documents allow for running Bash or PowerShell directly, downloading from and uploading to S3, updating the OS, modifying the Windows registry, executing a binary, or rebooting the instance. That’s a great set of building blocks for running almost any action you’d need within the imaging process. I’m using the yaml Python library to define my document as a dictionary and convert it to a YAML string.

import boto3
import yaml

client = boto3.client('imagebuilder')

component_data = {
    'name': 'CreateFileAndTestExists',
    'schemaVersion': '1.0',
    'phases': [
        {
            'name': 'build',
            'steps': [
                {
                    'name': 'CreateTestFileStep',
                    'action': 'ExecuteBash',
                    'inputs': {
                        'commands': [
                            'echo "hello world" >> test_file.txt'
                        ]
                    }
                }
            ]
        },
        {
            'name': 'test',
            'steps': [
                {
                    'name': 'TestFileExistsStep',
                    'action': 'ExecuteBash',
                    'inputs': {
                        'commands': [
                            'if [ ! -f test_file.txt ]; then (exit 1); fi'
                        ]
                    }
                }
            ]
        }
    ]
}

I’m using Bash commands simply to create a file and then test that the file exists within the image. The test phase launches an instance from the image created by the build phase and runs the tests against it.

Create the component, passing in the dictionary that defines the phases, dumped to a YAML string. Most pipelines will probably use multiple components, but only one is needed in this example.

component = client.create_component(
    name='CreateFileAndTestExists',
    semanticVersion='1.0.0',
    description='Component created using boto3 API',
    changeDescription='Initial Version',
    platform='Linux',
    data=yaml.dump(component_data)
)

With the component ready, an Image Recipe can be defined. The recipe sets the list of components to be used in the pipeline as well as the source image. As previously stated, I’m passing in an AWS-provided image ARN directly, but this could be written as a lookup and could point to a custom image ARN instead.

recipe = client.create_image_recipe(
    name='boto-recipe',
    description='Image recipe created with boto3',
    semanticVersion='1.0.0',
    components=[
        {
            'componentArn': component['componentBuildVersionArn']
        },
    ],
    parentImage='arn:aws:imagebuilder:us-west-2:aws:image/amazon-linux-2-x86/2019.11.21'
)

Next, two more configurations need to be set up before creating the full pipeline: a Distribution Configuration and an Infrastructure Configuration. The distribution configuration tells the pipeline which regions and accounts to distribute the image to; I love how easy this makes it to share an AMI. The template variable {{imagebuilder:buildDate}} is required by Image Builder to ensure uniqueness in the name. Launch permissions and license configurations can be passed along here too, to limit who can use the image and to have AWS License Manager apply any required licenses to instances created from it (a sketch follows the next code block).

distribution = client.create_distribution_configuration(
    name='BotoOutput',
    distributions=[
        {
            'region': 'us-west-2',
            'amiDistributionConfiguration': {
                'name': 'boto-uswest2-{{imagebuilder:buildDate}}'
            }
        }
    ]
)
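
As a sketch of the launch permission and license configuration options mentioned above, a distribution entry passed in the distributions list could look like the following; the account ID and license configuration ARN are placeholders:

shared_distribution = {
    'region': 'us-east-1',
    'amiDistributionConfiguration': {
        'name': 'boto-useast1-{{imagebuilder:buildDate}}',
        # Share the resulting AMI with another account.
        'launchPermission': {
            'userIds': ['123456789012']
        }
    },
    # Apply an AWS License Manager configuration to instances
    # launched from the image.
    'licenseConfigurationArns': [
        '<YourLicenseConfigurationARN>'
    ]
}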

The infrastructure configuration specifies the VPC subnet and security groups for the instances used in the build and test process, as well as the instance types to use. It’s also where an SNS topic can be defined to send alerts or kick off other actions outside of the pipeline. If you would like to send logs to an S3 bucket, define that here too.

infra_config = client.create_infrastructure_configuration(
    name='BotoInfra',
    instanceTypes=['t2.small'],
    instanceProfileName='<YourInstanceProfileRoleName>',
    securityGroupIds=['sg-123456789'],
    subnetId='subnet-123456789',
    logging={
        's3Logs': {
            's3BucketName': '<YourBucket>',
            's3KeyPrefix': '<YourPrefix>'
        }
    },
    snsTopicArn='<YourSNSTopicARN>',
    terminateInstanceOnFailure=True
)

With all of this in place, the pipeline can now be defined by passing in the recipe and configuration ARNs. A recurring schedule for the pipeline is also defined here; in this example it runs once per day at midnight.

pipeline = client.create_image_pipeline(
    name='BotoPipeline',
    description='Created using the boto3 API from python',
    imageRecipeArn=recipe['imageRecipeArn'],
    infrastructureConfigurationArn=infra_config['infrastructureConfigurationArn'],
    distributionConfigurationArn=distribution['distributionConfigurationArn'],
    imageTestsConfiguration={
        'imageTestsEnabled': True,
        'timeoutMinutes': 60
    },
    schedule={
        'scheduleExpression': 'cron(0 0 * * *)',
        'pipelineExecutionStartCondition': 'EXPRESSION_MATCH_AND_DEPENDENCY_UPDATES_AVAILABLE'
    },
    status='ENABLED'
)
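
The pipeline will now run on its schedule, but a build can also be kicked off immediately with start_image_pipeline_execution:

execution = client.start_image_pipeline_execution(
    imagePipelineArn=pipeline['imagePipelineArn']
)
# The ARN of the image version being built.
print(execution['imageBuildVersionArn'])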

This should provide a good base to begin building a more complex pipeline from. Again, if you need to use CloudFormation, you could run this example as a Lambda function behind a CloudFormation CustomResource. I’d also recommend reading through the Boto3 documentation yourself to get a grasp of all the options available. One important thing I learned during this process is that most of these resources are not mutable: to make updates from the same template you’ll need to increment versions, create new resources, or delete and recreate them. There are also update API calls for some resources that could be used with a bit of logic in the code, which becomes even more important when integrating with CloudFormation CustomResources.
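
As a sketch of that update path, update_image_pipeline modifies a pipeline in place; check the Boto3 documentation for exactly which parameters are required on update:

pipeline = client.update_image_pipeline(
    imagePipelineArn=pipeline['imagePipelineArn'],
    imageRecipeArn=recipe['imageRecipeArn'],
    infrastructureConfigurationArn=infra_config['infrastructureConfigurationArn'],
    # Pause the schedule without deleting anything.
    status='DISABLED'
)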