AppMesh and ECS With Imported ACM Certificates on Envoy Sidecar Through EFS

Summary

This guide shows how to take certificates from a third-party provider (e.g. Venafi) that have been imported into ACM, place them on an EFS volume, and use them as trusted sources on Envoy sidecars for applications running in ECS. AppMesh acts as a passthrough, with TLS termination occurring at the application container layer.

Prerequisites and limitations

Prerequisites

A certificate whose domain names cover the front-end service and each of the required microservices.

What we will produce:

  • ACM containing an Imported Certificate.
  • EFS volume.
  • Route53 record.
  • Network Load Balancer, with associated Target Group.
  • ECS cluster, with Tasks managed by a Service, and a Task Definition that ties together the container, port and volume mappings.
  • AppMesh Virtual Gateway, Virtual Service and Virtual Node pointing back to the ECS task containers.
  • CloudMap to integrate ECS and AppMesh configurations with automation.
  • Bastion host used for testing purposes.

Architecture

Target technology stack

ACM, EFS, Route53, NLB, TG, ECS, AppMesh, CloudMap

Target architecture

Tools

N/A

Best practices

ACM – Certificate Manager

Certificates are imported from Venafi (a third-party provider).

Drilling into the certificate details, the listed domains include enough subdomains to cover every service in the microservices architecture.
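
To illustrate what "enough subdomains" means in practice, here is a minimal sketch of checking host names against a certificate's subject alternative names (SANs). The SAN list and the load balancer name are hypothetical (the real certificate's entries are not reproduced here), and the wildcard rule is simplified to the common case of a single left-most label.

```python
def san_matches(pattern: str, hostname: str) -> bool:
    """Return True if one SAN entry covers the hostname.

    Simplified rule: a leading "*." wildcard matches exactly one
    left-most DNS label; anything else must match exactly
    (case-insensitively).
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # "*.example.com" covers "vas.example.com" but neither
        # "a.b.example.com" nor the bare "example.com".
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def uncovered_hosts(sans: list[str], hostnames: list[str]) -> list[str]:
    """List the hostnames that no SAN entry covers."""
    return [h for h in hostnames if not any(san_matches(s, h) for s in sans)]


# Hypothetical SAN list and hosts, for illustration only.
sans = ["vas.example.com", "*.example.com"]
hosts = [
    "vas.example.com",
    "vas-api-service.example.com",
    "nlb.elb.af-south-1.amazonaws.com",
]
print(uncovered_hosts(sans, hosts))  # ['nlb.elb.af-south-1.amazonaws.com']
```

A host that appears in the result would fail TLS host name verification when a client connects to it with this certificate.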

EFS

AppMesh cannot use imported ACM certificates directly, so the certificate files are loaded onto an EFS volume that is mounted in the Envoy sidecar containers.
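
Before copying files to the EFS volume, it can be worth checking that the chain file really contains only certificates. The following is a rough sketch using only the Python standard library; the helper names and the placeholder PEM bodies are assumptions for illustration, and it checks the PEM framing only, not the cryptographic validity of the chain.

```python
import re

# Matches one PEM block and captures its label and base64 body.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z ]+)-----\s+(.*?)\s+-----END \1-----", re.DOTALL
)


def pem_blocks(text: str) -> list[tuple[str, str]]:
    """Return (label, body) for each PEM block in text, in file order."""
    return [(m.group(1), m.group(2)) for m in PEM_BLOCK.finditer(text)]


def check_chain_file(text: str) -> list[str]:
    """Return a list of problems found in a certificate chain file."""
    labels = [label for label, _ in pem_blocks(text)]
    problems = []
    if not labels:
        problems.append("no PEM blocks found")
    if any(label != "CERTIFICATE" for label in labels):
        problems.append("chain file contains non-certificate blocks")
    return problems


# Placeholder content; a real file would hold base64-encoded DER data.
chain = (
    "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
)
print(len(pem_blocks(chain)), check_chain_file(chain))  # 2 []
```

A private key accidentally pasted into the chain file would be reported as a non-certificate block, which is useful because the chain and the key are referenced as separate files by the Envoy configuration.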

Route53

A hosted zone is set up in Route53 to route traffic from our primary domain to a Network Load Balancer.

LoadBalancer

This Network Load Balancer is set up as internal, so that it accepts controlled internal traffic only.

There is a single listener open on port 443:

Target Group

The Target Group routes traffic to the application port on two ECS tasks behind our ECS service.

The health check confirms access on the defined traffic port, which is the application container port for ECS.
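
Assuming the health check is a plain TCP check, its behaviour can be approximated from a host inside the same network with a few lines of Python. This is only a sketch of the idea (a connection attempt with a timeout); the function name and the default timeout are assumptions, not values taken from the Target Group.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


# Example call against a task's application port (not executed here):
# tcp_port_open("vas-api-service.example.com", 34559)
```

A target that returns False here would be expected to show as unhealthy in the Target Group as well.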

ECS

Each service fronts its own microservice application, which consists of an application container and an Envoy sidecar.

The service contains multiple tasks to distribute load.

Multiple containers reside within each task definition.

Network bindings are set up to allow traffic through the application ports that were configured earlier in the target groups.

Envoy must be able to validate the certificates used for application TLS termination. To do this, an Envoy task definition may look something like this:

{
    "taskDefinitionArn": "arn:aws:ecs:af-south-1:xxxxxx:task-definition/envoy-task:12",
    "containerDefinitions": [
        {
            "name": "envoy",
            "image": "xxxxx.dkr.ecr.af-south-1.amazonaws.com/aws-appmesh-envoy:v1.22.2.1-prod",
            "cpu": 0,
            "memory": 500,
            "portMappings": [
                {
                    "containerPort": 8443,
                    "hostPort": 8443,
                    "protocol": "tcp"
                },
                {
                    "containerPort": 8080,
                    "hostPort": 8080,
                    "protocol": "tcp"
                },
                {
                    "containerPort": 9901,
                    "hostPort": 9901,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "APPMESH_VIRTUAL_NODE_NAME",
                    "value": "mesh/VAX/virtualGateway/om-xxx-vgw"
                },
                {
                    "name": "ENVOY_LOG_LEVEL",
                    "value": "debug"
                }
            ],
            "mountPoints": [
                {
                    "sourceVolume": "cert-vol",
                    "containerPath": "/certs",
                    "readOnly": true
                }
            ],
            "volumesFrom": [],
            "user": "1337",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/envoy-task",
                    "awslogs-region": "af-south-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
                ],
                "interval": 5,
                "timeout": 2,
                "retries": 3,
                "startPeriod": 60
            }
        }
    ],
    "family": "envoy-task",
    "taskRoleArn": "arn:aws:iam::xxxxxx:role/Bounded-AmazonECSTaskExecutionRole",
    "executionRoleArn": "arn:aws:iam::xxxxxx:role/Bounded-AmazonECSTaskExecutionRole",
    "networkMode": "awsvpc",
    "revision": 12,
    "volumes": [
        {
            "name": "cert-vol",
            "efsVolumeConfiguration": {
                "fileSystemId": "fs-01c20c20xxxxd3",
                "rootDirectory": "/",
                "transitEncryption": "ENABLED",
                "authorizationConfig": {
                    "accessPointId": "fsap-06a57e7xxx1d439",
                    "iam": "DISABLED"
                }
            }
        }
    ],
    "status": "ACTIVE",
    "requiresAttributes": [
        {"name": "ecs.capability.execution-role-awslogs"},
        {"name": "com.amazonaws.ecs.capability.ecr-auth"},
        {"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"},
        {"name": "com.amazonaws.ecs.capability.task-iam-role"},
        {"name": "ecs.capability.container-health-check"},
        {"name": "ecs.capability.execution-role-ecr-pull"},
        {"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"},
        {"name": "ecs.capability.task-eni"},
        {"name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"},
        {"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"},
        {"name": "ecs.capability.efsAuth"},
        {"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"},
        {"name": "ecs.capability.efs"},
        {"name": "com.amazonaws.ecs.capability.docker-remote-api.1.25"}
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "2048",
    "runtimePlatform": {
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "20xx-08-31T12:01:xx.525Z",
    "registeredBy": "arn:aws:sts::xxxx:assumed-role/XXXUsrRole/[email protected]",
    "tags": []
}
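
The part of this task definition that matters for certificates is the wiring between `volumes` and `mountPoints`. As a hedged sketch, the check below confirms that every mount refers to a declared volume and is read-only. The function name is hypothetical and the dictionary is a trimmed copy of the JSON above, not a complete task definition.

```python
def check_mounts(task_def: dict) -> list[str]:
    """Return problems found in the volume wiring of a task definition."""
    declared = {volume["name"] for volume in task_def.get("volumes", [])}
    problems = []
    for container in task_def.get("containerDefinitions", []):
        for mount in container.get("mountPoints", []):
            name = mount["sourceVolume"]
            if name not in declared:
                problems.append(
                    f"{container['name']}: volume {name!r} is not declared"
                )
            if not mount.get("readOnly", False):
                problems.append(
                    f"{container['name']}: {mount['containerPath']} "
                    "is not mounted read-only"
                )
    return problems


# Trimmed copy of the task definition shown above.
task_def = {
    "containerDefinitions": [
        {
            "name": "envoy",
            "mountPoints": [
                {"sourceVolume": "cert-vol", "containerPath": "/certs", "readOnly": True}
            ],
        }
    ],
    "volumes": [{"name": "cert-vol", "efsVolumeConfiguration": {"rootDirectory": "/"}}],
}
print(check_mounts(task_def))  # []
```

Mounting the certificate volume read-only, as the original definition does, means a compromised sidecar cannot overwrite the key material shared by other tasks.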

AppMesh

There is a single Mesh defined.

Mesh

In this setup, we make use of Virtual Gateways, Virtual Services and Virtual Nodes to route back to running ECS services.

Virtual Gateway

A single virtual gateway is provisioned.

Its configuration references the certificate chain and private key on the mounted EFS volume and sets the TLS mode to PERMISSIVE, so the listener accepts both TLS and plaintext traffic.

om-vas-vgw

meshName: VAS
virtualGatewayName: om-vas-vgw
spec:
  backendDefaults:
    clientPolicy: {}
  listeners:
    - portMapping:
        port: 8443
        protocol: http
      tls:
        certificate:
          file:
            certificateChain: /certs/vas-api-service.example.com.crt
            privateKey: /certs/new.key
        mode: PERMISSIVE
    - portMapping:
        port: 8080
        protocol: http
  logging:
    accessLog:
      file:
        path: /dev/std

Listeners:
Listeners are set up for both TLS and non-TLS traffic; this is purely for testing during the development phases.
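
Because the gateway reads its certificate files from the container file system, every path in the spec has to sit under the directory where the EFS volume is mounted (`/certs` in the task definition). The sketch below walks a spec and flags paths outside that mount. The helper names are hypothetical, and the dictionary is a hand-written copy of the listener section above rather than parsed YAML.

```python
from pathlib import PurePosixPath


def certificate_files(node) -> list[str]:
    """Recursively collect certificateChain and privateKey paths from a spec."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("certificateChain", "privateKey") and isinstance(value, str):
                found.append(value)
            else:
                found.extend(certificate_files(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(certificate_files(item))
    return found


def outside_mount(paths: list[str], mount: str = "/certs") -> list[str]:
    """Return the paths that do not live under the mount directory."""
    root = PurePosixPath(mount)
    return [p for p in paths if root not in PurePosixPath(p).parents]


# Hand-written copy of the TLS listener from the virtual gateway spec.
spec = {
    "listeners": [
        {
            "portMapping": {"port": 8443, "protocol": "http"},
            "tls": {
                "certificate": {
                    "file": {
                        "certificateChain": "/certs/vas-api-service.example.com.crt",
                        "privateKey": "/certs/new.key",
                    }
                },
                "mode": "PERMISSIVE",
            },
        },
        {"portMapping": {"port": 8080, "protocol": "http"}},
    ]
}
paths = certificate_files(spec)
print(paths)
print(outside_mount(paths))  # []
```

The same walk works on the virtual node spec, since its trust configuration uses the same `certificateChain` key.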

Gateway Routes

A gateway route is set up to route HTTP traffic through to the virtual service defined below.

vas-api-service-route:

meshName: VAS
virtualGatewayName: om-vas-vgw
gatewayRouteName: vas-api-service-route
spec:
  httpRoute:
    action:
      rewrite:
        hostname:
          defaultTargetHostname: DISABLED
        prefix:
          defaultPrefix: ENABLED
      target:
        virtualService:
          virtualServiceName: om-vas-api-vsvc
    match:
      port: 8443
      prefix: /

The virtual service is connected to a virtual node through the configuration below.
om-vas-api-vsvc:

meshName: VAS
virtualServiceName: om-vas-api-vsvc
spec:
  provider:
    virtualNode:
      virtualNodeName: om-vas-api-server-vnode

Virtual Node:

The virtual node passes traffic through to the application on port 34559, as shown below.

meshName: VAS
virtualNodeName: om-vas-api-server-vnode
spec:
  backendDefaults:
    clientPolicy:
      tls:
        enforce: false
        ports: []
        validation:
          trust:
            file:
              certificateChain: /certs/vas-api-service.example.com.crt
  backends: []
  listeners:
    - healthCheck:
        healthyThreshold: 3
        intervalMillis: 10000
        path: /
        port: 34559
        protocol: tcp
        timeoutMillis: 5000
        unhealthyThreshold: 2
      portMapping:
        port: 34559
        protocol: tcp
  logging: {}
  serviceDiscovery:
    awsCloudMap:
      attributes: []
      namespaceName: example.com
      serviceName: vas-api-service
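
The chain of references across these specs (gateway route, virtual service, virtual node, CloudMap service) can be traced with a small sketch. This models the configuration as plain dictionaries for illustration only; it is not the AppMesh API, and the `resolve` helper and dictionary layout are assumptions.

```python
# Names and ports copied from the specs above.
gateway_routes = {
    "vas-api-service-route": {
        "port": 8443,
        "prefix": "/",
        "virtual_service": "om-vas-api-vsvc",
    }
}
virtual_services = {"om-vas-api-vsvc": "om-vas-api-server-vnode"}
virtual_nodes = {
    "om-vas-api-server-vnode": {
        "namespace": "example.com",
        "service": "vas-api-service",
        "port": 34559,
    }
}


def resolve(route_name: str) -> str:
    """Follow a gateway route down to the CloudMap name and port it targets."""
    route = gateway_routes[route_name]
    node_name = virtual_services[route["virtual_service"]]
    node = virtual_nodes[node_name]
    return f"{node['service']}.{node['namespace']}:{node['port']}"


print(resolve("vas-api-service-route"))  # vas-api-service.example.com:34559
```

A misspelled name at any level raises a `KeyError` in this sketch, which mirrors the kind of broken reference that would leave the route without a reachable backend.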

Virtual Node Listeners:

A visual representation is as follows:

CloudMap

CloudMap provides service discovery for our resources. We start with a namespace, which can be used for API calls and DNS queries within the VPC.
We have created a namespace to house our collective resources.

Here we can see the Service Instances that ECS tasks are reporting back to us.

If we look at one of them, we can see the information that will inform AppMesh.
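
Since the namespace answers DNS queries inside the VPC, the registered instances can also be listed from a host such as the bastion with an ordinary lookup. The sketch below assumes the service resolves as `vas-api-service.example.com` (the service name joined with the namespace name); the function name is hypothetical.

```python
import socket


def resolve_ipv4(name: str, port: int) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that a name resolves to."""
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


# Example call from inside the VPC (not executed here):
# resolve_ipv4("vas-api-service.example.com", 34559)
```

With two tasks running behind the service, one would expect two addresses in the result, one per task network interface.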

Confirming traffic flow

Running the following connection tests through a Bastion allows us to stay within the same internal network for all tests.

First we call the service directly on ECS to confirm that the certificate is accepted:

sh-4.4$ curl -I https://vas-api-service.example.com:34559/swagger-ui/
HTTP/1.1 200 OK
Last-Modified: Wed, 20 Jul 2022 13:15:06 GMT
Content-Length: 3129
Accept-Ranges: bytes
Content-Type: text/html

Then we test that the front-end service connects successfully through the full chain, starting with Route53:

sh-4.4$ curl -I https://vas.example.com/swagger-ui/
HTTP/1.1 200 OK
Last-Modified: Wed, 20 Jul 2022 13:15:06 GMT
Content-Length: 3129
Accept-Ranges: bytes
Content-Type: text/html

Finally, we confirm that a connection made directly to the load balancer's own DNS name is rejected, because the certificate does not cover that host name:

sh-4.4$ curl -I https://om-vas-service-nlb-be13b4dccxxxxxx.elb.af-south-1.amazonaws.com/swagger-ui/
curl: (51) SSL: no alternative certificate subject name matches target host name 'om-vas-service-nlb-be13b4dccxxxxx.elb.af-south-1.amazonaws.com'
sh-4.4$