AWS GovCloud (US) is an isolated instance of AWS designed for customers with specific US compliance requirements. AWS GovCloud (US) does have some differences from the commercial regions when it comes to tooling.

AWS GovCloud (US) gives government customers and their partners the flexibility to architect secure cloud solutions that comply with the FedRAMP High baseline; the DOJ’s Criminal Justice Information Systems (CJIS) Security Policy; U.S. International Traffic in Arms Regulations (ITAR); Export Administration Regulations (EAR); Department of Defense (DoD) Cloud Computing Security Requirements Guide (SRG) for Impact Levels 2, 4 and 5; FIPS 140-2; IRS-1075; and other compliance regimes.

In this post, I will walk through deploying Kubernetes to AWS GovCloud (US) using kops and configuring the AWS VPC CNI Driver for integrated networking. kops is an open source tool for deploying Kubernetes clusters to the cloud.

This guide uses bash to deploy and configure the cluster. kops also supports Terraform and CloudFormation, but those are outside the scope of this post.
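
This guide also assumes you already have the AWS CLI, kops, kubectl, and Docker installed (the ECR login step later on relies on aws ecr get-login-password, which requires a recent AWS CLI). A quick sanity check:

# confirm the tools used throughout this guide are available
aws --version
kops version
kubectl version --client
docker --version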

1. Set Up CLI

In order to deploy, we will need access to a standard (commercial region) AWS account that has access to us-west-2. The registries for the VPC CNI driver are hosted in us-west-2 so we will need to authenticate with them in order to copy our images over to GovCloud. We will also need to identify which GovCloud region we wish to deploy into.

First, define variables for the two regions we will be using for this deployment (we are only deploying into GovCloud).

# define the regions we will be using throughout this guide
export COM_REGION=us-west-2
export GOV_REGION=us-gov-west-1

Next, let’s configure our local CLI for each region. My standard region is configured as sandbox and my GovCloud profile is named govcloud. These profiles are configured with access to my respective accounts. Let’s also capture the ID of our GovCloud account. We will use this later.

# define the local aws cli profiles we will be using
export COM_PROFILE=sandbox # name of AWS profile with commercial region access
export GOV_PROFILE=govcloud # name of AWS profile with GovCloud access
export GOV_ACCOUNT_ID=$(aws sts get-caller-identity --query="Account" --region=$GOV_REGION --profile=$GOV_PROFILE --output=text)
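
If you haven’t created these profiles yet, aws configure will prompt you for credentials for each one (the profile names here are just the ones I use; substitute your own):

# create or update the two named profiles interactively
aws configure --profile sandbox
aws configure --profile govcloud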

Next, let’s store some variables that we will use for kops. kops uses an S3 bucket to store state and shared configuration files. We will create the bucket later; for now, create a unique bucket name, define the name of your cluster (this should be a DNS name), and pull the latest Ubuntu 18.04 LTS AMI from the GovCloud region.

export KOPS_BUCKET_NAME="kops-state-store-123456" # replace 123456 with random string for unique bucket name
export KOPS_NAME="k8s.local" # replace with the dns name of your cluster
export KOPS_IMAGE=$(aws ec2 describe-images --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" --query="sort_by(Images, &CreationDate)[-1].ImageId" --profile=$GOV_PROFILE --region=$GOV_REGION --output=text)
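
It’s worth sanity-checking that the AMI lookup succeeded before moving on; an empty or None value here will cause the kops create command later in this guide to fail:

# should print a single AMI ID, e.g. ami-0123456789abcdef0
echo "$KOPS_IMAGE"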

Finally, we will create the S3 bucket that our configuration will be stored in. I am also enabling versioning and encryption.

aws s3api create-bucket \
  --bucket "$KOPS_BUCKET_NAME" \
  --acl "private" \
  --create-bucket-configuration="LocationConstraint=$GOV_REGION" \
  --region "$AWS_REGION" \
  --profile "$GOV_PROFILE"

aws s3api put-bucket-versioning \
  --bucket "$KOPS_BUCKET_NAME" \
  --versioning-configuration Status=Enabled \
  --region "$GOV_REGION" \
  --profile "$GOV_PROFILE"

aws s3api put-bucket-encryption \
  --bucket "$KOPS_BUCKET_NAME" \
  --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}' \
  --region "$GOV_REGION" \
  --profile "$GOV_PROFILE"

We are all set up and ready to move on to the next step.

2. Set Up Registry

As I mentioned earlier, the container images we need are not available within GovCloud, so we need to copy them over to our own registry.

Note: GovCloud is isolated so we can’t use our IAM credentials from GovCloud to authenticate to the commercial region. So we have to use the commercial account we configured earlier to do the copy.

To start, let’s create an ECR repository for the VPC CNI driver. We will also enable image scanning so that we can be alerted if there are any vulnerabilities that we need to patch down the road.

aws ecr create-repository \
  --region=$GOV_REGION \
  --profile=$GOV_PROFILE \
  --repository-name "amazon-k8s-cni" \
  --image-scanning-configuration "{ \"scanOnPush\": true }"
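
The repository URI returned by the call below is the prefix we will push to in a moment:

# confirm the repository exists and note its repositoryUri
aws ecr describe-repositories --repository-names "amazon-k8s-cni" --region=$GOV_REGION --profile=$GOV_PROFILE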

Next, we will authenticate to the commercial region and GovCloud region ECR registries.

aws ecr get-login-password --region=$COM_REGION --profile=$COM_PROFILE | docker login -u AWS --password-stdin 602401143452.dkr.ecr.$COM_REGION.amazonaws.com

aws ecr get-login-password --region=$GOV_REGION --profile=$GOV_PROFILE | docker login -u AWS --password-stdin $GOV_ACCOUNT_ID.dkr.ecr.$GOV_REGION.amazonaws.com

Finally, we will pull the latest version of the CNI driver, retag it for our GovCloud region, and push it to our new GovCloud registry.

docker pull 602401143452.dkr.ecr.$COM_REGION.amazonaws.com/amazon-k8s-cni:v1.6.1

docker tag 602401143452.dkr.ecr.$COM_REGION.amazonaws.com/amazon-k8s-cni:v1.6.1 $GOV_ACCOUNT_ID.dkr.ecr.$GOV_REGION.amazonaws.com/amazon-k8s-cni:v1.6.1

docker push $GOV_ACCOUNT_ID.dkr.ecr.$GOV_REGION.amazonaws.com/amazon-k8s-cni:v1.6.1
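
To double-check that the push (and the scan-on-push we enabled earlier) succeeded, list the images in the new repository:

# the v1.6.1 tag should appear, along with its scan status
aws ecr describe-images --repository-name "amazon-k8s-cni" --region=$GOV_REGION --profile=$GOV_PROFILE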

We are all set to deploy our cluster.

3. Deploy Cluster

We have now set up our S3 bucket to store our configuration and copied our container images, so we are ready to deploy our cluster. For this example, I am deploying a highly available cluster across all three availability zones within GovCloud and launching 6 worker nodes.

Notice we are using dns=private. By default, kops uses Route 53 to configure publicly resolvable DNS endpoints for your cluster. GovCloud (US) doesn’t provide public DNS, so we must use a private hosted zone. We are also using topology=private, which deploys the cluster to entirely private subnets with no public access other than the API server for remote administration.

Feel free to customize the counts or instance types below. Once you are all set, run the following command to generate the cluster configuration. This will not deploy any infrastructure.

kops create cluster --name="$KOPS_NAME" \
  --state="s3://$KOPS_BUCKET_NAME" \
  --image="$KOPS_IMAGE" \
  --zones="${GOV_REGION}a,${GOV_REGION}b,${GOV_REGION}c" \
  --cloud="aws" \
  --dns="private" \
  --master-count="3" \
  --encrypt-etcd-storage \
  --master-size "t3.large" \
  --master-zones "${GOV_REGION}a,${GOV_REGION}b,${GOV_REGION}c" \
  --node-size="t3.large" \
  --node-count 6 \
  --networking="amazon-vpc-routed-eni" \
  --topology="private"

Before we deploy, we need to make a few tweaks to the configuration. Run the following command to edit the cluster configuration. It will open the config in your default editor (I have mine configured to be Visual Studio Code).

kops edit cluster --name="$KOPS_NAME" --state="s3://$KOPS_BUCKET_NAME"
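
kops should pick up the standard EDITOR environment variable, so if it launches the wrong editor you can set that before running the command above. For example, for Visual Studio Code:

# open edits in VS Code and block until the file is closed
export EDITOR="code --wait"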

First, we need to edit the networking configuration to use the image that we copied over to our registry. Replace ${GOV_ACCOUNT_ID} and ${GOV_REGION} with the values for your configuration.

spec:
  networking:
    amazonvpc:
      imageName: ${GOV_ACCOUNT_ID}.dkr.ecr.${GOV_REGION}.amazonaws.com/amazon-k8s-cni:v1.6.1
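
If the shell variables from step 1 are still set, you can print the exact string to paste in rather than substituting by hand:

# emits the fully qualified imageName for the config above
echo "${GOV_ACCOUNT_ID}.dkr.ecr.${GOV_REGION}.amazonaws.com/amazon-k8s-cni:v1.6.1"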

Next, we need to add additional policies for AWS Systems Manager. This will enable us to access the instances in the cluster without having SSH enabled, since this cluster is designed to be entirely private. Simply copy and paste the configuration below to the bottom of the configuration file.

spec:
  additionalPolicies:
    master: |
      [
        {
          "Effect":"Allow",
          "Action":[
            "ssm:DescribeAssociation",
            "ssm:GetDeployablePatchSnapshotForInstance",
            "ssm:GetDocument",
            "ssm:DescribeDocument",
            "ssm:GetManifest",
            "ssm:GetParameters",
            "ssm:ListAssociations",
            "ssm:ListInstanceAssociations",
            "ssm:PutInventory",
            "ssm:PutComplianceItems",
            "ssm:PutConfigurePackageResult",
            "ssm:UpdateAssociationStatus",
            "ssm:UpdateInstanceAssociationStatus",
            "ssm:UpdateInstanceInformation"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ssmmessages:CreateControlChannel",
            "ssmmessages:CreateDataChannel",
            "ssmmessages:OpenControlChannel",
            "ssmmessages:OpenDataChannel"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ec2messages:AcknowledgeMessage",
            "ec2messages:DeleteMessage",
            "ec2messages:FailMessage",
            "ec2messages:GetEndpoint",
            "ec2messages:GetMessages",
            "ec2messages:SendReply"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "cloudwatch:PutMetricData"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ec2:DescribeInstanceStatus"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ds:CreateComputer",
            "ds:DescribeDirectories"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:DescribeLogGroups",
            "logs:DescribeLogStreams",
            "logs:PutLogEvents"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "s3:GetBucketLocation",
            "s3:PutObject",
            "s3:GetObject",
            "s3:GetEncryptionConfiguration",
            "s3:AbortMultipartUpload",
            "s3:ListMultipartUploadParts",
            "s3:ListBucket",
            "s3:ListBucketMultipartUploads"
          ],
          "Resource":[
            "*"
          ]
        }
      ]      
    node: |
      [
        {
          "Effect":"Allow",
          "Action":[
            "ssm:DescribeAssociation",
            "ssm:GetDeployablePatchSnapshotForInstance",
            "ssm:GetDocument",
            "ssm:DescribeDocument",
            "ssm:GetManifest",
            "ssm:GetParameters",
            "ssm:ListAssociations",
            "ssm:ListInstanceAssociations",
            "ssm:PutInventory",
            "ssm:PutComplianceItems",
            "ssm:PutConfigurePackageResult",
            "ssm:UpdateAssociationStatus",
            "ssm:UpdateInstanceAssociationStatus",
            "ssm:UpdateInstanceInformation"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ssmmessages:CreateControlChannel",
            "ssmmessages:CreateDataChannel",
            "ssmmessages:OpenControlChannel",
            "ssmmessages:OpenDataChannel"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ec2messages:AcknowledgeMessage",
            "ec2messages:DeleteMessage",
            "ec2messages:FailMessage",
            "ec2messages:GetEndpoint",
            "ec2messages:GetMessages",
            "ec2messages:SendReply"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "cloudwatch:PutMetricData"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ec2:DescribeInstanceStatus"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "ds:CreateComputer",
            "ds:DescribeDirectories"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:DescribeLogGroups",
            "logs:DescribeLogStreams",
            "logs:PutLogEvents"
          ],
          "Resource":[
            "*"
          ]
        },
        {
          "Effect":"Allow",
          "Action":[
            "s3:GetBucketLocation",
            "s3:PutObject",
            "s3:GetObject",
            "s3:GetEncryptionConfiguration",
            "s3:AbortMultipartUpload",
            "s3:ListMultipartUploadParts",
            "s3:ListBucket",
            "s3:ListBucketMultipartUploads"
          ],
          "Resource":[
            "*"
          ]
        }
      ]      

That’s it! We are ready to deploy the cluster to GovCloud. Run the following command to deploy it. It will take about 10 minutes to spin up the infrastructure and another 5-10 minutes for the cluster to be up and ready.

kops update cluster --name="$KOPS_NAME" --state="s3://$KOPS_BUCKET_NAME" --yes
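
While you wait, kops can report on the cluster’s progress. The command below lists the instance groups and nodes and reports failures until everything is healthy (depending on your kops version, a --wait flag may also be available to poll automatically):

kops validate cluster --name="$KOPS_NAME" --state="s3://$KOPS_BUCKET_NAME"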

Once the deployment is complete, you should be able to run the following command and see a list of nodes in your cluster! You will also be able to log in to AWS Session Manager and start shell sessions on all of the hosts within the cluster.

kubectl get nodes
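
To confirm the Systems Manager access we granted, you can also open a shell on any instance in the cluster straight from the CLI. This assumes the Session Manager plugin for the AWS CLI is installed; the instance ID below is a placeholder:

# replace the target with an instance ID from your cluster
aws ssm start-session --target "i-0123456789abcdef0" --region=$GOV_REGION --profile=$GOV_PROFILE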

Your cluster is up and running and you are ready to start deploying workloads. Some additional items you may want to configure are aws-iam-authenticator, kube2iam, and the ALB Ingress Controller.

Thanks for following along, if you have any issues don’t hesitate to reach out on Twitter!