As cloud infrastructure becomes increasingly complex, many DevOps teams use AWS with Atlantis to automate Terraform workflows. This open-source tool links Git pull requests to Terraform operations. It helps teams improve Infrastructure as Code practices across different environments. It also helps maintain governance on a large scale.
Terraform is widely adopted for provisioning AWS infrastructure—but as environments grow, teams encounter new layers of complexity:
- Multiple DevOps teams making concurrent changes
- Hundreds of thousands of resources across accounts
- Complex dependencies between modules and services
- Security, IAM, and compliance constraints
- Need for consistent, auditable deployments at scale
Many teams start with Atlantis—but as infrastructure scales, so do the limitations. This post is your deep-dive guide to scaling Terraform on AWS with Atlantis—and making it work in high-scale, multi-team environments.
What is Atlantis?
Atlantis is an open-source tool that automates the Terraform workflow using pull requests. It bridges your version control system (GitHub, GitLab, or Bitbucket) and Terraform execution and enables collaborative infrastructure development.
How Atlantis Works with Terraform
Atlantis listens for webhook events in your repository hosting service. When a pull request modifies Terraform configuration files, Atlantis automatically:
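- Runs terraform plan in the directories whose files changed
- Posts the plan output as a comment directly on the pull request
- Lets reviewers apply the changes by commenting on the pull request (atlantis apply)
- Locks the affected directory and workspace so that concurrent pull requests cannot change the same state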
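To make the first step concrete, here is a rough sketch of how changed files map to the directories that need a plan. It only approximates the behavior described above and is not Atlantis's actual implementation; the function name changed_tf_dirs is hypothetical.

```shell
# Hypothetical helper: reads changed file paths on stdin and prints,
# once each, the directories that contain a changed .tf file.
changed_tf_dirs() {
  grep -E '\.tf$' | while IFS= read -r path; do
    dirname "$path"
  done | sort -u
}
```

In a checkout, something like `git diff --name-only origin/main...HEAD | changed_tf_dirs` would list the candidate directories, assuming the base branch is named main.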
[Diagram: where Atlantis fits within the pull request workflow]

Key Features of Atlantis:
- Pull Request-based Workflow: Atlantis syncs your Git repository and automatically triggers Terraform runs on open or updated pull requests.
- Approval Process: Atlantis supports approval requirements, so teams can review Terraform plans before they are applied and confirm that changes are compliant and secure.
- Multi-Tenant Support: Atlantis handles multiple Terraform configurations for different environments, so teams can work in parallel without affecting each other.
- State Locking: Atlantis locks a directory and workspace while a pull request has a pending plan, and the Terraform backend's own state locking prevents concurrent runs from overwriting each other.
To see how Atlantis compares to other Terraform automation tools, check out our in-depth Atlantis alternatives guide.
5 Best Practices for Scaling Terraform with AWS Atlantis
Before diving into Terraform scaling on AWS with Atlantis, you need to understand some basics about the tool. Here are five key points about Atlantis to help you start scaling your Terraform workflow:
1. Use Terraform Workspaces for Multi-Environment
When managing a large AWS footprint, split your infrastructure into multiple environments (e.g., dev, staging, production). Terraform workspaces work well with Atlantis: each workspace keeps its own state file, so a single codebase can serve every environment.
Example of Workspace Configuration:
terraform workspace new dev
terraform workspace select dev
terraform apply -var="environment=dev"
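Running terraform workspace new fails when the workspace already exists, which matters once these commands run repeatedly in automation. A minimal sketch of a select-or-create wrapper follows; the function name ensure_workspace is an assumption, not something Terraform or Atlantis provides.

```shell
# Hypothetical helper: switch to a workspace, creating it only if it is missing.
ensure_workspace() {
  local ws="$1"
  # select fails when the workspace does not exist yet; fall back to new
  terraform workspace select "$ws" >/dev/null 2>&1 || terraform workspace new "$ws"
}
```

With this in place, `ensure_workspace dev && terraform apply -var="environment=dev"` behaves the same on the first run and on every later run.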
2. Custom Workflows for Complex Pipelines
Atlantis's default workflow (plan → apply) works for simple cases, but complex infrastructure often requires custom steps:
Custom workflow definition in atlantis.yaml:
workflows:
  custom:
    plan:
      steps:
        - run: terraform init -input=false
        - run: terraform validate
        - run: terraform plan -input=false -out=$PLANFILE
        - run: aws s3 cp $PLANFILE s3://terraform-audit-bucket/plans/$WORKSPACE-$PULL_NUM.tfplan
    apply:
      steps:
        - run: terraform apply -input=false $PLANFILE
        # USER_NAME is the VCS user who ran the command; $USER would be the OS user running Atlantis
        - run: ./notify-slack.sh "Applied changes to $WORKSPACE by $USER_NAME"
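One thing to watch in the audit upload step: a key built only from $WORKSPACE and $PULL_NUM can collide when several projects in the same pull request share a workspace such as default. The sketch below builds a key that also includes the project directory. The function name plan_audit_key and the key layout are assumptions; REPO_REL_DIR, WORKSPACE, and PULL_NUM are environment variables Atlantis sets for run steps.

```shell
# Hypothetical helper: build a per-project S3 key for an uploaded plan file.
plan_audit_key() {
  local dir="${REPO_REL_DIR:-.}"
  local slug
  # accounts/production/networking -> accounts-production-networking
  slug=$(printf '%s' "$dir" | tr '/' '-')
  if [ "$slug" = "." ]; then slug="root"; fi
  printf 'plans/pr-%s/%s-%s.tfplan\n' "$PULL_NUM" "$slug" "${WORKSPACE:-default}"
}
```

A run step could then call a small script that does `aws s3 cp "$PLANFILE" "s3://terraform-audit-bucket/$(plan_audit_key)"`.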
3. Handling State Files Securely
Managing Terraform state becomes critical at scale, and Atlantis works best with remote state storage:
terraform {
  backend "s3" {
    # Backend blocks cannot reference variables, so the bucket name is a literal here.
    # Per-environment values can be supplied with terraform init -backend-config=...
    bucket         = "terraform-state-dev"
    key            = "network/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
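Because Terraform does not allow variable interpolation inside a backend block, per-environment bucket names are usually passed at init time with -backend-config. A minimal sketch, assuming the three environment names used earlier and a bucket naming scheme of terraform-state-ENV; the function name backend_init_args is hypothetical.

```shell
# Hypothetical helper: print the -backend-config argument for one environment.
backend_init_args() {
  local env="$1"
  case "$env" in
    dev|staging|production) ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  printf '%s\n' "-backend-config=bucket=terraform-state-$env"
}
```

Usage would look like `terraform init -input=false $(backend_init_args dev)`, with the bucket attribute left out of the backend block itself.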
4. Security and Access Control for Atlantis
Atlantis can use SSH and IAM roles to secure its communication with AWS. It also lets you restrict who can approve and apply Terraform plans, which adds security and accountability. You can attach an AWS IAM role to the Atlantis host so that it reaches AWS resources securely:
resource "aws_iam_role" "atlantis" {
  name = "atlantis-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "atlantis_policy" {
  role = aws_iam_role.atlantis.name
  # PowerUserAccess is very broad; consider a narrower policy for production use
  policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess"
}
Assuming Different Roles for Different Environments
# In your provider configuration
provider "aws" {
  region = "us-west-2"

  assume_role {
    role_arn = "arn:aws:iam::${var.account_id}:role/TerraformExecutionRole"
  }
}
5. Automating Terraform Plans and Applies
Once Atlantis is connected to your Git repository, terraform plan runs automatically for every opened or updated pull request. Atlantis can also apply the Terraform changes once the pull request has been approved, which removes the need to run Terraform inside a separate CI/CD pipeline.
AWS Atlantis Challenges When Scaling Terraform
1. Slow Plan and Apply Times
As infrastructure grows, Terraform operations slow down. Plans for large configurations can take 5 to 10 minutes or longer and become a bottleneck.
Solution: Use Workspace Splitting
Divide monolithic designs into separate, focused work areas:
atlantis.yaml with parallel execution:
version: 3
parallel_plan: true
parallel_apply: true
projects:
  - name: networking
    dir: networking
  - name: databases
    dir: databases
  - name: compute
    dir: compute
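As the number of split projects grows, some teams generate this file rather than maintain it by hand. The following is a minimal sketch that assumes flat directory names that double as project names; the function name render_projects is hypothetical.

```shell
# Hypothetical helper: print an atlantis.yaml with one project per directory argument.
render_projects() {
  printf 'version: 3\nparallel_plan: true\nparallel_apply: true\nprojects:\n'
  for dir in "$@"; do
    printf '  - name: %s\n    dir: %s\n' "$dir" "$dir"
  done
}
```

Calling `render_projects networking databases compute > atlantis.yaml` would reproduce the configuration above.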
2. Managing Permissions Across Multiple AWS Accounts
In the case of multiple AWS accounts, managing permissions becomes complex.
Solution: Use Cross-Account Role Assumption
Create roles in each account that Atlantis can assume:
resource "aws_iam_role" "terraform_execution_role" {
  name = "terraform-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        AWS = "arn:aws:iam::${var.atlantis_account_id}:role/atlantis-role"
      }
    }]
  })
}
# In your provider configuration
provider "aws" {
  alias  = "production"
  region = "us-west-2"

  assume_role {
    role_arn = "arn:aws:iam::${var.production_account_id}:role/terraform-execution-role"
  }
}
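With many accounts, a malformed account ID in a role ARN is an easy mistake that only surfaces when the role assumption fails. The sketch below builds the ARN and rejects anything that is not a 12-digit account ID. The function name terraform_role_arn is hypothetical, and the default role name simply mirrors the one used above.

```shell
# Hypothetical helper: build the execution role ARN for an account after
# checking that the account ID is exactly 12 digits.
terraform_role_arn() {
  local account_id="$1"
  local role_name="${2:-terraform-execution-role}"
  case "$account_id" in
    ""|*[!0-9]*) echo "account ID must be numeric: $account_id" >&2; return 1 ;;
  esac
  if [ "${#account_id}" -ne 12 ]; then
    echo "account ID must have 12 digits: $account_id" >&2
    return 1
  fi
  printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
}
```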
3. Managing Terraform Version Compatibility
As your infrastructure expands, it becomes challenging to manage Terraform version updates.
Solution: Use Terraform Version Control with Atlantis
#atlantis.yaml
version: 3
projects:
  - name: legacy-system
    dir: legacy
    terraform_version: 0.14.11
  - name: new-system
    dir: new
    terraform_version: 1.5.7
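When projects are pinned to different versions, it can help to flag the ones that fall below a minimum you want to support. A minimal sketch, assuming a sort that supports -V (version sort), as GNU coreutils does; the function name version_ge is hypothetical.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For the two projects above, `version_ge 1.5.7 1.0.0` succeeds while `version_ge 0.14.11 1.0.0` fails, which marks legacy-system as an upgrade candidate.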
4. Sensitive Variable Control
Managing secrets securely with Terraform and Atlantis requires careful consideration.
Solution: AWS Secrets Manager Integration
Create a wrapper script for Terraform that fetches secrets:
#!/bin/bash
# fetch-secrets.sh
set -euo pipefail

# Get database password from Secrets Manager
DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id db/password --query SecretString --output text)

# Export as environment variable for Terraform
export TF_VAR_db_password="$DB_PASSWORD"

# Execute terraform with all arguments passed to this script
terraform "$@"

Then update your Atlantis workflow:
workflows:
  secure:
    plan:
      steps:
        - run: ./fetch-secrets.sh init -input=false
        - run: ./fetch-secrets.sh plan -input=false -out=$PLANFILE
    apply:
      steps:
        - run: ./fetch-secrets.sh apply -input=false $PLANFILE
How Teams Automate Workflows to Scale Terraform Deployments on AWS
Step 1: Implement Repository Structure for Scale
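Organize your Terraform code for maximum parallelization and clear ownership, for example with one directory per account and component (such as accounts/production/networking) and shared modules under a top-level modules/ directory.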
Step 2: Set Up Advanced Atlantis Configuration
#atlantis.yaml
version: 3
automerge: true
delete_source_branch_on_merge: true
parallel_plan: true
parallel_apply: true
workflows:
  production:
    plan:
      steps:
        - run: terraform init -input=false
        - run: terraform validate
        - run: terraform plan -input=false -out=$PLANFILE
        - run: ./policy-check.sh
    apply:
      steps:
        - run: ./pre-apply-checks.sh
        - run: terraform apply -input=false $PLANFILE
        - run: ./post-apply-validation.sh
        # USER_NAME is the VCS user who ran the command
        - run: ./notify-teams.sh "$WORKSPACE changes applied by $USER_NAME"
projects:
  - name: prod-network
    dir: accounts/production/networking
    workflow: production
    autoplan:
      when_modified: ["*.tf", "../../../modules/networking/**/*.tf"]
    apply_requirements: ["approved", "mergeable"]
  - name: prod-databases
    dir: accounts/production/databases
    workflow: production
    autoplan:
      when_modified: ["*.tf", "../../../modules/database/**/*.tf"]
    apply_requirements: ["approved", "mergeable"]
# Additional projects would be defined similarly
Step 3: Implement Dependency Management
Create a script to manage dependencies between projects:
#!/bin/bash
# dependency-manager.sh

# Define dependencies (associative arrays require bash 4 or newer)
declare -A dependencies
dependencies["prod-compute"]="prod-network prod-databases"
dependencies["staging-compute"]="staging-network staging-databases"

# Check if dependencies have been successfully applied
check_dependency() {
  local dependency=$1
  # Placeholder endpoint: point this at whatever source of project status your setup exposes
  local status
  status=$(curl -s "http://atlantis-server:4141/api/projects/$dependency" | jq -r '.status')
  if [[ "$status" == "applied" ]]; then
    return 0
  else
    return 1
  fi
}

# Check all dependencies for the current project
PROJECT_NAME=${1:?usage: dependency-manager.sh PROJECT_NAME}
if [[ -n "${dependencies[$PROJECT_NAME]}" ]]; then
  for dep in ${dependencies[$PROJECT_NAME]}; do
    if ! check_dependency "$dep"; then
      echo "Dependency $dep is not in applied state. Cannot proceed."
      exit 1
    fi
  done
fi

# If we get here, all dependencies are met
echo "All dependencies satisfied, proceeding with Terraform operation"
exit 0
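Besides blocking a project until its dependencies are applied, the same dependency map can be turned into an apply order. The sketch below uses tsort from coreutils for the topological sort. The input format (one "project: dep1 dep2" line per project) and the function name apply_order are assumptions made for illustration.

```shell
# Hypothetical helper: read "project: dep1 dep2 ..." lines on stdin and print
# the projects in an order where every dependency precedes its dependents.
apply_order() {
  while read -r project deps; do
    project="${project%:}"
    # A self-pair registers the project with tsort even if it has no dependencies
    printf '%s %s\n' "$project" "$project"
    for dep in $deps; do
      printf '%s %s\n' "$dep" "$project"
    done
  done | tsort
}
```

For the production dependencies above, prod-compute is printed last; the relative order of prod-network and prod-databases is not significant because neither depends on the other.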
Step 4: Implement Drift Detection
Create a scheduled task to detect infrastructure drift:
resource "aws_cloudwatch_event_rule" "drift_detection" {
  name                = "terraform-drift-detection"
  description         = "Triggers Terraform drift detection"
  schedule_expression = "cron(0 4 * * ? *)" # Run daily at 4 AM UTC (six fields are required)
}

resource "aws_cloudwatch_event_target" "drift_detection_lambda" {
  rule      = aws_cloudwatch_event_rule.drift_detection.name
  target_id = "DriftDetectionLambda"
  arn       = aws_lambda_function.drift_detection.arn
}

# The deployment package arguments and the aws_lambda_permission that lets
# the event rule invoke the function are omitted here for brevity.
resource "aws_lambda_function" "drift_detection" {
  function_name = "terraform-drift-detection"
  role          = aws_iam_role.drift_detection_lambda.arn
  handler       = "index.handler"
  runtime       = "nodejs20.x" # nodejs16.x has been deprecated by AWS Lambda
  timeout       = 300

  environment {
    variables = {
      ATLANTIS_URL = "https://atlantis.controlmonkey.com"
      # Terraform does not resolve CloudFormation-style {{resolve:...}} references,
      # so pass the secret ID and let the function read the token at runtime.
      GITHUB_TOKEN_SECRET_ID = "github/token"
    }
  }
}
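Whatever runs on that schedule ultimately needs a way to tell drift apart from a clean plan. terraform plan -detailed-exitcode provides this: it exits with 0 when there are no changes, 1 on error, and 2 when changes are present. A minimal sketch of interpreting that code follows; the function name and the labels it prints are hypothetical.

```shell
# Hypothetical helper: map the exit code of
# `terraform plan -detailed-exitcode` to a drift status label.
classify_plan_exit() {
  case "$1" in
    0) echo "in-sync" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

# Example use inside a scheduled job (commented out so the sketch has no side effects):
# terraform plan -detailed-exitcode -input=false -lock=false >/dev/null
# classify_plan_exit "$?"
```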
Step 5: Implement Approval Workflows with AWS Services
resource "aws_lambda_function" "approval_notification" {
  function_name = "terraform-approval-notification"
  role          = aws_iam_role.approval_lambda.arn
  handler       = "index.handler"
  runtime       = "nodejs20.x" # nodejs16.x has been deprecated by AWS Lambda

  environment {
    variables = {
      SNS_TOPIC_ARN = aws_sns_topic.terraform_approvals.arn
    }
  }
}

resource "aws_sns_topic" "terraform_approvals" {
  name = "terraform-approval-requests"
}

resource "aws_sns_topic_subscription" "approval_email" {
  topic_arn = aws_sns_topic.terraform_approvals.arn
  protocol  = "email"
  endpoint  = "infrastructure-team@example.com"
}

resource "aws_api_gateway_resource" "webhook" {
  rest_api_id = aws_api_gateway_rest_api.atlantis_extensions.id
  parent_id   = aws_api_gateway_rest_api.atlantis_extensions.root_resource_id
  path_part   = "webhook"
}

resource "aws_api_gateway_method" "webhook_post" {
  rest_api_id = aws_api_gateway_rest_api.atlantis_extensions.id
  resource_id = aws_api_gateway_resource.webhook.id
  http_method = "POST"
  # The argument is named authorization; NONE leaves the endpoint unauthenticated
  authorization = "NONE"
}
What If Atlantis with AWS Isn’t Enough?
If your team is managing thousands of Terraform resources, dozens of AWS accounts, or struggling with policy enforcement and visibility—you may have outgrown Atlantis.
While Atlantis is a solid open-source tool for automating Terraform plans and applies through pull requests, it wasn’t designed for enterprise-scale cloud governance. Teams scaling Terraform on AWS often face challenges around:
- Large, complex configurations
- Multi-account IAM permissions
- Policy enforcement and compliance gaps
- ClickOps and infrastructure drift
This is where a platform like ControlMonkey comes in—offering full visibility, automated drift detection, real-time policy enforcement, and Terraform CI/CD that works across cloud and code.
Infrastructure automation should grow with your cloud footprint. If Atlantis is slowing you down, it’s time to explore what’s next.
👉 Book a demo and see how ControlMonkey scales what Atlantis started.

FAQs
Atlantis helps DevOps teams automate Terraform workflows by triggering plan and apply via pull requests. When used with the AWS provider, it allows teams to apply changes across AWS accounts consistently—without embedding Terraform directly into CI/CD pipelines.
Atlantis wasn’t designed for large-scale, multi-account AWS environments. Teams often run into slow plan times, complex IAM role setups, and limited policy enforcement. For advanced use cases, many teams adopt additional tools to handle drift detection, security, and governance at scale.