The Definitive Guide For Terraform Drift Detection

Ori Yemini

CTO & Co-Founder

In the evolving world of Infrastructure as Code (IaC), Terraform has made its mark and is now a top choice for provisioning and managing cloud resources. As more organizations adopt Terraform for infrastructure automation, they face a pressing need: consistently ensuring that the infrastructure’s actual state matches its intended state. This challenge brings Terraform drift detection to the forefront.

What is Terraform drift? It’s when your live infrastructure deviates from what’s described in your Terraform configuration files. Such deviations can happen in multiple ways. One common cause is a change made outside Terraform’s oversight, such as a manual adjustment made via the console or the CLI. These changes exist in the cloud, but they’re absent from the Terraform code and state file, causing a mismatch.

Why does Terraform drift detection matter, especially for those managing AWS infrastructure?
The reasons are compelling. It helps you identify and tackle these mismatches. By doing so, you bolster security, ensure compliance, and boost both functionality and reliability. In this article, we’ll navigate these crucial areas.

What Does Terraform Drift Look Like?

Let’s start with the Terraform state file. The state file holds details about the resources that Terraform has created or manages. Within this file, you’ll find information like resource type, name, provider, configuration, and present state. Terraform relies on this state file to track infrastructure changes and ensure everything aligns with your configuration.

Terraform drift emerges when there’s a disparity between how Terraform understands a resource’s state (based on the state file) and how that resource actually exists.

Comparing Real-Time Infrastructure to the Terraform State File

So how do we identify drift? The Terraform state file serves as a snapshot of your infrastructure as Terraform last saw it.
By comparing this state file with the actual status of your infrastructure, you can pinpoint any changes made outside of Terraform’s purview, such as manual changes to the infrastructure. To do this, run the ‘terraform plan’ command. It refreshes the state in memory from the existing infrastructure and then compares it to the desired state defined in the Terraform configuration files. The output shows the differences between the actual state and the desired state. These include drift as well as intentional changes in the configuration that have not been applied yet.
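
The comparison described above can be sketched in a few lines of Python. This is a simplified illustration of the idea, not how Terraform itself is implemented: `find_drift`, `recorded`, and `live` are hypothetical names, and the two dictionaries stand in for the attributes stored in the state file and the attributes read back from the cloud provider.

```python
def find_drift(recorded: dict, live: dict) -> dict:
    """Return {attribute: (recorded_value, live_value)} for each attribute that differs."""
    drift = {}
    for key in sorted(set(recorded) | set(live)):
        if recorded.get(key) != live.get(key):
            drift[key] = (recorded.get(key), live.get(key))
    return drift


# Hypothetical attributes of one security group: what the state file
# recorded versus what the cloud provider reports right now.
recorded = {"name": "allow_ssh", "ingress_ports": [22]}
live = {"name": "allow_ssh", "ingress_ports": [22, 80, 443]}

print(find_drift(recorded, live))
# prints {'ingress_ports': ([22], [22, 80, 443])}
```

An empty result means the recorded and live attributes agree; any entry is a candidate drift to investigate.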


Spotting Terraform Drift in Action: A Practical Example

Let’s dive into a hands-on illustration to understand drift detection more concretely.

Step 1: Initial Configuration with Terraform

We use Terraform to set up a security group. In the initial code, only one rule is configured, bearing the description “SSH from VPC”.

Upon deploying this Terraform code, the AWS console reflects this setup.

Step 2: Manual Modifications in AWS Console

Imagine a scenario where someone, maybe due to urgent requirements, goes directly into the AWS console and adds two new rules to the security group: HTTP and HTTPS.

Step 3: Terraform’s Response to Manual Changes

After the manual changes, running the “terraform plan” command makes Terraform detect the drift between its state file and the current AWS setup. The terminal indicates “1 to change”, signifying this disparity.
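
For automation, the same result can be read from Terraform’s machine-readable plan output instead of the terminal text. The sketch below assumes the plan was saved with `terraform plan -out=tfplan` and exported with `terraform show -json tfplan`. The `sample_plan` string is a hand-written, heavily trimmed stand-in for that JSON, and the resource address in it is hypothetical.

```python
import json

# Map plan actions to the wording Terraform uses in its summary line.
LABELS = {"create": "add", "update": "change", "delete": "destroy"}


def summarize_plan(plan_json: str) -> dict:
    """Count planned actions found under `resource_changes` in the plan JSON."""
    plan = json.loads(plan_json)
    counts = {"add": 0, "change": 0, "destroy": 0}
    for resource_change in plan.get("resource_changes", []):
        # A replacement carries two actions (delete and create), so it is
        # counted once as "add" and once as "destroy".
        for action in resource_change["change"]["actions"]:
            if action in LABELS:
                counts[LABELS[action]] += 1
    return counts


# Hand-written sample, not real Terraform output.
sample_plan = """
{
  "resource_changes": [
    {"address": "aws_security_group.allow_ssh",
     "change": {"actions": ["update"]}}
  ]
}
"""

summary = summarize_plan(sample_plan)
print(f"{summary['add']} to add, {summary['change']} to change, {summary['destroy']} to destroy")
# prints 0 to add, 1 to change, 0 to destroy
```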


Reasons for Terraform Drift

1. Lack of Automation

Drift can arise when there’s an absence of systematic processes. When there’s no automation in place, infrastructure changes often happen manually, leading to potential errors and inconsistencies. Implementing a CI/CD (Continuous Integration/Continuous Deployment) system ensures changes are made consistently, tested, and deployed automatically. This minimizes drift by streamlining the creation, testing, and deployment of code. Furthermore, without a clear strategy for updating AWS infrastructure, ad hoc changes might go undocumented, laying the groundwork for drift.

2. Urgency of Hotfixes

When urgent issues arise, hotfixes can be a quick solution. However, when these fixes are applied manually, especially directly through the AWS console, they bypass regular procedures. Such changes introduce drift because they aren’t reflected in the Terraform code or state file. For example, an on-call team member might fix a resource configuration directly from the AWS console to address a production bug reported at 2 a.m.

3. Insufficient Team Training

A well-informed team is crucial for maintaining consistent infrastructure. If team members who are unfamiliar with Terraform opt to make updates directly through the AWS console, it creates a blind spot for Terraform. Since these console-based changes are not recorded in the Terraform code, drift can be introduced unintentionally.

4. Third-party Automation

Not all automation tools are created equal, and third-party automation software can be a source of Terraform drift. For instance, imagine a third-party security system that uses the AWS CLI or an AWS SDK to modify a rule in one of your AWS security groups. Such a tool has no access to your Terraform configuration files and cannot change the Terraform code, so its change is never recorded there. As a result, neither the code nor the Terraform state file reflects the current condition of your infrastructure, which causes drift.

Implications of Terraform Drift

Security at Risk

Infrastructure drift in cloud systems can lead to heightened security risks. For instance, security group rules might be manually adjusted for testing purposes, granting unintended public access.
The consequences? Potential data breaches, financial setbacks, compliance breaches, and damage to the company’s reputation.

Compliance in Jeopardy

Terraform drift can cause breaches in compliance, introducing significant risks that deeply impact a business. This unmanaged change might alter operational practices, compromise data integrity, or weaken security safeguards. For example, if drift results in the unintentional public disclosure of user data or unauthorized resource access, it can lead to major compliance concerns.

Financial Implications

Infrastructure drift can also have financial implications. Human-induced changes might increase operational costs.
Imagine a team member who changes an RDS DB instance type from “db.m5.xlarge” to “db.m5.4xlarge”.
This means the organization is going to pay roughly four times the expected price for that instance.
Let’s look at the numbers from the AWS pricing page for the above change to understand the cost implication better:

  • Pricing for db.m5.xlarge DB server:
  • Pricing for db.m5.4xlarge DB server:

In this example, the organization is going to pay an extra 10,000 USD due to the unintentional RDS instance type change. 
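
The arithmetic behind such an estimate is simple to script. The sketch below is illustrative only: `extra_monthly_cost` is a hypothetical helper, and the hourly rates are placeholders chosen to reflect the roughly fourfold ratio described above, not real AWS prices. Actual figures depend on region, database engine, and pricing model.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def extra_monthly_cost(old_hourly: float, new_hourly: float,
                       hours: float = HOURS_PER_MONTH) -> float:
    """Additional monthly spend caused by moving from old_hourly to new_hourly."""
    return (new_hourly - old_hourly) * hours


# Placeholder rates (NOT real AWS prices): the larger instance is assumed
# to cost four times as much per hour as the smaller one.
old_rate = 1.00
new_rate = 4.00

print(extra_monthly_cost(old_rate, new_rate))
# prints 2190.0
```

Replacing the placeholder rates with the current values from the AWS pricing page gives the real monthly impact of the unintended change.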

Irrelevance of Terraform Code

Significant drift can make Terraform code obsolete.
When Terraform code becomes outdated, the code you wrote to build and maintain your systems no longer matches how those systems actually look and behave.
The older and larger the drift gets, the less relevant your code becomes, until you can no longer use it to manage your infrastructure.

How to Detect Terraform Drifts

Stay one step ahead of infrastructure drifts using the terraform plan command.
Think of it as a health check-up for your setup. It assesses how your current infrastructure measures up against the blueprint laid out in your Terraform configurations.
Spot a mismatch? This command will point out the changes needed to bridge that gap.
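
To use this check in a script, `terraform plan` accepts a `-detailed-exitcode` flag: exit code 0 means the plan succeeded with no changes, 1 means an error occurred, and 2 means the plan succeeded and found changes. The sketch below wraps that in Python. It assumes the `terraform` binary is on the PATH and that the working directory has already been initialized with `terraform init`. The function names and status strings are illustrative.

```python
import subprocess


def interpret_exit_code(code: int) -> str:
    """Translate the exit code of `terraform plan -detailed-exitcode`."""
    if code == 0:
        return "in_sync"           # plan succeeded, no changes
    if code == 2:
        return "changes_detected"  # plan succeeded, changes present
    return "error"                 # 1 (or anything unexpected)


def check_drift(workdir: str, run=subprocess.run) -> str:
    """Run a plan in `workdir` and report whether changes were found.

    `run` is injectable so the function can be exercised without Terraform.
    """
    result = run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode)


# Example call (hypothetical directory name):
# status = check_drift("./environments/prod")
```

A result of `changes_detected` covers both out-of-band drift and configuration changes that have not been applied yet, so it is worth running the check against code that matches what was last applied.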

Periodic Checks

Regular checks, a few times a day or at least once a day, are a good habit. It’s like routine maintenance; catch the small issues before they snowball into bigger ones.
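
As a rough sketch of such a schedule, the Python below spaces a given number of checks evenly across a day. The names `interval_seconds` and `run_checks` are hypothetical, and `check` is any callable that performs one drift check. In practice a cron job or a scheduled CI pipeline is usually a more robust way to trigger the checks than a long-running loop.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def interval_seconds(checks_per_day: int) -> float:
    """Pause between checks so they are spread evenly over 24 hours."""
    if checks_per_day < 1:
        raise ValueError("checks_per_day must be at least 1")
    return SECONDS_PER_DAY / checks_per_day


def run_checks(check, checks_per_day: int, rounds: int, sleep=time.sleep) -> list:
    """Call `check` `rounds` times, pausing between consecutive calls.

    `sleep` is injectable so the loop can be tested without waiting.
    """
    pause = interval_seconds(checks_per_day)
    results = []
    for round_number in range(rounds):
        results.append(check())
        if round_number < rounds - 1:
            sleep(pause)
    return results


print(interval_seconds(4))
# prints 21600.0
```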

Visibility and Notification

To guarantee the security and compliance of your infrastructure, you need a reliable visibility and alerting mechanism for Terraform drifts, so you can quickly spot any problem that might introduce risk.
A dashboard is an excellent tool for displaying all open drifts.
Such a dashboard should provide details on the type of drift, the resources impacted, and the drift severity.
The severity of a drift should be determined by its possible effects, such as whether it may result in a compliance or security breach.

Another good practice is to have a notification system in place that keeps you updated on any new drifts. It should let users receive notifications by email or other channels.
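
A minimal sketch of the data behind such a dashboard and notification might look like the following. Everything here is an assumption made for illustration: the `Drift` fields, the two-level severity rule, and the message format are hypothetical, and the resource addresses are made up.

```python
from dataclasses import dataclass


@dataclass
class Drift:
    resource: str                    # affected resource address
    drift_type: str                  # e.g. "modified" or "deleted"
    security_impact: bool = False
    compliance_impact: bool = False


def severity(drift: Drift) -> str:
    """Rate a drift by its possible effects."""
    if drift.security_impact or drift.compliance_impact:
        return "high"
    return "low"


def format_notification(drift: Drift) -> str:
    """One-line message suitable for email or chat."""
    return f"[{severity(drift).upper()}] {drift.drift_type} drift on {drift.resource}"


def dashboard_rows(drifts: list) -> list:
    """Open drifts ordered for display, high severity first."""
    return sorted(drifts, key=lambda d: severity(d) != "high")


open_drifts = [
    Drift("aws_instance.web", "modified"),
    Drift("aws_security_group.allow_ssh", "modified", security_impact=True),
]

for drift in dashboard_rows(open_drifts):
    print(format_notification(drift))
# prints:
# [HIGH] modified drift on aws_security_group.allow_ssh
# [LOW] modified drift on aws_instance.web
```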


Terraform Drift Remediation: Two Effective Approaches

1. Reconcile

What’s the goal? Return everything to how the original Terraform code intended it to be.

When to use? Best for when changes, made outside the Terraform code, need to be reversed.

Here’s a Simple Breakdown:

  • Your Terraform code sets the EC2 instance type to t2.micro.
  • This leads to the creation of the mentioned EC2 instance.
  • Someone then switches the instance type outside Terraform, for example to t2.nano.
  • Solution? Run ‘terraform apply’ again. It identifies the drift and changes the EC2 instance back to the original t2.micro type. Just like that, it’s back to the original state. (For an EC2 instance, the AWS provider applies an instance type change by stopping and restarting the instance; changes to some other attributes force a replacement instead, but you get the idea.)

2. Align the Code

What’s the goal? Update the Terraform code to mirror the real-time state of your infrastructure.

When to use? Ideal for when changes made outside Terraform are deemed necessary and should remain.

Here’s How it Works:

  • This is the initial EC2 setup by Terraform.
  • This leads to the creation of the mentioned EC2 instance.
  • Yet again, someone switches the instance type to t2.nano.
  • Solution? Instead of rolling back, you opt to adjust the Terraform code.
    This means changing the instance type to t2.nano in the code to match the actual setup in the AWS console.
  • Now running the plan shows ‘No changes’, since the code and the actual setup are aligned.
  • And there you have it. Both the Terraform state file and AWS console are in sync and up to date.
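
The difference between the two approaches can be summarized in a small Python model. This is a conceptual sketch only: `remediate` is a hypothetical function that tracks a single attribute (the instance type) as it appears in the code and in the cloud. It does not call Terraform or AWS.

```python
def remediate(code_value: str, live_value: str, strategy: str) -> tuple:
    """Return (code_value, live_value) after applying a remediation strategy."""
    if strategy == "reconcile":
        # Running apply pushes the value from the code back to the cloud.
        return code_value, code_value
    if strategy == "align":
        # The code is edited to match what is running in the cloud.
        return live_value, live_value
    raise ValueError(f"unknown strategy: {strategy}")


print(remediate("t2.micro", "t2.nano", "reconcile"))
# prints ('t2.micro', 't2.micro')
print(remediate("t2.micro", "t2.nano", "align"))
# prints ('t2.nano', 't2.nano')
```

Either way the two values end up equal. The choice is about which side holds the value you want to keep.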


Out-of-the-box Drift Detection & Remediation

To summarize, detecting Terraform drifts is vital for maintaining the security and efficiency of your cloud environments. It’s essential to take action when drift is identified, understand its causes, and be aware of the problems it might introduce. By using effective methods to detect and rectify drift, you can ensure your system stays in its desired state, and prevent incidents, while ensuring operational excellence.

ControlMonkey is a platform that enhances your Terraform operations and provides a drift detection and remediation mechanism out of the box.
ControlMonkey’s Drift Detection consistently compares the current state of your infrastructure with the desired state, promptly notifying you of any discrepancies encountered.
Not only does ControlMonkey excel at pinpointing disparities, but it also offers effective means of rectifying drift.
It presents a user-friendly dashboard for managing detected drifts, issues timely notifications for new deviations, and offers a convenient one-click option for addressing and remediating any drifts.
