Leveraging AWS CloudTrail to fight ClickOps

Aharon Twizer

CEO & Co-Founder

Introduction

Remember the last time your colleague modified a resource directly in the AWS console, which led to a production issue?
Incidents like these show why changing cloud resources directly through the AWS console, commonly called “ClickOps,” is not a best practice.
If you haven’t encountered such an issue yet, that’s great! But remember, ClickOps can be likened to riding a motorcycle—you either have had an accident or will have one in the future.

The Pitfalls of Direct AWS Console Access

Working directly from the AWS console can often bypass organizational policies and security controls, leading to significant risks.
Operations performed directly through the console may not undergo the same review and validation processes as those executed through Infrastructure-as-Code, potentially leading to misconfigurations and security breaches.
This approach also compromises your audit and compliance readiness. To maintain a well-managed Software Development Lifecycle (SDLC), it is crucial to avoid allowing engineers to change resources via ClickOps.
Console access with read-only permissions is still reasonable, but the console should generally not be used to create, update, or delete resources.

Promoting Best Practices: The Shift to GitOps

To mitigate these risks, organizations are increasingly adopting a GitOps methodology. GitOps uses Git repositories as the source of truth for defining and modifying infrastructure, ensuring that all changes are reviewed, version-controlled, and auditable.
By restricting AWS console permissions—ideally, to read-only access—organizations can ensure that all modifications undergo the proper review process, thereby maintaining security and compliance.

Reflect on how software deployment has evolved over the past 20 years. Would you permit software updates to be executed directly from an R&D team’s local machine today? Probably not. The same processes and automation you have established for software deployment should apply to your infrastructure delivery.

Don’t misunderstand; removing AWS credentials from multiple engineers across an organization is not trivial. However, this shift is critical, especially when managing cloud infrastructure at scale across multiple accounts and even multiple cloud providers.

Detecting Unauthorized Operations: Leveraging CloudTrail Logs

Now, let’s discuss how to track down ClickOps activity—identifying who’s using the AWS console and what changes they’re making directly through it. The critical tool for this task is AWS CloudTrail.

AWS CloudTrail is a powerful tool for governance, compliance, and operational and risk auditing of your AWS account. It records API calls made to AWS services, including those from the management console, SDKs, command line tools, and other AWS management interfaces. Management events are recorded by default; data events, such as S3 object-level operations, must be enabled explicitly.

Interpreting CloudTrail data is not straightforward; it requires sifting through and parsing JSON files to extract critical information such as the user or role that performed the operation, which resource was changed, and more.

Example of how a CloudTrail event JSON might look:
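The original illustration is not reproduced here. As a stand-in, the following is a trimmed, hypothetical event for a security group change made from the console. The account ID, role name, email address, IP address, and resource ID are all made up; only the field names follow the CloudTrail record format. The short Python snippet loads the JSON and reads the fields discussed in this post, nothing more.

```python
import json

# Hypothetical, trimmed CloudTrail record. All identifiers below are made up.
SAMPLE_EVENT = """
{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAEXAMPLEROLEID:jane.doe@example.com",
    "arn": "arn:aws:sts::111122223333:assumed-role/ExampleAdminRole/jane.doe@example.com",
    "accountId": "111122223333",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "arn": "arn:aws:iam::111122223333:role/ExampleAdminRole",
        "userName": "ExampleAdminRole"
      }
    }
  },
  "eventTime": "2024-05-14T09:31:07Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "AuthorizeSecurityGroupIngress",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  "requestParameters": {"groupId": "sg-0123456789abcdef0"},
  "readOnly": false,
  "eventType": "AwsApiCall",
  "managementEvent": true,
  "recipientAccountId": "111122223333",
  "sessionCredentialFromConsole": "true"
}
"""

event = json.loads(SAMPLE_EVENT)

summary = {
    # For an assumed role, the last ARN segment is the session name,
    # which for SSO roles is often the user's email.
    "who": event["userIdentity"]["arn"].rsplit("/", 1)[-1],
    "service": event["eventSource"],
    "action": event["eventName"],
    # Which request parameter names the resource differs per service.
    "resource": event["requestParameters"]["groupId"],
    # The flag is only present when the call came from a console session.
    "from_console": event.get("sessionCredentialFromConsole") == "true",
}
print(summary)
```

Note that the role ARN alone would only say "ExampleAdminRole"; the session name is what ties the change to a person.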

Here’s how you can leverage CloudTrail logs to safeguard your environment:

  1. Parse and Analyze Events: Start by analyzing CloudTrail events to extract insights. Each AWS service may report slightly different fields to CloudTrail, so normalizing this data for later querying is essential.
    You’ll need to extract:
    • User Identity: Identify the IAM user/role involved in the operations. This might include handling edge cases like extracting session names with the user’s email for SSO roles rather than the role itself.
      Usually, the userIdentity field is enough for this. There are edge cases, however, where the field is empty or incomplete, and specific CloudTrail events will then need their own handling.
    • Action Taken: Determine what action was performed.
    • Resource Amended: Identify the resource(s) involved. In cases where multiple resources are affected by a single action, establish a logic to determine the primary resource.
    • Operation Source: Determine where the action originated—was it from the AWS console, an infra-as-code tool, or an SDK call? This can be identified using fields like sessionCredentialFromConsole and analyzing the userAgent.
  2. Group Events by Resource: Group the events by resources to identify which are frequently modified directly from the console. These are prime candidates for management via Terraform as part of your Infrastructure as Code (IaC) strategy.
  3. Group Events by User: This helps identify which teams or individuals need more training on GitOps and Terraform practices.
  4. Set Up a Dashboard: Create a dashboard to filter and query events by user, resource, action, and time range. This tool will aid in tracking and investigating production issues.
  5. Set Up Alerts: With the data organized, set up alerts for specific resources or operation thresholds. For instance, alert if your Production RDS is manually altered from the console or if there are over 20 ClickOps operations within 24 hours. Connect these alerts to your alerting system or a Slack/Teams channel. Make sure there is a straightforward procedure for handling these alerts so that they actually reduce ClickOps activity in your account.
  6. Generate Monthly Reports: Use the collected data to generate monthly reports to share with your team or management about the success of your GitOps strategy and to identify areas needing additional training on infra-as-code and GitOps practices. These reports can also support the case for removing console permissions, especially when an account is fully managed by infra-as-code.
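The first steps above (normalize, group, alert) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete implementation: the function names, the user-agent hints in `IAC_AGENT_HINTS`, and the rule for picking a primary resource are all hypothetical, and the sample events are made up. Real events will need per-service handling.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed user-agent substrings that suggest an IaC tool; adjust to your tooling.
IAC_AGENT_HINTS = ("terraform", "cloudformation")


def operation_source(event):
    """Classify the origin of a call: console, IaC tool, or other SDK/CLI."""
    if str(event.get("sessionCredentialFromConsole", "")).lower() == "true":
        return "console"
    agent = str(event.get("userAgent", "")).lower()
    if any(hint in agent for hint in IAC_AGENT_HINTS):
        return "iac"
    return "sdk_or_cli"


def user_of(event):
    """Prefer the session name (often the SSO user's email) for assumed roles."""
    identity = event.get("userIdentity") or {}
    arn = identity.get("arn", "")
    if identity.get("type") == "AssumedRole" and "/" in arn:
        return arn.rsplit("/", 1)[-1]
    return identity.get("userName") or arn or "unknown"


def primary_resource(event):
    """Naive rule: first ARN in `resources`, else the first id-like request parameter."""
    resources = event.get("resources") or []
    if resources and resources[0].get("ARN"):
        return resources[0]["ARN"]
    params = event.get("requestParameters") or {}
    for key, value in params.items():
        if isinstance(value, str) and key.lower().endswith(("id", "name")):
            return value
    return "unknown"


def normalize(event):
    """Flatten one CloudTrail record into the fields used for querying."""
    when = datetime.strptime(event["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "time": when.replace(tzinfo=timezone.utc),
        "user": user_of(event),
        "action": event.get("eventName", "unknown"),
        "resource": primary_resource(event),
        "source": operation_source(event),
        "read_only": str(event.get("readOnly", "false")).lower() == "true",
    }


def clickops(events):
    """Keep only write operations that were made from the console."""
    rows = [normalize(e) for e in events]
    return [r for r in rows if r["source"] == "console" and not r["read_only"]]


def threshold_alert(rows, now, limit=20, window=timedelta(hours=24)):
    """True if more than `limit` ClickOps operations fall inside the window."""
    recent = [r for r in rows if timedelta(0) <= now - r["time"] <= window]
    return len(recent) > limit


# Made-up sample events for illustration only.
base = {
    "eventTime": "2024-05-14T09:31:07Z",
    "eventName": "AuthorizeSecurityGroupIngress",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::111122223333:assumed-role/ExampleAdminRole/jane.doe@example.com",
    },
    "requestParameters": {"groupId": "sg-0123456789abcdef0"},
    "readOnly": False,
}
console_event = {**base, "sessionCredentialFromConsole": "true"}
tf_event = {**base, "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.7.5"}
EVENTS = [
    console_event,
    {**console_event, "eventTime": "2024-05-14T10:02:44Z"},
    {**console_event, "eventName": "DescribeSecurityGroups", "readOnly": True},
    tf_event,
]

NOW = datetime(2024, 5, 14, 12, 0, tzinfo=timezone.utc)
rows = clickops(EVENTS)
by_resource = Counter(r["resource"] for r in rows)  # step 2
by_user = Counter(r["user"] for r in rows)          # step 3
print(by_resource.most_common(), by_user.most_common())
print("alert:", threshold_alert(rows, NOW))         # step 5, default limit of 20
```

The same normalized rows can feed a dashboard or a monthly report; the grouping by resource and by user is just a count over a different key.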

Reaching the Oasis: Removing Direct AWS Console Access

After implementing the previous steps, you now have:

  • An easy-to-use dashboard to visualize ClickOps activity.
  • Alerts for any abnormal ClickOps activities.
  • Monthly reports showcasing general trends of your GitOps versus ClickOps migration.

This setup marks the appropriate time to begin removing console access permissions. This process isn’t a one-day affair; it needs to be conducted carefully to ensure that it doesn’t disrupt your teams’ day-to-day operations.

Once your dashboard and reports indicate that specific areas of your cloud environment are no longer being managed through the AWS console, that is your cue to remove “write” permissions from console users. However, it’s prudent to maintain a “break-glass” user, reserved strictly for emergency use, so you can still manage critical situations effectively.
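One way to turn "no longer being managed through the console" into a concrete check is to list the services with no console write events over a quiet period. The sketch below assumes you already have console write events as (service, timestamp) pairs; the function name, the 30-day quiet period, and the sample data are illustrative assumptions, not a recommendation from the original post.

```python
from datetime import datetime, timedelta, timezone


def ready_for_lockdown(console_writes, services, now, quiet_days=30):
    """Return the services with no console write events in the last `quiet_days` days.

    `console_writes` is an iterable of (eventSource, datetime) pairs for write
    operations made from the console; how you produce it depends on your pipeline.
    """
    cutoff = now - timedelta(days=quiet_days)
    noisy = {source for source, when in console_writes if when >= cutoff}
    return sorted(set(services) - noisy)


# Made-up data for illustration only.
NOW = datetime(2024, 6, 1, tzinfo=timezone.utc)
CONSOLE_WRITES = [
    ("ec2.amazonaws.com", datetime(2024, 5, 28, 14, 5, tzinfo=timezone.utc)),
    ("rds.amazonaws.com", datetime(2024, 3, 2, 8, 30, tzinfo=timezone.utc)),
]
SERVICES = ["ec2.amazonaws.com", "rds.amazonaws.com", "s3.amazonaws.com"]

candidates = ready_for_lockdown(CONSOLE_WRITES, SERVICES, NOW)
print(candidates)
```

Here EC2 still saw a console change a few days before `NOW`, so only RDS and S3 come back as candidates. A list like this is a starting point for a conversation with the owning team, not an automatic trigger.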

Conclusion

Moving away from direct AWS console operations and adopting a GitOps methodology enhances your security posture and aligns with best practices for cloud governance. By leveraging tools like AWS CloudTrail and implementing strict access controls, organizations can detect unauthorized operations and ensure their cloud environments are secure, compliant, and optimized for operational excellence.

About ControlMonkey

ControlMonkey is the most comprehensive Terraform Automation Platform, providing users with a 360 solution to manage the cloud at scale with Terraform.
You get a single control plane with a complete cloud inventory and alerts on ClickOps activity. It also offers Terraform code generation for your existing cloud environments, as well as drift detection and remediation.

With ControlMonkey, you can standardize your infrastructure delivery at scale with out-of-the-box GitOps Terraform CI/CD, incorporating cost, security, and compliance policies, plus a self-service catalog of pre-defined, compliant infrastructure blueprints for other teams in the organization to spin up infrastructure, enabling agility without sacrificing control.

With ControlMonkey, you can be confident that everything running in your cloud is correctly configured and is supposed to be there.
Book a 1:1 consultation session with our Terraform Experts to learn more about our Terraform Automation platform.
