How to Build Resilience for a Cloud Disaster

Can You Actually Recover from a Cloud Disaster? Block & AWS Show How

How do enterprises actually bounce back from a disaster in the cloud, beyond just restoring data? In May 2025, leaders from AWS, Block (Cash App), and ControlMonkey met to share lessons on building cloud resilience that goes beyond disaster recovery.
What sparked the conversation? A series of infrastructure failures showed how vulnerable even experienced cloud teams can be when they depend only on backups or use fragmented IaC. The panel unpacked what happened and, more importantly, what they did next.

What They Learned

1. Cloud DR Isn’t Just About Data

“Most teams don’t realize their DR plan is blind to configuration. They only think about restoring databases.”
—Aharon Twizer, CEO of ControlMonkey

Most disaster recovery plans focus on restoring data. Cloud resilience also requires recovering the configuration around that data: IAM roles, route tables, networking, service mesh, and more.

When Block’s platform team analyzed their coverage, they found thousands of resources with no recovery path. Some had been manually provisioned years ago—outside Git, outside Terraform, and completely invisible.
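One way to surface resources with no recovery path is to compare a cloud inventory against what Terraform state already tracks. The Python sketch below is illustrative only and is not the tooling Block or ControlMonkey used. It assumes the state was exported with `terraform show -json` and that a separate inventory of resource IDs exists; the `inventory` list, its fields, and the sample IDs are hypothetical. Matching on the `id` attribute alone is a simplification.

```python
from __future__ import annotations


def managed_ids(state: dict) -> set[str]:
    """Collect the `id` attribute of every managed resource in a
    `terraform show -json` document, including nested child modules."""
    ids: set[str] = set()
    stack = [state.get("values", {}).get("root_module", {})]
    while stack:
        module = stack.pop()
        for resource in module.get("resources", []):
            # Data sources are read-only lookups, not managed resources.
            if resource.get("mode") != "managed":
                continue
            resource_id = resource.get("values", {}).get("id")
            if resource_id:
                ids.add(resource_id)
        stack.extend(module.get("child_modules", []))
    return ids


def unmanaged(inventory: list[dict], state: dict) -> list[dict]:
    """Return inventory entries whose ID does not appear in Terraform state."""
    known = managed_ids(state)
    return [item for item in inventory if item["id"] not in known]


# Hypothetical sample data: two resources tracked by Terraform, three in the account.
state = {
    "values": {
        "root_module": {
            "resources": [
                {"address": "aws_vpc.main", "mode": "managed",
                 "type": "aws_vpc", "values": {"id": "vpc-0a1"}},
            ],
            "child_modules": [
                {"address": "module.net", "resources": [
                    {"address": "module.net.aws_route_table.rt", "mode": "managed",
                     "type": "aws_route_table", "values": {"id": "rtb-0b2"}},
                ]},
            ],
        }
    }
}
inventory = [
    {"id": "vpc-0a1", "type": "vpc"},
    {"id": "rtb-0b2", "type": "route_table"},
    {"id": "sg-0c3", "type": "security_group"},
]

gaps = unmanaged(inventory, state)
print([g["id"] for g in gaps])  # ['sg-0c3']
```

Anything in `gaps` exists in the account but has no definition in code, so a data restore alone would not bring it back.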

2. Visibility Is the Foundation of Recovery

“The most dangerous thing in cloud is the unknown unknown.”
—Ben, Platform Engineering Tech Lead, Block

Block began their journey with a single question: “What in our cloud is actually managed by code?”

Using ControlMonkey, they were able to:

Run a 100% automated assessment of their AWS accounts
Uncover shadow resources with no IaC coverage
Build a repeatable recovery plan using daily Terraform-based snapshots

Within weeks, Block had a full DR backup of production and staging. One engineer stood up the solution.
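To illustrate why daily configuration snapshots help recovery, here is a minimal Python sketch that diffs two snapshots and reports what was added, removed, or changed. The snapshot format (a dictionary keyed by resource address) and the sample values are assumptions made for this example; they do not describe how ControlMonkey stores its snapshots.

```python
def diff_snapshots(previous: dict, current: dict) -> dict:
    """Compare two configuration snapshots keyed by resource address."""
    prev_keys, curr_keys = set(previous), set(current)
    return {
        "added": sorted(curr_keys - prev_keys),
        "removed": sorted(prev_keys - curr_keys),
        # Present in both snapshots, but the recorded configuration differs.
        "changed": sorted(
            key for key in prev_keys & curr_keys if previous[key] != current[key]
        ),
    }


# Hypothetical snapshots taken one day apart.
yesterday = {
    "aws_iam_role.app": {"max_session_duration": 3600},
    "aws_route_table.main": {"routes": ["0.0.0.0/0"]},
    "aws_security_group.db": {"ingress_ports": [5432]},
}
today = {
    "aws_iam_role.app": {"max_session_duration": 3600},
    "aws_security_group.db": {"ingress_ports": [5432, 22]},
    "aws_s3_bucket.logs": {"versioning": True},
}

report = diff_snapshots(yesterday, today)
print(report["added"])    # ['aws_s3_bucket.logs']
print(report["removed"])  # ['aws_route_table.main']
print(report["changed"])  # ['aws_security_group.db']
```

A removed route table or a changed security group is exactly the kind of configuration loss a database backup would never reveal. The previous snapshot also gives a known-good definition to rebuild from.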

3. Resilience Is a Continuous Lifecycle

“You don’t implement resilience. You grow it.”
—Dustin Ellis, AWS Solutions Architect

AWS highlighted its Resilience Lifecycle Framework, which encourages teams to:

Set clear RTO/RPO objectives
Design architectures with failure in mind
Run chaos game days and tabletop simulations
Continuously test, monitor, and improve
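The first and third steps above can be tied together by scoring each game day against the stated objectives. The sketch below is a minimal illustration, not part of any AWS framework or tool; the class names, field names, and timestamps are hypothetical. It treats RTO (Recovery Time Objective) as the maximum tolerated time to restore service and RPO (Recovery Point Objective) as the maximum tolerated window of lost data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Objectives:
    rto: timedelta  # maximum tolerated time to restore service
    rpo: timedelta  # maximum tolerated window of lost data


@dataclass
class DrillResult:
    outage_start: datetime
    service_restored: datetime
    last_good_backup: datetime


def evaluate(drill: DrillResult, objectives: Objectives) -> dict:
    """Compare what a drill measured against the agreed objectives."""
    recovery_time = drill.service_restored - drill.outage_start
    data_loss_window = drill.outage_start - drill.last_good_backup
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Hypothetical game day: 45 minutes to recover, last backup 20 minutes before the outage.
objectives = Objectives(rto=timedelta(hours=1), rpo=timedelta(minutes=15))
drill = DrillResult(
    outage_start=datetime(2025, 5, 1, 10, 0),
    service_restored=datetime(2025, 5, 1, 10, 45),
    last_good_backup=datetime(2025, 5, 1, 9, 40),
)

report = evaluate(drill, objectives)
print(report["rto_met"])  # True
print(report["rpo_met"])  # False
```

Here the drill meets the RTO but misses the RPO, which points to backup frequency rather than recovery speed as the next thing to improve.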

TL;DR – 5-Second Takeaways

Backups ≠ Resilience: If you can’t rebuild infra config, you’re still at risk
IaC is the recovery layer: Terraform is more than automation—it’s resilience infrastructure
ControlMonkey closes the gap: Drift detection, IaC mapping, daily config snapshots
Block got results fast: Full DR coverage in weeks, by one engineer
Don’t wait for a SEV: Build resilience before something breaks

🎥 Watch the Full Webinar On-Demand

Learn how AWS, Block, and ControlMonkey improved their disaster recovery strategy after a failure. They found that IaC visibility was the key to their success.

Author

Ori Yemini

CTO & Co-Founder

Ori Yemini is the CTO and Co-Founder of ControlMonkey. Before founding ControlMonkey, he spent five years at Spot (acquired by NetApp for $400M). Ori holds degrees from Tel Aviv and Hebrew University.

