Sitemap

Lessons in Cloud Resilience: Unpacking June 12, 2025, GCP Outage

5 min readJun 18, 2025
Press enter or click to view image in full size

On June 12, 2025, Google Cloud Platform (GCP) experienced a widespread service disruption that echoed far beyond its own infrastructure. What initially appeared to be a GCP-only incident ultimately exposed the fragility of our interdependent digital ecosystem. This wasn’t just a cloud outage, it was a wake-up call to re-examine what true resilience means in the era of complex, distributed systems.

In this article, we explore:

  • What happened and when?
  • Why services beyond Google were impacted.
  • Root causes and recovery details from Google’s RCA
  • Insights from cloud and security professionals
  • Real-world examples for mitigation and practical recommendations
  • A comprehensive resilience checklist to apply now.

🛠️ What Happened?

The outage originated within GCP’s Service Control, an internal service that manages API quota enforcement and access control for all Google Cloud products. A misconfigured quota policy — deployed without proper validation — contained a null field. This triggered a null pointer exception that caused the entire Service Control service to crash globally.

--

--

Amrik Hanjra
Amrik Hanjra

Written by Amrik Hanjra

Ex-BlackRock Technologist. I secure digital systems in Fintech, turning complex challenges into breakthroughs. Student of Life & Tech.

Responses (1)