
Amazon Web Services (AWS), the world’s largest cloud computing provider, has restored normal operations after a major global outage disrupted thousands of websites, apps, and business operations worldwide. The outage, which began on Monday, impacted popular platforms such as Snapchat, Reddit, Venmo, Zoom, Coinbase, Robinhood, Fortnite, and Uber, highlighting the critical dependence of modern digital infrastructure on cloud services.
AWS Outage Hits Businesses Worldwide
The disruption originated in AWS’s US-EAST-1 region, its oldest and largest site, which was also involved in outages in 2020 and 2021. The region hosts countless cloud applications and serves millions of users globally.
Amazon reported that while the bulk of services returned to normal by Monday afternoon, some services, including AWS Config, Redshift, and Connect, were left with a backlog of messages that took several more hours to process.
“This incident underscores how dependent modern businesses are on a handful of cloud providers,” said Jake Moore, global cybersecurity advisor at ESET.
Workers and customers from London to Tokyo faced disruptions in day-to-day activities, from paying bills and ordering services to accessing digital platforms for work and leisure.
Root Cause: Network Health Monitoring Subsystem
AWS identified the root cause as a failure in a network health monitoring subsystem responsible for balancing traffic across servers in its Elastic Compute Cloud (EC2) internal network. The issue impacted the Domain Name System (DNS) for AWS’s DynamoDB API, preventing applications from locating critical cloud-based data.
Ken Birman, a Cornell University computer science professor, emphasized the need for companies to adopt fault-tolerant systems and backups across multiple cloud providers to mitigate risks.
“When companies cut corners to save costs and fail to implement redundancies, outages like this hit them hardest,” Birman explained.
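As a rough illustration of the kind of redundancy Birman describes, the sketch below tries a list of endpoints in priority order and falls back to the next one when a call fails. It is a minimal sketch under stated assumptions, not a description of how AWS or any affected company handles failover: the function name `fetch_with_failover`, the exception `AllEndpointsFailed`, and the endpoint labels are hypothetical, and the `fetch` callable stands in for whatever client a real application would use.

```python
from typing import Callable, List, Sequence, Tuple


class AllEndpointsFailed(Exception):
    """Raised when every endpoint in the list has failed (hypothetical name)."""

    def __init__(self, failures: List[Tuple[str, Exception]]):
        super().__init__(f"all {len(failures)} endpoint(s) failed")
        self.failures = failures


def fetch_with_failover(
    endpoints: Sequence[str],
    fetch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each endpoint in order and return (endpoint_used, result).

    `endpoints` is an ordered list of illustrative labels, for example a
    primary region followed by a backup region or a second provider.
    `fetch` is any callable that raises an exception on failure.
    """
    failures: List[Tuple[str, Exception]] = []
    for endpoint in endpoints:
        try:
            return endpoint, fetch(endpoint)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            failures.append((endpoint, exc))
    raise AllEndpointsFailed(failures)


def simulated_fetch(endpoint: str) -> str:
    """Stand-in for a real client call: the primary is treated as down."""
    if endpoint == "primary-region":
        raise ConnectionError("could not resolve endpoint")
    return f"data from {endpoint}"


used, data = fetch_with_failover(["primary-region", "backup-region"], simulated_fetch)
print(used, "->", data)
```

A real deployment would also need the data itself to be replicated to the backup location, which is usually the harder and costlier part of the redundancy Birman refers to.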
Widespread Impact on Digital Platforms
According to Downdetector, over 4 million users reported disruptions in the UK alone. Financial institutions like Lloyds Bank and Bank of Scotland, telecom providers including Vodafone and BT, and government services such as HMRC were affected.
Globally, the outage impacted a wide range of sectors, including:
- Social media and communication apps: Snapchat, Reddit, X, Signal, Zoom
- Gaming platforms: Fortnite, Roblox, Clash Royale, Clash of Clans
- Financial services: Venmo, Coinbase, Robinhood
- Transportation apps: Uber, Lyft
Even Amazon’s own platforms, including its shopping website, Prime Video, and Alexa, experienced temporary disruptions.
Fragility of Global Cloud Infrastructure
Experts say the incident demonstrates the vulnerability of the interconnected digital economy, where a single cloud provider outage can cascade across multiple sectors.
“The main reason for this issue is the heavy reliance of major companies on a single cloud service,” said Nishanth Sastry, director of research at the University of Surrey.
The outage serves as a reminder of the importance of redundancy, disaster recovery, and multi-cloud strategies for businesses that depend on continuous uptime.
AWS Continues to Lead the Cloud Market
Despite the disruption, Amazon remains the world leader in cloud computing, followed by Microsoft Azure and Google Cloud. Analysts note that while outages are rare, US-EAST-1 continues to be a focal point of service interruptions due to its status as the default region for many AWS services.
“For major businesses, hours of downtime translate into millions in lost productivity and revenue,” said Ryan Griffin, U.S. cyber practice leader at McGill and Partners.
Looking Ahead
Amazon’s resolution of the outage demonstrates both the resilience and limitations of centralized cloud infrastructure. Companies across the globe are likely to re-evaluate cloud dependency, backup protocols, and multi-region strategies to prevent similar disruptions in the future.
AWS has stated that all systems have returned to normal, but organizations that rely on services still working through message backlogs are advised to monitor their operations for the next several hours.