A problem with Amazon’s cloud computing service disrupted internet use around the world Monday, taking down a broad range of online services, including social media, gaming, food delivery, streaming and financial platforms.
The disruption, and the exasperation it caused, served as the latest reminder that 21st-century society is increasingly dependent on just a handful of companies for much of its internet technology, which seems to work reliably until it suddenly breaks down.
About three hours after the outage began, Amazon Web Services said it was starting to recover, although problems lingered for some users. AWS provides behind-the-scenes cloud computing infrastructure to some of the world’s biggest organizations. Its customers include government departments, universities and businesses, including The Associated Press.
Cybersecurity expert Mike Chapple said “a slow and bumpy recovery process” is “entirely normal.”
As engineers roll out fixes across the cloud computing infrastructure, the process could trigger smaller disruptions, he said.
“It’s similar to what happens after a large-scale power outage: While a city’s power is coming back online, neighborhoods may see intermittent glitches as crews finish the repairs,” said Chapple, an information technology professor at the University of Notre Dame’s Mendoza College of Business.
Amazon blames domain name system
Amazon pinned the outage on issues related to its domain name system, which converts web addresses into IP addresses, the numeric designations that identify locations on the internet. Those addresses allow websites and apps to load on internet-connected devices.
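For readers curious what that conversion looks like in practice, here is a minimal Python sketch of a lookup using the standard library's `socket.getaddrinfo`. It is a general illustration of how any device asks its resolver for a name's addresses, not a depiction of Amazon's internal systems; the helper name `resolve` and the hostnames are assumptions chosen for the example.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses the local resolver gives for hostname."""
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Lookup failed: the name does not exist or the resolver could not answer.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({sockaddr[0] for *_, sockaddr in results})


if __name__ == "__main__":
    # Hypothetical hostnames, used only for illustration.
    for name in ("localhost", "example.com"):
        print(name, "->", resolve(name) or "lookup failed")
```

When a lookup like this returns nothing, an app has no address to connect to, even if the servers behind the name are still running.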
DownDetector, a website that tracks online outages, said in a Facebook post that it received over 11 million user reports of problems at more than 2,500 companies. Users reported trouble with the social media site Snapchat, the Roblox and Fortnite video games, the online broker Robinhood and the McDonald’s app, as well as Netflix, Disney+ and many other services.
The cryptocurrency exchange Coinbase and the Signal chat app both said on X that they were experiencing trouble related to the outage.
Amazon’s own services were also affected. Users of the company’s Ring doorbell cameras and Alexa-powered smart speakers reported that they were not working, while others said they were unable to access the Amazon website or download books to their Kindle.
Many college and K-12 students were unable to submit or access their homework or course materials Monday because the AWS outage knocked out Canvas, a widely used educational platform.
“I currently can’t grade any online assignments, and my students can’t access their online materials” because of the outage’s effect on learning-management systems, said Damien P. Williams, a professor of philosophy and data science at the University of North Carolina at Charlotte.
The exact number of schools impacted was not immediately known, but Canvas says on its website it is used by 50% of college and university students in North America, including all Ivy League schools in the U.S.
Ohio State University informed its 70,000 students at all six campuses by email Monday morning that online course materials might be inaccessible due to the outage and that “students should connect with their instructors for any alternative plans.” By late afternoon, the system was still down, said spokesman Benjamin Johnson.
While AWS continued to report issues well into Monday afternoon, financial services provider Wealthsimple and Coinbase said their services had largely recovered.
“Clients may experience minor technical issues today as a result of the AWS outage,” Wealthsimple spokesperson Juanita Lee said in an email.
The Toronto Blue Jays said Monday afternoon that Ticketmaster was experiencing “ticket management issues” because of the AWS outage. The baseball team later said the issues were being resolved, but told fans it would have extra staff at the Rogers Centre gates that evening to help guests who were having trouble with their tickets get into the venue.
“It’s totally chaotic when people and businesses wake up to a day that they thought would be normal and they’re greeted with not being able to access systems or not being able to respond to customers,” said Tola Jimoh, founder of Cyber Strategy Consulting, a Calgary-based business.
This is not the first time issues with Amazon cloud services have caused widespread disruptions.
Many popular internet services were affected by a brief outage in 2023. AWS’s longest outage in recent history occurred in late 2021, when a wide range of companies — from airlines and auto dealerships to payment apps and video streaming services — were affected for more than five hours. Outages also happened in 2020 and 2017.
The first signs of trouble emerged at around 3:11 a.m. Eastern time, when AWS reported on its “health dashboard” that it was “investigating increased error rates and latencies for multiple AWS services in the US-EAST-1 Region.” Later, the company reported that there were “significant error rates” and that engineers were “actively working” on the problem.
Around 6 a.m. Eastern time, the company reported seeing recovery across most of the affected services and said it was seeking a “full resolution.” As of midday, AWS was still working to resolve the trouble.
Sixty-four internal AWS services were affected, the company said.
Just a few companies provide most internet infrastructure
Because much of the world now relies on three or four companies to provide the underlying infrastructure of the internet, “when there’s an issue like this, it can be really impactful” across many online services, said Patrick Burgess, a cybersecurity expert at U.K.-based BCS, The Chartered Institute for IT.
“The world now runs on the cloud,” Burgess said.
And because so much of the online world’s plumbing is underpinned by so few companies, when something goes wrong, “it’s very difficult for users to pinpoint what is happening because we don’t see Amazon, we just see Snapchat or Roblox,” Burgess said.
“The good news is that this kind of issue is usually relatively fast” to resolve, and there’s no indication that it was caused by a cyberattack, Burgess said.
“This looks like a good old-fashioned technology issue. Something’s gone wrong, and it will be fixed by Amazon,” he said.
There are “well-established processes” to deal with outages at AWS, as well as rivals Google and Microsoft, Burgess said, adding that such outages are usually over in “hours rather than days.”
Experts say our reliance on digital services and cloud computing means incidents like Monday’s are bound to become more common.
“It’s not the first time and it won’t be the last time,” said Paul Vallée, a senior fellow at the Centre for International Governance Innovation and founder of Ottawa-based cybersecurity business Tehama.
He and others in the world of cybersecurity feel that way because cloud computing systems and other infrastructure providers are connected to so many prominent services that people use, and those systems are “inherently vulnerable” to bugs.
“I’m expecting this not to be, by any stretch, the last such problem,” Vallée said.
—With files from the Canadian Press
