Amazon Web Services said the outage was due to an “underlying DNS issue”, referring to the Domain Name System – effectively the global online directory that automatically translates domain names into IP addresses.
By 6.35am Eastern time (11.35pm NZT), the company said the issue had been “fully mitigated”, with most operations resuming. But AWS said later that some of its services were “still experiencing elevated errors”.
The outages at AWS, a leading provider of cloud infrastructure, illustrate the large number of websites and companies that are dependent on the tech giant for computing and storage resources.
They also signal the fragility of the interconnected internet, where an error at one company can trigger a cascading impact across the web. (Amazon founder Jeff Bezos owns The Washington Post.)
According to Downdetector, users reported issues on Monday night with communication apps Snapchat, WhatsApp, Signal and Zoom; gaming services such as Roblox, Fortnite and Xbox; as well as other sites, including Google and YouTube. It was not immediately clear whether or how the outages were connected.
The language app Duolingo, creative tool Canva and exercise app Strava were among those reporting errors on their websites.
In an earlier update, the photo-sharing platform Flickr said it was “temporarily unavailable due to a major issue affecting Amazon Web Services”, while Peloton, the touch-screen stationary bike company, also said it had been impacted.
In response to a request for comment, Amazon Web Services referred to its health dashboard, which by 8am Eastern time showed that engineers had downgraded the severity of the impact from “degraded” to “impacted”.
Amazon had earlier warned that while its engineers had fully resolved the underlying issue, some users would still encounter error messages as a result of the enormous backlog of requests.
“What we’re seeing this morning is a major outage at AWS, centred on the US-EAST-1 region in Virginia, which has spread out across the globe, impacting a lot of different websites, apps and services that we rely on,” said Oli Buckley, a professor of cybersecurity at England’s Loughborough University.
He said an issue involving DNS “hits thousands of systems that rely on it, and they can’t find the right server. Ultimately this means that they slow down as they try to locate it, and eventually just stop trying”.
“This outage is extremely significant because AWS is a backbone for much of the internet’s infrastructure,” Buckley added. “It’s a really stark reminder of how many well-known services rely on a handful of providers and services themselves.”
With reporting by Herald staff.