Outages
"An engineer was using a role with broader permissions than expected—a user access control issue, not an AI autonomy issue."
AWS says it has “implemented numerous safeguards” to avoid outages or disruptions – after an incident that sources told the FT was due to AI tool use.
Amazon’s comment on Friday came after the FT, citing four Amazon employees, claimed that a December incident was caused by Amazon’s AI coding bot Kiro pushing code to production.
AWS denied that characterisation. A spokesperson told The Stack:
Kiro puts developers in control—users need to configure which actions Kiro can take, and by default, Kiro requests authorization before taking any action. In this case, an engineer was using a role with broader permissions than expected—a user access control issue, not an AI autonomy issue.
The hyperscaler played down the scale of the December event – saying it was “extremely limited”, did not trigger “any customer inquiries regarding the interruption” and was limited to a single service: AWS Cost Explorer.
But in a 252-word response to the FT, it added that it has moved to ensure “mandatory peer review for production access” – without specifying who counts as a “peer” or whether it meant production access for an agent.
A senior Amazon employee speaking to the FT said there had been two outages in the last few months that were related to AI bot errors.
They told the newspaper: “The engineers let the AI [agent] resolve an issue without intervention. The outages were small but entirely foreseeable.”
Amazon said: “The Financial Times' claim that a second event impacted AWS is entirely false.”
The incident comes after Alphabet’s CFO Anat Ashkenazi told investors earlier this month that “about 50% of our codes [sic] are written by agents” and “we're deploying agents [to] pay and reconcile invoices.”
Amazon added: “For more than two decades, Amazon has achieved high operational excellence with our Correction of Error (COE) process.”
It added: “We review these together so that we can learn from any incident (just like this one), irrespective of customer impact, to address issues before their potential impact grows larger.”
However, only about one incident per year makes it to a full public post-mortem, the company’s incident response page shows.