amasol


Amazon service taken down by an AI coding bot, explained

The biggest news in IT hasn’t made much of a wave in the general media because it’s quite technical and, if we’re honest, boring to people who aren’t passionate about the subject. We think it’s a big deal, and we want to use simple language to explain what it is and why everyone should care.

According to the Financial Times, “Amazon Web Services (AWS) experienced a 13-hour interruption to one system used by its customers in mid-December after engineers allowed its Kiro AI coding tool to make certain changes. The people said the agentic tool, which can take autonomous actions on behalf of users, determined that the best course of action was to ‘delete and recreate the environment’.”

Of course, Amazon denied the report and released their own statement, which heavily emphasized, “This was user error, not AI error.”

To be fair, access control failures are a known and documented risk in software engineering. This exact scenario has played out with human developers for decades. Amazon isn’t wrong to point that out.

At the same time, Amazon has commercial incentives tied to AI products. That doesn’t mean their statement is fully true or fully false. We believe the subject at hand requires more nuance. When a system is powerful enough to autonomously delete and recreate an environment, the distinction between user error and AI error becomes less meaningful, because a more important question should be asked:

How much autonomy should agentic AI have in production environments without human intervention?

Whether the engineer granted too much access or the AI exercised the access it was given, 13 hours of downtime is 13 hours of downtime. Blame is less important than governance. According to the same article, a senior AWS employee said, “We’ve seen at least two production outages. The engineers let the AI agent resolve an issue without intervention.”

This specific quote highlights the skepticism some people have, simply because the technology is too new and there isn’t enough data to fully understand it. amasol stands firmly in the belief that this technology is the future, with the caveat that until we have enough data to prove it can be trusted with full access, we believe in humans as the final touchpoint before any product made or fixed by agentic AI goes live.

So what exactly is Agentic AI?

Traditional AI tools are reactive. You ask a question, they give an answer. Think ChatGPT. The AI is simply a very capable assistant that waits to be told what to do.

Agentic AI is a different animal entirely. Instead of responding to a single input, an agentic system is given a goal and then autonomously determines the sequence of actions needed to reach it. It reasons through the problems, selects tools it thinks will achieve the goal, executes steps, evaluates results, and decides what to do next.

Under the hood, this typically involves a Large Language Model (LLM), a type of artificial intelligence trained on vast amounts of text to understand and generate human-like language, acting as the brain. That brain is then connected to a set of tools: Application Programming Interfaces (APIs), which are sets of rules that let different software applications talk to each other, as well as code executors, file systems, browsers, and databases. The model keeps reasoning, acting, and evaluating until the goal is achieved or until something goes wrong.

A traditional AI interaction lasts one turn. An agentic interaction can last hundreds of steps, touching dozens of systems and making dozens of decisions. Without oversight, it can do all of this without a human ever knowing.
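The decide-act-evaluate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the rule-based `decide_next_action` function stands in for the LLM "brain" (a real agent would call an LLM API at that step), and the tools, state fields, and step cap are all hypothetical names we chose for the example.

```python
# Illustrative agentic loop: a goal, a set of tools, and a "brain" that
# repeatedly picks the next action until the goal is met (all names hypothetical).

# Tools are functions that take the current state and return a new state.
TOOLS = {
    "read_logs": lambda state: {**state, "cause_known": True},
    "apply_fix": lambda state: {**state, "healthy": state["cause_known"]},
}

def decide_next_action(state):
    """Stand-in for the LLM: choose a tool based on the current state.
    Returns None when the goal (a healthy system) is reached."""
    if not state["cause_known"]:
        return "read_logs"
    if not state["healthy"]:
        return "apply_fix"
    return None

def run_agent(state, max_steps=10):
    """Decide -> act -> evaluate, until the goal is met or a step cap is hit.
    The cap is itself a simple guardrail against runaway loops."""
    trace = []
    for _ in range(max_steps):
        action = decide_next_action(state)
        if action is None:
            break  # goal achieved
        state = TOOLS[action](state)
        trace.append(action)
    return state, trace

final, trace = run_agent({"cause_known": False, "healthy": False})
print(trace)  # the sequence of autonomous decisions the agent made
print(final)
```

The key point of the sketch is the loop itself: nothing between `decide_next_action` and the tool call asks a human anything, which is exactly where the governance question arises.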

To get everyone on the same page, think of agentic AI as an overzealous travel agent to whom you have handed your passport and credit card information with the sole goal of an island vacation. Instead of telling you the flight has been cancelled and asking what to do next, the travel agent (agentic AI) executes a series of autonomous decisions. It cancels the entire $20,000 trip and rebooks you to a nearby island.

The AI succeeded in the most literal sense. It got you to an island. What it didn’t know, because you didn’t think to tell it, was that your friends had booked separate flights and hotels to meet you at the original destination. Without context and guardrails, the AI will do whatever it takes to satisfy the literal requirements of its goal. In Amazon’s case, the AI decided that the most efficient path to a clean environment was to delete it and start over. It solved the technical problem while completely ignoring the human consequences.

Great tech. New territory.

We cannot stress enough how new agentic AI really is. We are in the earliest chapters of understanding how these systems behave at scale, in real production environments, with real consequences. The frameworks and best practices that exist for traditional software development, such as change management, access controls, and rollback procedures, have not yet caught up to what agentic tools are capable of doing. If you don’t provide proper guardrails, the agent will do whatever it must to achieve its goal.

We shouldn’t stop advancing out of fear of new technology. The productivity gains are real. The competitive pressure is real. Organizations that implement agentic AI successfully will move faster, run leaner, and outpace those that don’t. However, success isn’t about setting up agentic AI and letting it do everything; it’s about applying the necessary human guardrails so the AI knows both the direction and the boundaries.

Forced adoption is making this worse

In the same article, there’s a detail that we would like to point out, “The company had set a target for 80% of developers to use AI for coding tasks at least once a week and was closely tracking adoption.” When adoption becomes a metric, caution becomes friction.

Imagine you are one of the engineers skeptical of agentic AI. It’s 4:00 p.m. on a Friday and you haven’t touched it once this week. In a rush to hit your internal Key Performance Indicator (KPI) and avoid a Monday morning meeting about your lack of compliance, you give Kiro full freedom to debug the latest patch you wrote, skip the final manual review, and head off to enjoy your weekend. The AI could then do exactly what happened at Amazon: decide that the most efficient path to a clean patch is to delete everything and start over. The Monday meeting is now about an outage, not the missed KPI.

The real risk is unmonitored AI

Think of an agentic AI tool as a fully autonomous self-driving car. You give it the goal of getting you to work in 10 minutes. Google Maps says it will take 20. If you give the agentic AI the goal without guardrails, it will determine that the most efficient path to your 10-minute goal is to drive over the sidewalk, through parks, and across private property. It’s not going rogue; it is simply fulfilling its objective without the context of safety, law, or common sense. Agentic AI is a literalist.

What good governance actually looks like

We sit firmly in the belief that agentic AI is the future. Future or not, we believe that until there is enough data, enough track record, and enough maturity in these systems to justify full autonomy, a human being should be the final touchpoint before anything built or fixed by an AI agent goes live. Trust is earned through evidence, and we’re still in the evidence-gathering phase.

Here is why we believe this is an urgent problem. We barely have adequate governance over standard AI tools that wait for a human to type something. We’re now talking about agentic AI that can access systems, make decisions, and execute actions before you open your laptop. If we don’t even have proper generative AI governance yet, companies should not give their agentic AI permission to write or delete anything without human approval.
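What would that look like in practice? One common pattern is an approval gate: read-only actions run freely, but anything that writes or deletes is held until a human signs off. The sketch below is a hypothetical illustration of that pattern; the action types, the `DESTRUCTIVE` verb list, and the approval callback are all names we invented for the example, not any real product's API.

```python
# Hypothetical governance checkpoint: destructive actions proposed by an
# agent are blocked until a human explicitly approves them.

# Verbs that mark an action as high-impact (illustrative list).
DESTRUCTIVE = ("delete", "drop", "recreate", "write")

def requires_approval(action):
    """True if the proposed action matches any destructive verb."""
    return any(verb in action["type"] for verb in DESTRUCTIVE)

def execute_action(action, human_approves):
    """Run safe actions immediately; gate destructive ones behind sign-off.
    `human_approves` is a callback representing the human touchpoint."""
    if requires_approval(action) and not human_approves(action):
        return {"status": "blocked", "action": action["type"]}
    return {"status": "executed", "action": action["type"]}

# A cautious default: nothing destructive is approved automatically.
deny_by_default = lambda action: False

print(execute_action({"type": "read_logs"}, deny_by_default))
# A Kiro-style "delete and recreate the environment" step would be held:
print(execute_action({"type": "delete_environment"}, deny_by_default))
```

The design choice that matters here is the default: the gate fails closed, so an agent that decides to “delete and recreate the environment” stops and waits for a human instead of acting on its own.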

Good governance isn’t a locked door; it’s well-placed checkpoints. Let’s look at one of our technology partners, CrowdStrike. Their Falcon Data Protection is built around the way people actually work today: prompting and chatting with AI chatbots such as ChatGPT, Gemini, Claude, or Copilot. CrowdStrike blocks sensitive data, credentials, and regulated information before it reaches those chatbots.

This is proven in practice. One of CrowdStrike’s own customers, the Aldo Group, shared that an employee attempted to upload a sensitive file into ChatGPT. Falcon caught it, so there was no breach, no incident report, and no hard conversation with an employee who simply made a mistake.

This is the perfect example of a guardrail, because everyone, from interns to senior executives, is using chatbots more than ever. At the end of the day, humans make mistakes. Instead of punishing an employee for a lapse in judgement, or a new hire for not knowing the protocol, we believe in providing the technology that stops the error before it’s made. Governance should be the invisible safety net that allows your team to innovate without the fear of a single click becoming a corporate catastrophe.

Agentic AI is arriving fast across our partners’ products. The tools are powerful, but so are the risks. As your Managed Service Provider, amasol designs and runs the human guardrails needed: from mandatory human approval for high-impact changes, to audit and rollback procedures, to generative AI data loss protection. Ship with agents; launch with human sign-off.

