amasol

Outages are inevitable: What you do next makes all the difference

On November 18, 2025, the internet experienced a grim reminder of how fragile our digital infrastructure can be. Cloudflare, one of the world’s largest internet infrastructure providers, suffered a widespread outage that disrupted platforms such as X (formerly Twitter), ChatGPT, Spotify, Shopify, Indeed, Zoom, Anthropic’s Claude chatbot and countless smaller websites.

Cloudflare belongs to a category of companies known as Content Delivery Networks (CDNs) and internet infrastructure providers, which together form the invisible backbone of the modern internet. These companies operate geographically distributed networks of servers and data centers that deliver content quickly and reliably to end users around the world. They act as intermediaries between websites and visitors, caching content closer to users to reduce loading times. Other major CDN providers include Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN.

What CDNs do

CDNs work by storing copies of static website content on servers spread across the globe: images, fonts, JavaScript files, video and audio files, downloadable files, and app assets such as icons, logos, and other user interface (UI) elements. Instead of every user connecting to the original server, which can be thousands of kilometers away, CDNs route requests to the nearest server. If you’re in Munich and want to stream music on Spotify, you’ll get static assets such as images, scripts, and album art from a CDN node in Frankfurt rather than from Spotify’s origin infrastructure in Sweden, which makes for a better user experience.

CDNs are like major grocery stores (Rewe, Edeka, Aldi, Kaufland) that keep what you want and need close to you. Instead of traveling across the world for every item (wine from Italy, fruit from Thailand, spices from India), you find them stocked nearby and can grab them quickly. Without the grocery store, you’d have to make those long trips yourself, which would take time, cost more, and cause delays.
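The routing idea described above can be sketched in a few lines of Python. This is a simplified illustration only: the node list and coordinates are hypothetical, and real CDNs typically decide based on network conditions (for example anycast routing or measured latency), not on geographic distance alone.

```python
import math

# Hypothetical edge locations as (latitude, longitude).
# This is an illustrative list, not any provider's real node map.
EDGE_NODES = {
    "frankfurt": (50.11, 8.68),
    "london": (51.51, -0.13),
    "new_york": (40.71, -74.01),
    "singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(user_location, nodes=EDGE_NODES):
    """Return the name of the edge node geographically closest to the user."""
    return min(nodes, key=lambda name: haversine_km(user_location, nodes[name]))


munich = (48.14, 11.58)
print(nearest_edge(munich))  # frankfurt
```

With these assumed coordinates, a user in Munich is served from the Frankfurt node, matching the Spotify example above.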

According to Business Insider, a Cloudflare spokesperson said that the company first saw “unusual traffic” to one of its services at 6:20 a.m. ET, with a status update on its website around 30 minutes later saying it was experiencing “internal service degradation.” The cause of the outage was a configuration file that is automatically generated to manage threat traffic.

In simple terms, service degradation means the systems were running but not at full capacity (like a highway with two of its three lanes closed). A configuration file is basically a set of instructions for how servers should handle traffic. According to the statement above, the automatically generated file was itself the cause of the problem rather than an attempted fix: as long as the faulty file was in use, the lanes remained closed and traffic continued to back up.
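To make the idea of a faulty configuration file more concrete, here is a minimal sketch of a safety check that validates a generated file before it is applied and falls back to the last known good version otherwise. It is an assumption-laden illustration, not a description of Cloudflare's systems: the JSON layout, the `rules` key, and the `MAX_RULES` limit are all hypothetical.

```python
import json

# Hypothetical upper bound on how many rules the traffic-handling software accepts.
MAX_RULES = 200


def validate_config(raw: str, max_rules: int = MAX_RULES) -> list[str]:
    """Return a list of problems; an empty list means the file looks safe to apply."""
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    rules = config.get("rules")
    if not isinstance(rules, list):
        return ["missing 'rules' list"]
    problems = []
    if len(rules) > max_rules:
        problems.append(f"{len(rules)} rules exceed the limit of {max_rules}")
    return problems


def choose_config(candidate: str, last_known_good: str) -> tuple[str, list[str]]:
    """Apply the candidate only if it validates; otherwise keep the previous file."""
    problems = validate_config(candidate)
    if problems:
        return last_known_good, problems
    return candidate, []


good = json.dumps({"rules": [{"name": "block-bad-bots"}]})
oversized = json.dumps({"rules": [{"name": f"rule-{i}"} for i in range(500)]})

active, problems = choose_config(oversized, last_known_good=good)
print(active == good, problems)
```

The design choice here is to reject a suspicious file and keep serving traffic with the previous one, so that a bad generated file degrades freshness rather than availability.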

The outage lasted roughly three hours rather than days or weeks because Cloudflare quickly found the root cause. Despite a flood of alerts, its engineers were able to pinpoint the issue thanks to world-class observability and detectability tools.

The real fix: unified observability & detectability

Your organization can achieve the same resilience with the right tools in place. amasol is a leading IT consultancy and managed service provider specializing in monitoring concepts that foster Usability, Observability, Detectability, and IT-Reliability. Our clients value our expertise in selecting, implementing, and operating state-of-the-art software solutions to create intuitive, high-performance, and secure IT environments.

amasol is an official partner to Dynatrace, Broadcom, Exeon, CrowdStrike, Splunk, and Keysight Technologies. All of them can help during an outage by cutting through false or non-critical alerts, pinpointing the root cause, and bringing services back online. The sections below briefly summarize how each partner’s tools can help.

Dynatrace: AI-Powered Observability

Dynatrace provides AI-powered observability across applications, infrastructure, and cloud services. Its Davis® AI uses causal AI to automatically correlate metrics, logs, traces, and events, performing real-time topological analysis to identify the precise root cause of an incident. When alert storms occur during an outage, Dynatrace consolidates related anomalies into a single problem and provides contextual remediation steps, cutting Mean Time to Repair (MTTR) by up to 90%. Beyond reactive troubleshooting, Davis AI predicts and prevents potential incidents, ensuring teams act on real problems, not symptoms.
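The idea of consolidating an alert storm into a single problem can be illustrated with a much simpler technique than causal AI: grouping alerts that arrive close together in time. The sketch below is not how Dynatrace or Davis AI works internally; the `Alert` class, the sample alerts, and the 60-second window are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since an arbitrary reference point
    source: str       # hypothetical component name
    message: str


def group_alerts(alerts, window_seconds=60.0):
    """Group alerts so each one is within window_seconds of the previous alert in its group."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window_seconds:
            groups[-1].append(alert)
        else:
            groups.append([alert])
    return groups


alerts = [
    Alert(0, "edge-proxy", "5xx rate high"),
    Alert(12, "api-gateway", "latency high"),
    Alert(35, "checkout", "timeouts"),
    Alert(600, "batch-job", "disk usage 85%"),
]

for problem in group_alerts(alerts):
    first = problem[0]
    print(f"{len(problem)} alert(s), first seen at {first.source}: {first.message}")
```

In this example the four alerts collapse into two problems: the three that fired within seconds of each other, and the unrelated disk alert ten minutes later. Time-window grouping alone cannot tell cause from symptom, which is where topology-aware correlation comes in.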

Broadcom DX NetOps & DX Spectrum: Network Fault Isolation

Broadcom DX NetOps, combined with DX Spectrum, delivers advanced network fault management and root cause analysis across complex, multi-vendor environments. These solutions automatically model network topology and apply intelligent event correlation to suppress symptomatic alarms, pinpointing the exact component responsible for service degradation or an outage, whether a device, a link, or a configuration error. By providing a single source of truth, DX Spectrum eliminates finger-pointing between teams and accelerates fault isolation, reducing mean time to repair (MTTR). We don’t know what tools Cloudflare uses to detect and pinpoint a root cause in a configuration file, but you can achieve the same with Broadcom DX NetOps.
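Topology-based suppression of symptomatic alarms can be sketched as follows. This is a generic illustration under stated assumptions, not Broadcom's algorithm: the dependency map and component names are hypothetical, and the rule used is simply "an alarm is a symptom if something it depends on is also alarming".

```python
# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON = {
    "web-app": ["load-balancer"],
    "load-balancer": ["core-switch"],
    "database": ["core-switch"],
    "core-switch": [],
}


def root_causes(alarming, depends_on):
    """Return alarming components that have no alarming dependency, directly or transitively."""
    alarming = set(alarming)

    def has_alarming_upstream(component, seen):
        for dep in depends_on.get(component, []):
            if dep in seen:
                continue  # already visited; also guards against dependency cycles
            seen.add(dep)
            if dep in alarming or has_alarming_upstream(dep, seen):
                return True
        return False

    return sorted(c for c in alarming if not has_alarming_upstream(c, {c}))


alarms = ["web-app", "database", "core-switch", "load-balancer"]
print(root_causes(alarms, DEPENDS_ON))  # ['core-switch']
```

Four alarms are reduced to one actionable component. If the core switch were healthy and only `web-app` and `database` were alarming, both would be reported, since neither explains the other.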

Exeon: Network Detection & Response

Exeon.NDR delivers advanced Network Detection & Response by leveraging AI and machine learning to analyze network metadata, not raw packets, for maximum efficiency and privacy. It detects anomalies, lateral movement, and hidden threats, even in encrypted traffic.

During an outage, uncertainty can slow recovery. Exeon helps confirm whether the disruption is purely technical or compounded by a cyberattack. Its risk-based alerting and behavioral analytics minimize false positives and enable rapid triage, ensuring security teams focus on real threats instead of chasing ghosts.

CrowdStrike Falcon: Endpoint Protection

CrowdStrike Falcon is a leading cloud-native endpoint protection platform that combines Next-Gen Antivirus (NGAV), Endpoint Detection and Response (EDR), and integrated Threat Intelligence. It continuously monitors endpoint activity, detects suspicious behavior, and enables real-time containment and remediation. If endpoints are compromised during an outage, Falcon provides full visibility into attack chains, prioritizes incidents, and allows instant isolation of infected devices, preventing further disruption and accelerating recovery.

Splunk: cutting through alert storms

Splunk delivers unmatched visibility, intelligence, and automation to help you detect, investigate, and respond to threats quickly. Its core capabilities include Security Information and Event Management (SIEM), which centralizes and correlates security data for faster detection and response, and Advanced Threat Detection, which uses machine learning and behavioral analytics to uncover hidden threats. This is especially useful during an outage because Splunk can cut through alert storms and correlate events across logs, metrics, and traces.

Keysight Technologies: proactive network monitoring

Keysight Technologies delivers advanced testing, visualization, and security solutions to ensure optimal application performance across physical and virtual networks. Its Hawkeye platform provides active network monitoring and synthetic testing, simulating real-world traffic to validate performance and continuously monitor Quality of Service (QoS) and Quality of Experience (QoE) across your IT environment. During an outage, Hawkeye enables proactive detection and rapid troubleshooting of network bottlenecks, latency, and connectivity failures, helping IT teams isolate problems and restore service as quickly as possible.

amasol as your strategic partner

These tools are powerful, but they need an expert to make them work together seamlessly and to tell you which ones you truly need, so that you don’t pay extra for tools your organization doesn’t require. We understand that budgets are tight and every technology investment must deliver measurable value, which is why amasol is a strong advocate of BizOps (Business Operations). BizOps is a mindset that connects IT performance directly to business outcomes. amasol believes in:

• Aligning technology decisions with measurable business results.
• Bridging the gap between IT operations and financial KPIs.
• Confirming that all IT investments deliver value.

amasol is a vendor-neutral managed service provider. That means we’re not tied to a single product or platform. Instead, we evaluate your specific challenges and recommend the most effective solution from our ecosystem of trusted partners. Unlike individual vendors who naturally advocate for their own tools, amasol provides objective guidance based on what will actually solve your problem.

With amasol as your strategic partner, your organization has a proactive strategy to prevent outages and, if they do happen, to resolve them within hours rather than days or weeks. We help you reduce downtime, keep your IT environment secure, and ensure it performs at its best. More than technology, we deliver clarity, resilience, and confidence in a complex digital world where one misconfigured file can shut down major platforms, as we saw with Cloudflare. Its outage lasted only three hours because the company had the right observability and detectability tools. If your organization experiences recurring outages, or you simply want to become proactive, contact us today for a no-obligation initial consultation.
