The true cost of disruptions and why monitoring AI dependencies is crucial
In the digital age, where both businesses and consumers thrive on seamless connectivity and uninterrupted service, recent major outages have raised alarms. From ChatGPT blackouts to other tech giants struggling with unexpected downtime, the financial impact of these disruptions can be enormous and extend beyond just monetary losses. According to Dun & Bradstreet, 59% of Fortune 500 companies experience at least 1.6 hours of downtime per week, which equates to weekly costs ranging from $643,200 to $1,056,000.
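As a quick sanity check on the Dun & Bradstreet figures quoted above, the weekly cost range divided by the 1.6 hours of weekly downtime yields an implied hourly downtime cost:

```python
# Back-of-the-envelope check of the quoted Dun & Bradstreet figures:
# 1.6 hours of weekly downtime at $643,200-$1,056,000 per week implies
# the hourly downtime cost computed below.

WEEKLY_DOWNTIME_HOURS = 1.6
WEEKLY_COST_LOW = 643_200
WEEKLY_COST_HIGH = 1_056_000

hourly_low = WEEKLY_COST_LOW / WEEKLY_DOWNTIME_HOURS    # 402,000.0
hourly_high = WEEKLY_COST_HIGH / WEEKLY_DOWNTIME_HOURS  # 660,000.0

print(f"Implied hourly downtime cost: ${hourly_low:,.0f} to ${hourly_high:,.0f}")
```

In other words, the quoted figures correspond to roughly $402,000 to $660,000 for every hour of downtime.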
Companies have also seen their reputations take a hit as a result of these incidents. Beyond the immediate losses lies a bigger question: how can companies effectively protect themselves from the major consequences of future disruptions? Downtime, the period during which systems are inaccessible or not functioning optimally, severely disrupts users’ access to online services, halts employee productivity, and hinders customer engagement with an organization.
Because the Internet is a complex web of interconnected systems, networks and applications, these disruptions can quickly escalate, significantly damaging an organization’s reputation. The statistics paint a bleak picture. Forrester’s 2023 Opportunity Snapshot shows that:
1/ 37% estimated that their businesses lost between $100,000 and $499,000, and 39% lost $500,000 to $999,999 due to internet outages.
2/ Disruptions also damage companies internally by increasing employee turnover (55%) and reducing workforce productivity (49%).
3/ Without sufficient visibility, companies experience an average of 76 disruptions per month.
4/ 75% of respondents said Internet Performance Monitoring (IPM) would have a significant or major positive impact on their business.
The US AI market is estimated to be worth between $87.18 billion and $167.3 billion, and its growth is causing the digital landscape to evolve at breakneck speed. The growing dependence on AI-driven applications puts a spotlight on the need for proactive monitoring against downtime. The February 14 ChatGPT outage affected both the ChatGPT service and customers running GPT-based chatbots via an API. Monitoring AI dependencies will be critical for all businesses, from startups to enterprises.
Co-founder and CEO of Catchpoint.
An example: the Adobe Experience Cloud outage
In December 2023, Adobe’s extensive customer base was hit by a series of Adobe Experience Cloud outages lasting 18 hours. While AI had not yet been added to the platform, many companies are coming to rely more and more on the technology, and this outage serves as an example of what could happen once AI becomes more deeply embedded. Overall, the Adobe Experience Cloud outage highlights the vulnerabilities inherent in relying on third-party services within a digital infrastructure. The disruption, caused by a failure in Adobe’s cloud infrastructure, resulted in significant service outages, impacting critical functions across multiple platforms.
According to Adobe, data collection (Segment Publishing), data processing (Cross-Device Analytics, Analytics Data Processing), and reporting applications (Analysis Workspace, Legacy Report Builder, Data Connectors, Data Feeds, Data Warehouse, Web Services API) were all affected by the outage. During the outage, users experienced degraded performance across Adobe services. A post-mortem traced the root cause to issues within Adobe’s cloud infrastructure, which led to latency spikes and longer load times for users.
The failure within Adobe’s infrastructure had far-reaching consequences, impacting businesses and users who depended on Adobe services for their daily operations. Additionally, Adobe risked Service Level Agreement (SLA) violations for millions of customers. An SLA sets a clear time frame within which tickets, chats, and calls must be answered; if they are not answered within that window, an SLA violation occurs, payouts often follow, and customer loyalty is put to the test.
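The SLA logic described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical one-hour first-response window; the function name and window are not from any specific vendor contract:

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical first-response SLA window; real contracts vary.
SLA_RESPONSE_WINDOW = timedelta(hours=1)

def breaches_sla(opened_at: datetime,
                 first_response_at: Optional[datetime],
                 now: datetime) -> bool:
    """A ticket breaches the SLA if it was not answered within the window."""
    if first_response_at is None:
        # Still unanswered: a breach once the window has already elapsed.
        return now - opened_at > SLA_RESPONSE_WINDOW
    return first_response_at - opened_at > SLA_RESPONSE_WINDOW

opened = datetime(2023, 12, 1, 9, 0)
print(breaches_sla(opened, opened + timedelta(minutes=90), opened + timedelta(hours=2)))  # True
print(breaches_sla(opened, opened + timedelta(minutes=30), opened + timedelta(hours=2)))  # False
```

During an 18-hour outage, every ticket, chat, and call held up by the disruption can cross a window like this, which is how a single infrastructure failure fans out into SLA payouts at scale.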
The Adobe outage was more than a disruption: it served as a wake-up call for companies using Adobe’s services to reevaluate their broader approach to digital resilience. The scale of the outage, which impacted so many Adobe services, is a valuable reminder that businesses must make contingency plans and take proactive measures to protect against future disruptions.
So how can companies better manage the risks and create a robust path to internet resilience? This requires a fundamental shift toward real-time visibility into application performance, which can surface potential bottlenecks and other pain points before they turn into a full-blown crisis. By monitoring AI (or other) dependencies with laser-like precision, organizations can preemptively address vulnerabilities, strengthen their digital infrastructure, and limit the impact of unforeseen disruptions.
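A minimal dependency probe of the kind described might look like the sketch below. The endpoint URL and latency budget are illustrative placeholders, not any specific vendor's API, and a probe passes only if the dependency is both up and fast enough to be useful:

```python
import time
import urllib.request
from urllib.error import URLError

# Placeholder endpoint and budget; substitute your own dependency's values.
ENDPOINT = "https://ai-provider.example.com/v1/health"
LATENCY_BUDGET_S = 2.0

def is_healthy(status_code: int, latency_s: float,
               budget_s: float = LATENCY_BUDGET_S) -> bool:
    """A probe passes only on a 2xx response that arrives within budget."""
    return 200 <= status_code < 300 and latency_s <= budget_s

def probe(url: str, timeout: float = 5.0) -> bool:
    """Run one probe; False on error, timeout, or an over-budget response."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except URLError:
        return False
    return is_healthy(status, time.monotonic() - start)

print(is_healthy(200, 0.5))   # fast 2xx response: True
print(is_healthy(503, 0.5))   # dependency erroring: False
print(is_healthy(200, 3.0))   # up, but too slow to be useful: False
```

The point of the latency condition is that "up but degraded" is still a disruption from the end user's perspective, which uptime-only checks miss.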
Downtime protection in the age of AI
There is no denying that in today’s fiercely competitive landscape, even the briefest interruption of service poses a major risk to consumer confidence and trust in brands. To counter these risks, organizations must take a proactive approach to performance monitoring, especially when it comes to AI-powered applications that are quickly becoming part of daily business operations. Unlike traditional applications, AI-driven systems often work autonomously and make split-second decisions based on enormous amounts of data.
Any disruption to these systems can lead to a cascade of errors and delays, resulting in disruption of user interactions and ultimately a loss of trust in the brand. Real-time visibility into application performance allows companies to quickly detect anomalies, optimize functionality, and maintain seamless user interactions. The ability to quickly identify and address issues as they arise allows IT teams to maintain operational continuity and limit potential damage.
Predictive analytics and AI-powered anomaly detection play a critical role in preemptively identifying potential issues before they disrupt end-user experiences. As the reliance on AI technologies continues to grow, uninterrupted service will only become a more essential business necessity. Still, achieving early detection can be a challenge.
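As a simple statistical stand-in for the anomaly detection described above, a rolling z-score check can flag a latency sample that deviates sharply from the recent baseline. The window size and threshold here are assumptions for illustration, not a product default:

```python
from collections import deque
import statistics

class LatencyAnomalyDetector:
    """Flag samples more than k standard deviations from the rolling mean."""

    def __init__(self, window: int = 60, k: float = 3.0):
        self.samples = deque(maxlen=window)  # rolling baseline of recent samples
        self.k = k

    def observe(self, latency_ms: float) -> bool:
        """Record a sample; return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= 10:  # wait for a minimal baseline
            mean = statistics.fmean(self.samples)
            stdev = statistics.pstdev(self.samples)
            if stdev > 0 and abs(latency_ms - mean) > self.k * stdev:
                anomalous = True
        self.samples.append(latency_ms)
        return anomalous

det = LatencyAnomalyDetector()
for ms in [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]:
    det.observe(ms)
print(det.observe(100))  # steady traffic: False
print(det.observe(500))  # sudden spike: True
```

Production systems typically layer smarter models on top, but even a baseline like this catches the latency spikes that precede many user-visible outages.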
Many businesses still rely on basic uptime monitoring, often limited to monitoring just their homepage, leaving them vulnerable to periodic or partial site outages when an AI-dependent service goes down. To defend against AI-induced downtime, organizations must implement holistic monitoring strategies, such as Internet Performance Monitoring (IPM), that cover the entire spectrum of AI-driven applications, from the frontend interfaces to the backend data processing pipelines.
By proactively monitoring AI dependencies and deploying robust performance management frameworks, companies can mitigate the risks of costly downtime and maintain operational continuity in an increasingly AI-driven landscape. Consider this a call to action: think ahead, anticipate these challenges, and equip operations teams to manage them.
This article was produced as part of Ny BreakingPro’s Expert Insights channel, where we profile the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of Ny BreakingPro or Future plc. If you are interested in contributing, you can read more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro