CrowdStrike Reveals What Went Wrong — And It’s Pretty Much What We Expected
CrowdStrike has released initial findings on the root cause of the recent incident that affected millions of Windows devices worldwide.
In a preliminary Post Incident Review (PIR), the company acknowledged that a faulty content configuration update caused Windows machines to crash on a massive scale on July 19.
The incident, believed to have affected 8.5 million Windows machines, occurred after a routine update intended to improve telemetry for detecting new threat techniques. In this case, the problematic update led to out-of-bounds memory reads, triggering the infamous blue screen of death.
CrowdStrike provides more details on the recent outage
The issue affected Windows hosts running sensor version 7.11 and later that were online between 04:09 and 05:27 UTC on the day of the incident.
CrowdStrike CEO George Kurtz apologized and stressed that this was not the result of a cyberattack, but rather an internal software issue. He assured customers that steps are being taken to prevent similar issues in the future.
The core of the issue lies in Rapid Response Content, which is designed to update the sensor's threat detection capabilities dynamically without changing the sensor code. The problematic update contained two new IPC Template Instances that were intended to detect attacks that abuse Named Pipes.
However, due to a bug in the Content Validator, one of these instances passed validation despite containing problematic content data. When the sensor loaded that instance, it triggered the out-of-bounds memory read that caused the crashes.
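To make the failure mode concrete, here is a minimal Python sketch of the general idea: a validator that checks a content record before it ships, plus a reader that refuses to index past the end of the data. This is not CrowdStrike's code. The names `TemplateInstance`, `validate`, `read_field`, and the field count of 3 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical: how many fields the consuming code expects to read.
EXPECTED_FIELDS = 3


@dataclass
class TemplateInstance:
    """Illustrative stand-in for a content record shipped to the sensor."""
    name: str
    fields: list[str]


def validate(instance: TemplateInstance, expected: int = EXPECTED_FIELDS) -> bool:
    """Accept the record only if it supplies every field the reader will touch."""
    return len(instance.fields) >= expected


def read_field(instance: TemplateInstance, index: int) -> str:
    """Bounds-checked read: fail loudly instead of reading past the data."""
    if index < 0 or index >= len(instance.fields):
        raise ValueError(f"field {index} is out of bounds for {instance.name}")
    return instance.fields[index]


good = TemplateInstance("pipe-rule-a", ["pattern", "action", "severity"])
bad = TemplateInstance("pipe-rule-b", ["pattern", "action"])  # one field short

print(validate(good))  # record matches what the reader expects
print(validate(bad))   # record is rejected before it can be deployed
```

In a memory-safe language like Python, a bad index raises an exception. In kernel-mode code written in C or C++, the same mistake can read invalid memory and bring down the whole system, which is why both the pre-release check and the runtime bounds check matter.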
In response to the recent, widespread issues, CrowdStrike’s PIR outlines a number of steps to improve testing and deployment processes to prevent recurrence. These include more rigorous testing, phased deployment, enhanced monitoring, and more control for customers over their updates.
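Phased deployment is easiest to picture as a loop over progressively larger groups of machines that stops at the first sign of trouble. The Python sketch below shows that general pattern under stated assumptions: the function `staged_rollout`, the stage fractions, and the `is_healthy` callback are hypothetical, and nothing here reflects how CrowdStrike actually ships updates.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    stages: Sequence[float],
    is_healthy: Callable[[str], bool],
) -> tuple[list[str], bool]:
    """Deploy to growing fractions of the fleet, halting on the first unhealthy batch.

    Returns the hosts that received the update and whether the rollout completed.
    """
    deployed: list[str] = []
    start = 0
    for fraction in stages:
        # Cumulative cutoff for this stage, never moving backwards.
        end = max(start, int(len(hosts) * fraction))
        batch = list(hosts[start:end])
        deployed.extend(batch)
        # Hypothetical health signal, e.g. "host checked in after the update".
        if not all(is_healthy(host) for host in batch):
            return deployed, False  # stop before the rest of the fleet is touched
        start = end
    return deployed, True


fleet = [f"host{i}" for i in range(10)]
# Pretend host3 crashes after receiving the update.
reached, completed = staged_rollout(fleet, (0.1, 0.5, 1.0), lambda h: h != "host3")
print(len(reached), completed)
```

With this toy fleet, the rollout halts in the second stage, so half of the machines never receive the bad update. A push to every host at once offers no such stopping point.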
Further details are expected in the full Root Cause Analysis, which the company has pledged to make public. In the meantime, CrowdStrike says it is working with impacted customers to continue restoring normal operations.