Cybersecurity firm CrowdStrike on Wednesday blamed a problem in its verification system that caused millions of Windows devices to crash as part of widespread blackout at the end of last week.
“On Friday, July 19, 2024 at 04:09 UTC, as part of regular operations, CrowdStrike released a content configuration update for the Windows sensor to collect telemetry data about possible new threat methods,” the company said in a statement. said in its preliminary post-incident review (PIR).
“These updates are a regular part of the dynamic protection mechanisms of the Falcon platform. A faulty Rapid Response Content configuration update caused Windows to crash.”
The incident affected Windows hosts with sensors version 7.11 and higher that were online between July 19, 2024, 04:09 UTC and 05:27 UTC and received the update. Apple’s macOS and Linux systems are unaffected.
CrowdStrike reports that it delivers security content configuration updates in two ways: one through the sensor content that comes with the Falcon sensor, and the other through Rapid Response Content, which allows it to flag new threats using different behavioral pattern matching methods.
The crash is said to be the result of a quick response content update that contained a previously undetected bug. It should be noted that such updates are delivered in the form of behavior-specific template instances – which map to specific template types – to include new telemetry and detection.
Template instances are in turn created using the content configuration system, then deployed to the sensor via the cloud via a mechanism called Channel Files, which are ultimately written to disk on the Windows machine. The system also includes a Content Validator component that validates content before it is published.
“Rapid response content provides visibility and detection on the sensor without the need to change sensor code,” it explained.
“This capability is used by threat detection engineers to collect telemetry data, identify indicators of adversary behavior, and conduct detection and prevention. Quick Response Content is a behavioral heuristic separate and distinct from CrowdStrike’s sensor-based AI prevention and detection capabilities.”
These updates are then analyzed by Falcon’s sensor content interpreter, which then helps the sensor detection engine detect or prevent malicious activity.
While each new pattern type is stress tested for various parameters such as resource usage and performance impact, the root cause of the problem, according to CrowdStrike, can be traced back to the February 28, 2024 deployment of the Interprocess Communication (IPC) pattern type, which was introduced for flag attacks which said pipes.
The timeline of events is as follows –
- February 28, 2024 – CrowdStrike releases sensor 7.11 to customers with new IPC template type
- March 5, 2024 – The IPC template type is stress tested and validated for use
- March 5, 2024 – IPC template instance released to production via channel file 291
- April 8 – 24, 2024 – Three more instances of IPC templates deployed in production
- July 19, 2024 – Deployed two additional instances of the IPC template, one of which passes validation despite having problematic content data
“Based on testing prior to Template Type’s initial deployment (March 5, 2024), confidence in the checks performed in Content Validator, and previous successful IPC Template Instance deployments, these instances have been deployed to production,” CrowdStrike said. .
“Once received by the sensor and loaded into the Content Interpreter, the problematic content in Channel File 291 resulted in an out-of-memory read, causing an exception. This unexpected exception could not be handled properly, resulting in a Windows operating system crash ( BSoD).”
In response to Art breakdowns caused by the crash and to prevent it from happening again, the Texas-based company said it has improved its testing processes and refined the error handling mechanism in Content Interpreter. It also plans to implement a strategy of phased deployment of rapid response content.