Performance drift in a nationally deployed population health risk algorithm in the US Veterans Health Administration
This retrospective cohort study examined performance drift in the Veterans Health Administration's Care Assessment Needs (CAN) algorithm, a nationally deployed risk tool used to guide clinical decisions and allocate resources for more than 5 million veterans annually. Analyzing over 27 million observations across more than 7 million unique veterans, the study found that between 2016 and 2021 the algorithm's positive predictive value decreased by 4 percentage points and its false positive rate increased by 0.34 percentage points, resulting in over 18,000 additional false positives. Shifts in demographics, healthcare utilization patterns, and laboratory values, particularly during the COVID-19 pandemic, were identified as significant drivers of this decline.
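To make the reported metrics concrete, the sketch below shows how positive predictive value (PPV) and false positive rate (FPR) are computed from confusion-matrix counts, and how even a small FPR shift scales to thousands of extra false positives at national population size. The counts here are hypothetical illustrations, not the study's actual data; the population figure is an assumption chosen only to show the order of magnitude.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)

def fpr(fp: int, tn: int) -> float:
    """False positive rate: FP / (FP + TN)."""
    return fp / (fp + tn)

# Hypothetical number of truly low-risk veterans screened per year
# (an assumption for illustration, not a figure from the study).
negatives = 5_500_000
delta_fpr = 0.0034  # a 0.34-percentage-point FPR increase, as reported

extra_false_positives = negatives * delta_fpr
print(f"{extra_false_positives:,.0f} additional false positives")
# → 18,700 additional false positives
```

At this scale, a shift that looks negligible in relative terms translates directly into tens of thousands of patients flagged incorrectly, which is why the study's drift figures matter for resource allocation.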
These findings suggest that close surveillance of clinical risk algorithms, and of the quality metrics derived from them, is essential, as performance degradation can go undetected for years after deployment. The study highlights a critical and often overlooked challenge in AI-driven healthcare: algorithms trained on historical data may quietly become less reliable as patient populations and care patterns evolve, with real consequences for clinical decision-making and resource allocation.