Mechanistic interpretability of reinforcement learning in Medicaid care coordination

Objective

To expose reasoning pathways of a reinforcement learning policy for Medicaid care coordination, develop an error taxonomy and implement fairness-aware guardrails.

Design

Retrospective interpretability audit using attention analysis, Shapley explanations, sparse autoencoder feature discovery and blinded clinician adjudication.

Setting

Medicaid care coordination programmes in Washington, Virginia and Ohio (July 2023–June 2025).

Participants

250 000 intervention decisions; 200 divergent cases reviewed by five clinicians.

Main outcome measures

Calibrated harm prediction; algorithmic clearance and residual harm rates; error taxonomy frequencies; subgroup fairness metrics.

Results

The conformal model achieved an area under the receiver operating characteristic curve of 0.80 (95% CI 0.78 to 0.82), clearing 89.5% (95% CI 88.9% to 90.1%) of decisions with 1.22% (95% CI 1.14% to 1.30%) residual harm versus 6.67% (95% CI 6.02% to 7.32%) for flagged decisions. Sparse autoencoders identified seven reasoning motifs linking social determinants to clinical cascades. The error taxonomy revealed premise errors (48%, 95% CI 41% to 55%), calibration failures (27%, 95% CI 21% to 33%) and contextual blind spots (25%, 95% CI 19% to 31%). Divergence was higher for telehealth visits (11.2%) and behavioural health patients (10.7% vs 6.9%, p<0.001). Fairness optimisation reduced race-group disparity by 37% (95% CI 22% to 48%) and sex-group disparity by 28% (95% CI 14% to 39%). Reviewers rated 23% (95% CI 17% to 29%) of overridden recommendations as well-matched, confirming appropriate human oversight.
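
For readers unfamiliar with the clearance and residual harm rates reported above, the arithmetic can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names, the fixed risk threshold and the toy data are all assumptions, the plain threshold stands in for whatever conformal calibration the authors used, and the Wilson score interval is only one of several ways a 95% CI for a proportion might be obtained.

```python
import math
from typing import NamedTuple, Sequence


class Rate(NamedTuple):
    estimate: float
    lower: float
    upper: float


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Rate:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return Rate(p, centre - half, centre + half)


def clearance_summary(
    risk: Sequence[float], harmed: Sequence[int], threshold: float
) -> dict[str, Rate]:
    """Split decisions at a risk threshold and summarise each group.

    Hypothetical helper: decisions with predicted harm risk below
    `threshold` count as "cleared", the rest as "flagged". Raises
    ValueError if either group is empty.
    """
    cleared = [h for r, h in zip(risk, harmed) if r < threshold]
    flagged = [h for r, h in zip(risk, harmed) if r >= threshold]
    return {
        # Share of all decisions that were cleared without review.
        "clearance": wilson_interval(len(cleared), len(risk)),
        # Harm that still occurred among cleared decisions.
        "residual_harm_cleared": wilson_interval(sum(cleared), len(cleared)),
        # Harm among decisions that were flagged for review.
        "harm_flagged": wilson_interval(sum(flagged), len(flagged)),
    }


# Toy data (invented for illustration only): ten decisions, threshold 0.10.
risk = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.30, 0.40]
harmed = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
summary = clearance_summary(risk, harmed, threshold=0.10)

for name, rate in summary.items():
    print(f"{name}: {rate.estimate:.3f} ({rate.lower:.3f} to {rate.upper:.3f})")
```

In this toy example eight of ten decisions are cleared, one of the eight cleared decisions is followed by harm, and one of the two flagged decisions is followed by harm. A useful guardrail is one where the residual harm rate among cleared decisions sits well below the rate among flagged decisions, which is the pattern the results above describe.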

Conclusions

Mechanistic interpretability transforms opaque algorithmic assistance into auditable decision support, providing a governance scaffold for clinical artificial intelligence deployment.

Basu, S., Patel, S., Sheth, P., Muralidharan, B., Elamaran, N., Kinra, A., Batniji, R.
