The year 2025 saw major advances in the reasoning capabilities of large language models, with models now producing explicit reasoning trajectories before giving a final answer. However, intermediate reasoning ...