Microsoft researchers warned that “even frontier models” corrupt an average of 25% of document material during extended workflows.

Testing 19 LLMs on a bespoke set of work environments across 52 domains, the researchers found that, as tasks progressed, the models degraded on average 50% of the material they were given to work with.
