After "chaotic sprint" attempts that relied on rule-breaking human prompting, OpenAI believes it can now beat First Proof-style challenges with a simple prompting strategy.
OpenAI is teasing an unreleased model it says is capable of groundbreaking mathematical research, complete with a prompting strategy to automatically generate proofs.
On Friday, OpenAI published an updated report of its submission to the First Proof challenge, which features questions developed by eminent mathematicians to test the practical research capabilities of AI.
The 10 First Proof problems have human-derived answers that were never published, preventing the models from proposing solutions found in other materials and presenting them as novel.
Under the challenge rules, humans cannot assist the models.
OpenAI claims its unreleased model may have solved five of the 10 questions. However, researchers prompted the model during those attempts, which the challenge rules prohibit, and did not keep clear records of the transcripts.
After releasing their initial findings, OpenAI researchers said on Friday that they have since come up with a technique that would produce similar results without human intervention. However, they have tested it on only one of the proofs.