Rohan Paul / @rohanpaul_ai:
[Thread] A US paper shows the best frontier LLMs solve 0% of hard coding problems from Codeforces, ICPC, and IOI, domains where expert humans still excel: This is really BAD news for LLMs' coding skills. ☹️ The best frontier LLMs achieve 0% on hard real-life programming contest problems, domains where expert humans still excel. LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI ("International ...
Source: TechMeme
Source Link: http://www.techmeme.com/250617/p7#a250617p7