SWE-bench Verified no longer measures frontier coding capabilities

Hacker News Best, 26 April 2026
Score: 63 (Relevance 3/25, Freshness 25/25, Authority 18/20, Brand Signal 15/15, Depth 2/15)
Article URL: https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/
Comments URL: https://news.ycombinator.com/item?id=47910388
Points: 211
Comments: 124