
Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview

Hacker News Best 27 April 2026 10h ago
Scored 65.2% vs. Google's official 47.8% and the existing top closed-source model Junie CLI's 64.3%. Since there have been a lot of reports of deliberate cheating on TerminalBench 2.0 lately (https://debugml.github.io/cheating-agents/), I would also like to clarify a few things:
1. Absolutely no {agents/skills}.md files were inserted at any point. No cheating mechanisms whatsoever.
2. The CLI agent was