anthropic opus - Search News

Qodo Command Enters AI Coding Agent Wars With 71.2% SWE-Bench Score

AI startup Qodo has entered the fierce “benchmark war” for coding supremacy. On August 11, the company announced its new agent, Qodo Command, scored an impressive 71.2% on the SWE-bench Verified test.

13m

ChatGPT-5 underwhelming you? Here’s what it can do that older models couldn’t—and where other AI chatbots still shine

Last Thursday, OpenAI launched the latest version of its hyper-popular AI chatbot, ChatGPT. Sam Altman, OpenAI’s CEO, made ...

58m

OpenAI's performance charts in the GPT-5 launch video are such a mess you have to think GPT-5 itself probably made them, and the company's attempted fixes raise even more questions

OpenAI has since posted some updated charts on its website. The new deception rate chart certainly suggests that a mere mistake was made. The revised stats show GPT-5's coding deception rate at 16.5%, ...

The Information · 58m

Anthropic’s New Safety Lead; The Startup Making Nvidia’s Software Easier To Use

We might be a long way away from sipping Pina Coladas on a beach while AI-powered humanoid robots handle all our work. But we ...
