Anthropic's Claude Opus 4.5 is two-thirds cheaper and beats every human applicant on the company's own coding exam.
In a move that significantly escalates the artificial intelligence arms race, Anthropic has unleashed Claude Opus 4.5, a model that not only undercuts its predecessor's price by a staggering two-thirds but also demonstrates a qualitative leap in reasoning that has seen it outperform every human candidate on the company's own rigorous engineering assessment.

This isn't merely an incremental update; it's a strategic volley fired directly at the fortresses of OpenAI and Google, backed by Amazon's deep pockets. The new pricing structure ($5 per million input tokens and $25 per million output tokens, down from $15 and $75) radically democratizes access to frontier AI capabilities: a calculated gambit to capture the developer and enterprise market by making powerful AI both superior and more affordable.

The core of Anthropic's claim rests on formidable benchmark results. On the demanding SWE-bench Verified, which measures real-world software engineering tasks, Opus 4.5 achieved 80.9% accuracy, edging out OpenAI's recently released GPT-5.1-Codex-Max (77.9%) and Google's Gemini 3 Pro (76.2%). More startling, however, is its performance on Anthropic's internal take-home exam for prospective performance engineers. Using a technique called parallel test-time compute, which aggregates multiple model attempts, Opus 4.5 scored higher than any human applicant in the company's history, a milestone that forces a serious conversation about the future of software engineering as a profession.

As Alex Albert, Anthropic's head of developer relations, noted, this is a clear signal of the models' escalating utility in a work context, particularly in fields like engineering where they are already pulling ahead. But the raw scores only tell part of the story. The more profound advancement, according to early testers, is in the model's emergent judgment and intuition.
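The "parallel test-time compute" technique mentioned above can be sketched in miniature. Anthropic has not published its exact aggregation method; best-of-n sampling under an external scoring function, shown here with stand-in functions (`solve_once` and `score` are hypothetical placeholders, not real API calls), is one common form of the idea:

```python
# A minimal sketch of parallel test-time compute: sample several independent
# attempts at the same task and keep the best one under a scoring function.
import random

def solve_once(task: str, rng: random.Random) -> str:
    """Stand-in for one model attempt (hypothetical; not a real API call)."""
    return f"candidate solution {rng.randint(0, 9)} for {task!r}"

def score(solution: str) -> int:
    """Stand-in verifier, e.g. the number of tests a coding attempt passes."""
    return int(solution.split()[2])  # toy metric: the embedded digit

def best_of_n(task: str, n: int, seed: int = 0) -> str:
    """Aggregate n attempts; in practice these would run in parallel."""
    rng = random.Random(seed)
    attempts = [solve_once(task, rng) for _ in range(n)]
    return max(attempts, key=score)

print(best_of_n("fix the failing test", n=8))
```

The aggregate score can only improve as n grows, which is why a model can beat its own single-attempt performance on an exam scored this way.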
Developers report that Opus 4.5 simply 'gets it,' demonstrating a nuanced understanding of context and priority that allows for delegation of more complex, multi-step tasks, such as synthesizing information from Slack and internal documents into coherent summaries. This shift from a tool that retrieves information to one that synthesizes and prioritizes it represents a fundamental change in human-AI collaboration.

Beyond sheer intelligence, Anthropic is betting big on efficiency. The company claims Opus 4.5 achieves similar or better outcomes using dramatically fewer computational resources. At a medium effort level, it matched the previous Sonnet 4.5 model's best score on SWE-bench while using 76% fewer output tokens. This efficiency is compounded by a new 'effort parameter' that gives developers granular control over the trade-off between performance, latency, and cost, a crucial feature for applications running at scale. Early enterprise validations support these claims, with Replit's president highlighting how the efficiency 'compounds at scale,' and GitHub's chief product officer noting the model's particular aptitude for code migration and refactoring.

Perhaps the most futuristic capability showcased is the emergence of 'self-improving agents.' Rakuten, the Japanese e-commerce giant, reported that agents built on Opus 4.5 were able to autonomously refine their own capabilities, achieving peak performance in just four iterations where other models failed to match the quality after ten. It's critical to clarify that the model isn't performing online learning or updating its own foundational weights; rather, it's iteratively improving the tools and approaches it uses for a specific task, a form of meta-reasoning that hints at a new paradigm for automated skill optimization.
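The distinction matters, so the self-improvement loop is worth sketching: the weights never change; only the agent's instructions and tooling are rewritten between attempts based on an evaluation signal. Everything below is hypothetical (this is not Rakuten's or Anthropic's implementation, and `run_agent` and `refine` are toy stand-ins for model calls):

```python
# Sketch of a self-improving agent loop: evaluate, rewrite the approach,
# repeat until a quality target is hit or the iteration budget runs out.

def run_agent(instructions: str) -> float:
    """Stand-in for executing the agent once and scoring the outcome (0..1)."""
    # Toy model: each refinement pass bumps quality, capped at 1.0.
    return min(1.0, 0.4 + 0.15 * instructions.count("[refined]"))

def refine(instructions: str, last_score: float) -> str:
    """Stand-in for asking the model to improve its own approach/tools."""
    return instructions + " [refined]"

def self_improve(instructions: str, max_iters: int = 10, target: float = 0.95):
    history = []
    for _ in range(max_iters):
        s = run_agent(instructions)
        history.append(s)
        if s >= target:          # peak performance reached; stop iterating
            break
        instructions = refine(instructions, s)
    return instructions, history

_, history = self_improve("summarize the ticket queue")
print(f"converged after {len(history)} iterations: {history}")
```

The model's edge in such a loop comes from how quickly `refine` converges, which is what the four-versus-ten-iterations comparison is measuring.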
This extends beyond code into professional document creation, spreadsheets, and presentations, with users reporting this generational jump is the largest they've witnessed.

The release is accompanied by a suite of product updates aimed squarely at enterprise integration, including general availability of Claude for Excel with pivot table and chart support, and the introduction of 'infinite chats,' which effectively eliminates context window limitations through intelligent conversation summarization. For developers, programmatic tool calling allows Claude to write and execute code that invokes functions directly, further blurring the line between assistant and autonomous agent.

This aggressive push comes as Anthropic rockets to $2 billion in annualized revenue, with an eightfold year-over-year increase in customers spending over $100,000. The frenetic pace of releases (Opus 4.5 follows Haiku 4.5 in October and Sonnet 4.5 in September) mirrors the broader industry tempo, with OpenAI and Google in a relentless sprint. Albert candidly admitted that Anthropic is now using Claude to accelerate its own development, a recursive loop that promises to further compress innovation cycles.

While this competition delivers ever-improving capabilities at falling prices, the profitability of leading AI labs remains a distant horizon, overshadowed by massive investments in compute and talent. The projection of a trillion-dollar AI market within a decade feels increasingly plausible, yet no single entity has established dominance. With Opus 4.5, Anthropic has not just released a better model; it has fired a decisive shot in the battle to define the infrastructure of the next computing era, forcing the industry to confront the reality that AI is no longer just a tool, but an emerging peer in complex cognitive domains.
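To make the programmatic tool-calling idea concrete: instead of emitting one JSON tool call per round trip, the model writes a small script that chains registered functions directly, and the host executes it in a restricted namespace. The tool names and the "model-emitted" script below are purely illustrative, not Anthropic's actual API:

```python
# Sketch of programmatic tool calling with hypothetical host-side tools.

def get_orders(customer: str) -> list[dict]:
    """Hypothetical tool: look up a customer's orders (stub data here)."""
    return [{"id": 1, "total": 40.0}, {"id": 2, "total": 60.0}]

def notify(message: str) -> str:
    """Hypothetical tool: send a notification."""
    return f"sent: {message}"

TOOLS = {"get_orders": get_orders, "notify": notify}

# In practice this script would come back from the model; hard-coded here.
model_script = """
orders = get_orders("acme")
total = sum(o["total"] for o in orders)
result = notify(f"acme total: {total:.2f}")
"""

def run_tool_script(script: str) -> dict:
    """Execute model-written code with only the registered tools in scope."""
    namespace = {"__builtins__": {"sum": sum}, **TOOLS}
    exec(script, namespace)  # a real host would sandbox this far more strictly
    return namespace

ns = run_tool_script(model_script)
print(ns["result"])  # → sent: acme total: 100.00
```

The appeal is that one executed script replaces several model round trips, which is exactly the kind of token and latency saving the efficiency claims above are about; the cost is that the host must sandbox model-written code carefully.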
#Anthropic
#Claude Opus 4.5
#AI model
#coding performance
#price reduction
#featured