Apr 16, 2026

Claude Performance Decline: Inside Anthropic's 'Effort Level' Controversy and User Backlash

Anthropic faces growing backlash as developers document Claude's performance regression, traced to a quiet reduction in default 'effort' level to conserve compute.

#Anthropic #Claude #Performance #EffortLevel #DeveloperExperience

Introduction

In mid-April 2026, Anthropic found itself at the center of a growing user revolt. Developers and enterprise customers reported significant performance degradation across Claude models, with complaints ranging from sloppy code generation to an inability to follow multi-step instructions. The root cause, as confirmed by Anthropic executive Boris Cherny, was a deliberate reduction in Claude's default "effort" level from high to medium — a cost-saving measure that quietly altered how the model processes every request. With Anthropic's annualized revenue reaching $30 billion and an IPO valuation of approximately $380 billion, the controversy raises fundamental questions about the tension between scaling a business and maintaining product quality.

Feature Overview

What Changed: The Effort Level Reduction

At the core of the controversy is Anthropic's decision to lower Claude's default effort parameter. This setting controls how many tokens the model processes when generating a response — essentially, how hard it "thinks" before answering. By reducing the default from high to medium, Anthropic cut per-request compute costs at the expense of response depth and accuracy.

Boris Cherny, the executive leading Claude Code, explained that the change responded to user feedback that Claude was "consuming too many tokens per task." However, critics argue this justification conflates two different user segments: casual users who prefer faster, cheaper responses and professional developers who depend on thorough, accurate outputs for production workflows.

Documented Regression: The 6,852-Session Analysis

The most detailed evidence of regression came from Stella Laurenzo, Senior Director in AMD's AI group. Laurenzo analyzed 6,852 Claude Code sessions comprising 17,871 thinking blocks and 234,760 tool calls. Her findings documented a clear behavioral shift from late February into early March 2026:

  • Claude changed from a "research-first" approach (reading context thoroughly before acting) to an "edit-first" style (making changes with minimal context review)
  • The model reads less context before acting on requests
  • Error rates increased, requiring significantly more user intervention
  • Complex engineering workflows that previously completed reliably began failing

Laurenzo concluded that "Claude has regressed to the point it cannot be trusted to perform complex engineering."

Broader User Complaints

The complaints extended well beyond a single user. Microsoft researcher Dimitris Papailiopoulos reported "incredibly frustrating sessions with Claude Code" with the model being "extremely sloppy." Multiple developers on social media documented similar patterns: Claude taking shortcuts instead of thorough analysis, failing to follow multi-step instructions, and generating code that required extensive manual correction.

An independent analysis by one researcher claimed a 67% drop in Claude Code's task completion rate, though Anthropic has not confirmed or denied this specific figure.

Anthropic's Response

Anthropic's official position includes several points:

  1. The changelog documented the effort level change
  2. The change responded to legitimate user feedback about excessive token consumption
  3. Teams and Enterprise accounts would be defaulted to "high effort" going forward
  4. The company denied purposely degrading performance

However, the response has not fully addressed user concerns. Many pointed out that a changelog entry is insufficient notification for a change that fundamentally alters model behavior. Others questioned whether the effort change alone explains the magnitude of observed regression.

Usability Analysis

For individual developers on Pro plans, the practical impact has been substantial. Workflows that previously required minimal oversight now demand constant monitoring and correction. The shift from research-first to edit-first behavior is particularly damaging for complex engineering tasks where understanding the full context before making changes is essential.

For enterprise teams, Anthropic's commitment to default high effort on Teams and Enterprise plans partially mitigates the issue. However, the trust damage extends beyond the technical fix. Enterprise customers choosing an AI provider for mission-critical workflows need confidence that model behavior will not be silently altered for cost optimization.

The workaround for affected users is straightforward: explicitly set the effort parameter to "high" in API calls or Claude Code configurations. But the fact that users must actively work around a degradation introduces friction and erodes the seamless experience that drove Claude's adoption.
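The workaround described above can be sketched in Python. Note that the exact field name and placement of the effort setting in the Messages API request (a top-level `"effort"` key here) are assumptions for illustration, not a confirmed API shape; the point is to pin the value explicitly rather than rely on the server default.

```python
import json

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a Messages API payload with an explicit effort setting.

    The "effort" field location is a hypothetical sketch; consult the
    provider's API reference for the actual parameter name.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "claude-sonnet-4-5",   # pin a specific model, not a floating alias
        "max_tokens": 4096,
        "effort": effort,               # set explicitly; do not rely on the default
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Refactor this module to remove the global state.")
print(json.dumps(payload, indent=2))
```

Pinning the value in code (or in a checked-in configuration file) means a future change to the service-side default cannot silently alter behavior.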

Pros and Cons

Pros

  • Anthropic acknowledged the issue and committed to restoring high effort for enterprise accounts
  • The effort parameter is user-configurable, providing a workaround for affected professionals
  • The controversy has prompted broader industry discussion about AI model versioning transparency
  • Anthropic's $30 billion annualized revenue demonstrates strong underlying product-market fit

Cons

  • The default effort reduction was implemented without adequate user notification
  • Performance regression was severe enough to break established professional workflows
  • Trust has been damaged among power users who are Anthropic's most vocal advocates
  • Speculation about compute capacity constraints raises questions about scaling reliability
  • The response appeared defensive rather than proactively transparent

Outlook

This controversy is a pivotal moment for Anthropic. The company is approaching a potential IPO with a $380 billion valuation, and enterprise trust is foundational to that valuation. If Anthropic successfully restores high-effort defaults for professional users and implements clearer communication protocols for model behavior changes, the incident may be remembered as a growing pain rather than a turning point.

However, the underlying tension is structural. As AI companies scale to millions of users, the compute cost per request becomes a critical business metric. Every company in the space, including OpenAI and Google, faces the same pressure to optimize inference costs. Anthropic is simply the first to face public backlash for how it handled the trade-off.

The broader industry lesson is clear: changes to model behavior that affect output quality must be communicated prominently, not buried in changelogs. Users building professional workflows on AI models need the same versioning guarantees that they expect from any other production dependency.

Conclusion

The Claude performance controversy is not just a customer service issue — it is a case study in the tension between AI business economics and user trust. Anthropic's decision to reduce default effort levels was economically rational but executed with insufficient transparency. For professional developers and enterprise teams evaluating Claude, the key takeaway is to explicitly configure effort levels and pin model versions rather than relying on defaults. For the industry, this episode underscores that as AI models become critical infrastructure, providers must treat behavioral changes with the same rigor as breaking API changes.
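The "treat model configuration as a production dependency" takeaway can be illustrated with a minimal sketch: a frozen, version-controlled config object that is validated at startup. The class and field names here are hypothetical, not an Anthropic API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    """Pinned model configuration, checked into version control."""
    model: str    # exact model snapshot, never a floating alias
    effort: str   # explicit effort level, never the server default

    def validate(self) -> None:
        # Reject floating aliases whose behavior can change underneath
        # a production workflow, and unknown effort values.
        if "latest" in self.model:
            raise ValueError("pin an exact model snapshot, not an alias")
        if self.effort not in ("low", "medium", "high"):
            raise ValueError(f"unknown effort level: {self.effort}")

PINNED = ModelConfig(model="claude-sonnet-4-5", effort="high")
PINNED.validate()  # fail fast at deploy time rather than drift silently
```

Validating at startup turns a silent behavioral drift into an immediate, visible failure, which is the same guarantee developers expect from pinned package versions.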

Key Features

  1. Default effort level reduced from high to medium — controlling how many tokens Claude processes per response
  2. Documented regression in Claude Code: shift from 'research-first' to 'edit-first' behavior pattern
  3. 6,852-session analysis by AMD's Stella Laurenzo quantifying the performance decline
  4. Enterprise and Teams accounts to be restored to high-effort defaults
  5. User-configurable effort parameter available as workaround via API and Claude Code settings

Key Insights

  • The effort level reduction reveals the structural tension between scaling AI businesses and maintaining output quality — a challenge every AI provider faces
  • A 6,852-session empirical analysis from AMD's AI group provides some of the most rigorous public documentation of LLM regression to date
  • Anthropic's changelog-only notification for a behavior-altering change sets a negative precedent for AI model versioning transparency
  • The distinction between casual users (who want faster responses) and professional users (who need thorough analysis) demands tiered default configurations
  • Enterprise trust damage from silent behavior changes may be more costly than the compute savings achieved
  • Speculation about compute capacity constraints raises questions about whether rapid revenue growth ($9B to $30B) has outpaced infrastructure scaling
  • The workaround of explicitly setting effort to 'high' highlights the need for professional users to treat AI model configurations as production dependencies
