February 25, 2025

Anthropic's Claude 3.7 Sonnet Is Finally Here

Anthropic just launched Claude 3.7 Sonnet, its most intelligent AI model to date and the first hybrid reasoning model on the market.

by Jim Clyde Monge

Anthropic just launched Claude 3.7 Sonnet, its most intelligent AI model to date and the first hybrid reasoning model on the market. The hybrid part means a single model works both as a standard large language model (LLM) that answers quickly and as a reasoning model that thinks through a problem step by step.

While OpenAI recently announced that GPT-5 will be a unified model, Anthropic has already introduced Claude 3.7 Sonnet, which is capable of both quick responses and deeper reasoning, getting ahead in this particular approach to AI development.

This new model can “think” about a question for as long as the user allows, so its responses can differ considerably depending on how much thinking it does.

Claude 3.7 Sonnet can also build complex apps with a single prompt, and with the introduction of a new product called Claude Code, developers can now give substantial engineering tasks to Claude directly from their terminal.

Key Features of Claude 3.7 Sonnet

Claude 3.7 Sonnet brings several important features that set it apart from previous models and other AI systems on the market:

1. Extended Thinking Mode

Perhaps the most notable feature is the extended thinking capability. Unlike most AI models that give instant responses, Claude 3.7 Sonnet can take time to “think” before answering questions. This thinking process is visible to users, making the AI’s reasoning more transparent.

When using the API, users can control exactly how much thinking the model does. You can tell Claude to think for a specific number of tokens, up to its output limit of 128K tokens. This allows you to balance speed and cost against the quality of answers.

Here’s an example of TypeScript code for extended thinking:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-3-7-sonnet-20250219",
  max_tokens: 20000,
  thinking: {
    type: "enabled",
    budget_tokens: 16000
  },
  messages: [{
    role: "user",
    content: "Are there an infinite number of prime numbers such that n mod 4 == 3?"
  }]
});

// Print both thinking process and final response
console.log(response);

The API response will include both thinking and text content blocks:

{
    "content": [
        {
            "type": "thinking",
            "thinking": "To approach this, let's think about what we know about prime numbers...",
            "signature": "zbbJhbGciOiJFU8zI1NiIsImtakcjsu38219c0.eyJoYXNoIjoiYWJjMTIzIiwiaWFxxxjoxNjE0NTM0NTY3fQ...."
        },
        {
            "type": "text",
            "text": "Yes, there are infinitely many prime numbers such that..."
        }
    ]
}
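Since the response mixes thinking and text blocks, you will usually want to separate the reasoning from the final answer. Here is a minimal sketch of one way to do that. The local types and the splitResponse helper are my own illustrative names, not part of the SDK, and they only cover the two block shapes shown in the response above.

```typescript
// Minimal local types mirroring the two block shapes in the sample response.
// Illustrative only: the official SDK ships its own, richer types.
type ThinkingBlock = { type: "thinking"; thinking: string; signature: string };
type TextBlock = { type: "text"; text: string };
type ContentBlock = ThinkingBlock | TextBlock;

// Hypothetical helper: split a response's content into reasoning and answer.
function splitResponse(content: ContentBlock[]): { thinking: string; answer: string } {
  const thinking = content
    .filter((block): block is ThinkingBlock => block.type === "thinking")
    .map((block) => block.thinking)
    .join("\n");
  const answer = content
    .filter((block): block is TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("\n");
  return { thinking, answer };
}

// Sample data copied from the response above (signature shortened).
const sampleContent: ContentBlock[] = [
  {
    type: "thinking",
    thinking: "To approach this, let's think about what we know about prime numbers...",
    signature: "...",
  },
  {
    type: "text",
    text: "Yes, there are infinitely many prime numbers such that...",
  },
];

const parts = splitResponse(sampleContent);
console.log("Thinking:", parts.thinking);
console.log("Answer:", parts.answer);
```

In a real app you would pass response.content from the API call instead of the hard-coded sample, and show only the answer to end users unless they ask to see the reasoning.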

For people who need higher accuracy, especially on complex topics like math, physics, or coding, the extended thinking mode makes a big difference. The model can work through problems step by step, similar to how humans think, which leads to more reliable answers.

2. Larger Output Capacity

Claude 3.7 Sonnet supports up to 128K output tokens (in beta), which is over 15 times longer than before. This is very useful for:

  • Complex code generation
  • Detailed planning documents
  • Long-form writing
  • Handling large data analysis tasks

This expanded capacity means the model can handle much more complex tasks without running into token limits.

3. Improved Coding Abilities

As a developer, this is what I am most excited about. The model shows major improvements in coding across many areas:

  • Planning and solving complex coding tasks
  • Handling full-stack updates
  • Working with complex codebases
  • Building sophisticated web apps and dashboards from scratch
  • Producing production-ready code with fewer errors

Several tech companies like Cursor, Cognition, Vercel, and Replit have already tested Claude 3.7 Sonnet and found it performs better than other models for real-world coding tasks.

4. Reduced Unnecessary Refusals

According to Anthropic, Claude 3.7 Sonnet makes more careful distinctions between harmful and harmless requests, reducing unnecessary refusals by 45% compared to its predecessor. This makes the AI more helpful because it blocks fewer reasonable requests.

This is huge because one of the reasons that I have used Claude less and less in the past couple of months is the high frequency of refusals. It was an annoying feature to be honest.

5. Claude Code

Claude Code is a brand new command line tool for what Anthropic calls “agentic coding.” Currently available as a limited research preview, it allows developers to give substantial engineering tasks to Claude directly from their terminal.

The tool acts as a coding partner that can:

  • Search and read code
  • Edit files
  • Write and run tests
  • Commit and push code to GitHub
  • Use command-line tools
  • Keep you informed at each step

In early testing, Anthropic found that Claude Code could complete tasks in a single pass that would normally take 45+ minutes of manual work, cutting down development time.

Developers interested in trying Claude Code need to join the research preview program.

Claude 3.7 Sonnet Performance

The performance of Claude 3.7 Sonnet shows significant improvements over previous models in several key areas:

Coding Performance

Claude 3.7 Sonnet has shown impressive results on coding benchmarks and real-world tests. It achieves state-of-the-art performance on SWE-bench Verified, which evaluates AI models’ ability to solve real-world software issues.

Image from Anthropic

Reasoning Performance

Anthropic also shared how Claude 3.7 Sonnet achieves state-of-the-art performance on TAU-bench, a framework that tests AI agents on complex real-world tasks with user and tool interactions.

Image from Anthropic

The company says their goal with Claude Code is to better understand how developers use Claude for coding, which will help them make future model improvements.

Reasoning Performance

The extended thinking mode makes Claude 3.7 Sonnet much better at tasks that need careful reasoning:

  • Math and science problems show notable improvement
  • Complex planning tasks benefit from the step-by-step thinking process
  • Instruction-following becomes more precise
  • The model makes fewer errors on tasks that need several steps of reasoning
Image from Anthropic

This reasoning capability puts Claude 3.7 Sonnet in a new category of AI models that can think more deeply about problems rather than just generating text based on patterns.

How to Access Claude 3.7 Sonnet

Claude 3.7 Sonnet is now available both on the Claude website and through the API. To use it in a chat interface, try any of the following channels:

  • Web browser interface
  • iOS app
  • Android app

Simply switch to Claude 3.7 Sonnet from the model dropdown.

Image by Jim Clyde Monge

All Claude plans can access the model, including Free, Pro, Team, and Enterprise. However, the extended thinking mode is only available for paid plans (Pro, Team, and Enterprise).

Developers can also access Claude 3.7 Sonnet through:

  • Anthropic API
  • Amazon Bedrock
  • Google Cloud’s Vertex AI

When using the API, developers have full control over the model’s thinking budget, allowing them to specify how many tokens the model can use for thinking.

Here’s a sample API call using TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: 'my_api_key', // defaults to process.env["ANTHROPIC_API_KEY"]
});

const msg = await anthropic.messages.create({
  model: "claude-3-7-sonnet-20250219", // same dated model ID as the extended thinking example
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello, Claude" }],
});

console.log(msg);
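If you let users or config files set the thinking budget, it helps to validate it before sending the request. The sketch below builds the request parameters for an extended thinking call and rejects budgets that cannot work. The buildThinkingRequest helper and its types are hypothetical names of mine, and the two rules it checks (a minimum budget of 1,024 tokens, and a budget smaller than max_tokens) reflect my understanding of Anthropic's extended thinking documentation, so double-check them against the current docs.

```typescript
// Illustrative local types for the request body; the SDK has its own.
type ThinkingConfig = { type: "enabled"; budget_tokens: number };
type ThinkingRequest = {
  model: string;
  max_tokens: number;
  thinking: ThinkingConfig;
  messages: { role: "user" | "assistant"; content: string }[];
};

// Assumption: minimum thinking budget, as I understand Anthropic's docs.
const MIN_THINKING_BUDGET = 1024;

// Hypothetical helper: build request params and validate the thinking budget.
function buildThinkingRequest(
  prompt: string,
  maxTokens: number,
  budgetTokens: number
): ThinkingRequest {
  if (Math.floor(budgetTokens) !== budgetTokens || budgetTokens < MIN_THINKING_BUDGET) {
    throw new RangeError(
      "budget_tokens must be an integer of at least " + MIN_THINKING_BUDGET
    );
  }
  // The budget has to leave room for the final answer within max_tokens.
  if (budgetTokens >= maxTokens) {
    throw new RangeError("budget_tokens must be smaller than max_tokens");
  }
  return {
    model: "claude-3-7-sonnet-20250219",
    max_tokens: maxTokens,
    thinking: { type: "enabled", budget_tokens: budgetTokens },
    messages: [{ role: "user", content: prompt }],
  };
}

// Same numbers as the extended thinking example earlier in the article.
const params = buildThinkingRequest(
  "Are there an infinite number of prime numbers such that n mod 4 == 3?",
  20000,
  16000
);
console.log(JSON.stringify(params, null, 2));
```

The returned object has the same shape as the argument passed to messages.create in the examples above, so you can hand it straight to the client.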

Claude 3.7 Sonnet Pricing

As I mentioned, Claude 3.7 Sonnet is included in the free tier account on claude.ai but without the extended thinking mode. You can also choose to upgrade your account to either the Pro ($20 per month) or Team ($30 per month).

  • Pro tier: Full access, including extended thinking mode
  • Team and Enterprise plans: Full access with additional features for organizations
Image from Anthropic website

Claude 3.7 Sonnet keeps the same pricing as previous models:

  • $3 per million input tokens
  • $15 per million output tokens

This pricing includes thinking tokens when using the extended thinking mode. For API users, there are options for cost savings:

  • Up to 90% cost savings with prompt caching
  • 50% cost savings with batch processing

Why is this such a big deal?

For me as a developer, a more powerful AI model means I can be more confident that it will understand my project’s codebase and generate more secure, more complete code.

The ability to understand context across an entire codebase is particularly valuable. Previous models often lost track of how different parts of a project fit together, but Claude 3.7 Sonnet seems to maintain a more cohesive understanding of large projects.

For researchers, the deep thinking capability of this model means there’s less chance of hallucinations, and it actually generates more meaningful and factual answers. The visible reasoning process also helps researchers understand how the model reached its conclusions, which is important for trust and verification.

For casual users, the responses from this new model are more reliable and less robotic. The larger output capacity and improved reasoning lead to conversations that feel more natural and helpful.

For AI developers, claude-3.7-sonnet and claude-3.7-sonnet-thinking are now supported in Cursor!

I haven’t done any extensive testing yet, but based on the user feedback on X, they perform exceptionally well at coding. Mckay Wrigley even calls it the best model in the world for code in his X post.

Here’s how you can switch to the new models in Cursor.

Image by Jim Clyde Monge

I plan to do some testing and build sample apps to see how well the new Claude models handle app generation on Cursor. This will give me a better sense of their real-world capabilities beyond the benchmarks.

Final Thoughts

I was surprised to see Anthropic drop Claude 3.7 Sonnet out of the blue (or maybe I wasn’t paying enough attention to the leaks). I was actually expecting Claude 3.5 Opus to be released first, but it seems that they have already scrapped that model.

Now, it’s clear that big tech companies are racing to release the best AI model with reasoning capabilities. It’s only been weeks since DeepSeek released R1, then xAI launched Grok 3 with reasoning capabilities, and now we have Claude 3.7 Sonnet. It’s honestly a bit overwhelming, and I don’t even know if the benchmarks from these tech companies are actually reliable.

What I am most excited about is the integration of Claude 3.7 Sonnet in coding tools like Cursor. I can’t wait to test it out by building more complex applications, and to learn more about Claude Code, which is also very interesting.

For developers especially, the improvements in coding abilities and the introduction of Claude Code could change how we work. Having an AI that can understand large codebases and handle substantial engineering tasks could free us to focus on more creative aspects of development.

While I’m cautious about some of the claims being made, Claude 3.7 Sonnet does point to a future where AI works alongside humans as a true thinking partner rather than just a fancy autocomplete tool. I’ll be testing it extensively to see if it lives up to the hype.
