June 21, 2024

Claude 3.5 Sonnet Outperforms GPT-4o

Anthropic has released a new, free-to-access AI model: Claude 3.5 Sonnet.

by

Jim Clyde Monge

When OpenAI released the GPT-4o model and made it completely free for everyone, a big chunk of Claude and Gemini users, including myself, switched to ChatGPT. The $20 per month subscription for Claude Opus and Gemini Pro no longer seemed worth it, because GPT-4o was outperforming these models on various tasks.

But today, Anthropic is trying to win back its users with Claude 3.5 Sonnet, which they claim is better than GPT-4o. What’s even more exciting? It’s completely free for everyone!

What is Claude 3.5 Sonnet?

Claude 3.5 Sonnet is the newest version of Anthropic’s family of AI models. It is designed to understand and create text better than before. Building on the previous versions, Claude 3.5 Sonnet is set to be more accurate, faster, and more reliable.

Anthropic wants it to compete directly with OpenAI’s GPT-4o and hopes users will come back for its new capabilities. This new model also outperforms Anthropic’s own Claude 3 Opus model on a wide range of evaluations, while running at the speed and cost of the mid-tier Claude 3 Sonnet.

Claude 3.5 Sonnet is the first release in the forthcoming Claude 3.5 model family, which suggests Anthropic is cooking up something even bigger with the next iteration of the Opus model.

If you aren’t familiar with the Claude models yet, here’s a full list.

Anthropic model name list — Image from Anthropic

As you can see, Claude 3.5 Opus and Claude 3.5 Haiku are still labeled as “Coming soon…”

What’s New in Claude 3.5 Sonnet?

The new model is packed with several improvements:

It’s faster and more intelligent
Improved performance
Advanced vision capabilities
A new feature called “Artifacts”

Enhanced Intelligence and Speed

Claude 3.5 Sonnet sets new benchmarks in several areas:

Graduate-Level Reasoning (GPQA)
Undergraduate-Level Knowledge (MMLU)
Coding Proficiency (HumanEval)

The model runs at twice the speed of Claude 3 Opus, making it well suited to complex tasks like customer support and managing multi-step workflows.

Improved Performance

Claude 3.5 Sonnet shows marked improvement in understanding nuance, humor, and complex instructions.

It excels at writing high-quality content that sounds natural and relatable. In an internal coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus, which solved only 38%.

When instructed and given the relevant tools, it can write, edit, and execute code independently. It also handles code translations easily, which makes it effective for updating legacy applications and migrating codebases.

In my experience, Claude Opus was already great at solving programming tasks compared to GPT-4 and Gemini Pro. Claude 3.5 Sonnet is expected to perform even better!

Advanced Vision Capabilities

Claude 3.5 Sonnet is also Anthropic’s best vision model yet. It surpasses Claude 3 Opus in tasks requiring visual reasoning, like interpreting charts and graphs, and can accurately transcribe text from imperfect images.

Take a look at these examples where Claude 3.5 Sonnet generates interactive charts and even creates presentations from JSON input.

This makes it very useful in retail, logistics, and financial services, where AI can gain more insights from images and graphics than text alone.

I can’t wait to try this new feature and see how it compares to ChatGPT’s interactive charts and tables.

Claude Artifacts

A standout feature of what was announced today is Artifacts.

When a user asks Claude to generate content like code snippets, text documents, or website designs, these Artifacts appear in a dedicated window next to their conversation.

This creates a dynamic workspace where users can see, edit, and build on Claude’s creations in real time.

Here’s an example:

Prompt: Write a simple Python code for a tic tac toe game

Anthropic Claude 3.5 Sonnet example — Image by Jim Clyde Monge

def print_board(board):
    for row in board:
        print(" | ".join(row))
        print("---------")


def check_winner(board, player):
    # Check rows, columns, and diagonals
    for i in range(3):
        if all(board[i][j] == player for j in range(3)) or \
           all(board[j][i] == player for j in range(3)):
            return True
    if all(board[i][i] == player for i in range(3)) or \
       all(board[i][2-i] == player for i in range(3)):
        return True
    return False


def is_full(board):
    return all(cell != " " for row in board for cell in row)


def play_game():
    board = [[" " for _ in range(3)] for _ in range(3)]
    current_player = "X"
    while True:
        print_board(board)
        row = int(input(f"Player {current_player}, enter row (0-2): "))
        col = int(input(f"Player {current_player}, enter column (0-2): "))
        if board[row][col] == " ":
            board[row][col] = current_player
            if check_winner(board, current_player):
                print_board(board)
                print(f"Player {current_player} wins!")
                break
            elif is_full(board):
                print_board(board)
                print("It's a tie!")
                break
            current_player = "O" if current_player == "X" else "X"
        else:
            print("That cell is already occupied. Try again.")


if __name__ == "__main__":
    play_game()

As you can see, a new panel on the right side opens up to show the full code. This is much better in terms of UX than the previous in-line code rendering.

But here’s what is even more mind-blowing: I asked Claude 3.5 Sonnet to render the game on the Artifacts canvas and play it with me. This is crazy!

Anthropic Claude 3.5 Sonnet example game — Image by Jim Clyde Monge

You can turn on this feature from the Feature Preview modal window.

Anthropic Claude 3.5 Sonnet Artifact — Image by Jim Clyde Monge

Future updates will support team collaboration, allowing entire organizations to centralize their knowledge, documents, and ongoing work in one shared space, with Claude as an on-demand teammate.

How to Access Claude 3.5 Sonnet?

Claude 3.5 Sonnet is now available for free on Claude.ai and the Claude iOS app. Those subscribed to Claude Pro and Team plans can access the model with much higher rate limits.

For devs, 3.5 Sonnet is also available via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. The cost is $3 per million input tokens and $15 per million output tokens, with a 200K token context window.
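To get a feel for what those prices mean per request, here is a minimal Python sketch that does the arithmetic. The prices and the 200K window are the figures quoted above; the function names (estimate_cost, fits_context) and the sample token counts are made up for illustration, and the sketch assumes that prompt and response tokens both count toward the context window.

```python
# Figures quoted in the article (USD per million tokens, tokens per window).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00
CONTEXT_WINDOW_TOKENS = 200_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: prompt and response together must fit in the window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 1,000 response tokens.
    cost = estimate_cost(10_000, 1_000)
    print(f"Estimated cost: ${cost:.4f}")
    print(f"Fits in context: {fits_context(10_000, 1_000)}")
```

At these rates, output tokens cost five times as much as input tokens, so long responses dominate the bill far more than long prompts do.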

Final Thoughts

Overall, this is huge news for free-tier users. A significant chunk of Claude users on the paid tier had switched to ChatGPT when OpenAI made GPT-4o and GPTs free for everyone. Even I canceled my Claude subscription to use GPT-4o instead.

Making Claude 3.5 Sonnet free to access is a strategic move by Anthropic to bring back its user base. At what cost? We don’t know yet. What’s important is that these big tech companies are making their AI models free for more people. They are likely working on bigger models that users will pay for.

Additionally, the release of Artifacts is a huge plus for Claude fans. It’s been a while since Anthropic released a new feature that stands out from the competition. I will need to test it to see how useful it is and if it’s worth using in daily work.

That’s about it. Please let me know what you think about Claude 3.5 Sonnet in the comments.

