GPT-4o mini is a much more affordable and intelligent small model that's just as fast as GPT-3.5 Turbo.
OpenAI has unveiled a new addition to its AI model lineup: GPT-4o mini. The new model is more affordable and more intelligent than GPT-3.5 Turbo while being just as fast. Like GPT-4o, GPT-4o mini has a 128k context window and a knowledge cut-off of October 2023.
GPT-4o mini is designed to be more cost-efficient and accessible than its larger counterparts.
Here are some details about the new model:
Fun fact: The cost per token of GPT-4o mini has dropped by 99% since text-davinci-003, a less capable model introduced in 2022.
GPT-4o mini is a small model that excels in textual intelligence and multimodal reasoning, outperforming GPT-3.5 Turbo and other small models across various academic benchmarks.
It supports the same range of languages as GPT-4o and shows strong performance in function calling and long-context tasks.
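Function calling works by passing a `tools` array alongside the messages; the model returns a structured call that your code then executes. Here is a minimal sketch of a tool definition for the Chat Completions API — the `get_weather` function and its schema are hypothetical examples I'm adding for illustration, not something from the announcement:

```python
# Hypothetical tool schema for the OpenAI Chat Completions API.
# The function name and parameters are illustrative only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# With a configured client (requires an API key), you would pass it roughly like so:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Weather in Paris?"}],
#     tools=tools,
# )
```

The point is that the tool schema is model-agnostic: the same definition you already use with GPT-3.5 Turbo or GPT-4o should carry over unchanged.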
OpenAI has published key evaluation scores for GPT-4o mini in its announcement.
OpenAI has made GPT-4o mini very affordable:
Here’s a more detailed breakdown of the pricing for each model variation.
This pricing is over 60% cheaper than GPT-3.5 Turbo and an order of magnitude more affordable than previous models.
If you are a developer or a startup using small language models to power your apps, it’s a good idea to switch to GPT-4o mini now to reduce spending on API costs.
Compare that with GPT-3.5 Turbo, which offers only a 16K context window at a much higher price.
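To put the savings in numbers, here is a rough cost comparison sketch. It assumes GPT-4o mini's 15/60 cents per million input/output tokens (the figures LMSYS quotes later in this post) and GPT-3.5 Turbo's published $0.50/$1.50 rates; the workload sizes are made up for illustration:

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD, with prices given per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
mini = api_cost_usd(10_000_000, 2_000_000, 0.15, 0.60)   # GPT-4o mini
turbo = api_cost_usd(10_000_000, 2_000_000, 0.50, 1.50)  # GPT-3.5 Turbo

print(f"GPT-4o mini:   ${mini:.2f}")   # $2.70
print(f"GPT-3.5 Turbo: ${turbo:.2f}")  # $8.00
```

On this workload the bill drops by roughly two thirds, which lines up with the "over 60% cheaper" claim above.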
GPT-4o mini also supports image inputs (vision). If you are interested in the vision pricing, check out the calculator here. A 1080 x 1080 pixel image would cost $0.003825.
I haven’t tried the vision modality via API yet. According to OpenAI, support for image, video and audio inputs and outputs is “coming in the future.”
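The $0.003825 figure can be reproduced from OpenAI's tiling scheme for high-detail images: the image is downscaled to fit within 2048 x 2048, its shortest side is then scaled down to 768, and the result is split into 512-pixel tiles. The base and per-tile token counts below (2833 and 5667) are the values community pricing calculators report for GPT-4o mini; treat them as an assumption rather than official documentation:

```python
import math

# Assumed tiling parameters for GPT-4o mini vision pricing; the
# base/per-tile token counts are from community calculators, not
# from the announcement itself.
def image_tokens(width: int, height: int,
                 base: int = 2833, per_tile: int = 5667) -> int:
    # Step 1: downscale so the image fits within 2048 x 2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: downscale so the shortest side is at most 768.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count 512 x 512 tiles.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return base + per_tile * tiles

tokens = image_tokens(1080, 1080)         # 25501 tokens (4 tiles)
cost = tokens * 0.15 / 1_000_000          # input price: $0.15 per 1M tokens
print(f"{tokens} tokens -> ${cost:.6f}")  # ~$0.003825
```

A 1080 x 1080 image ends up as a 768 x 768 image covered by four tiles, and the token count times the input price lands exactly on the quoted $0.003825.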
Starting today, ChatGPT Free, Plus, and Team users can access GPT-4o mini in place of GPT-3.5. According to OpenAI, Enterprise users will gain access next week, in line with its mission to make the benefits of AI accessible to all.
If you are interested in comparing it with other models, GPT-4o mini is now available on the LMSYS Chatbot Arena at https://arena.lmsys.org/.
The model is more cost-effective and faster, has a 128k context window, and its responses are significantly better. LMSYS has also collected public votes, and GPT-4o mini scored on par with GPT-4-Turbo while being cheaper. As LMSYS put it: "With over 6K user votes, we are excited to share its early score reaching GPT-4-Turbo performance, while offering significant cost reduction (15/60 cent per million input/output tokens)."
For developers, GPT-4o mini is now supported in the Code Assistant extension for VS Code.
Microsoft also announced that GPT-4o mini is now supported in Microsoft Azure AI.
I really like the pricing, multi-modal nature, and the response times that I’m seeing in the benchmarks. That sounds like a big jump compared to GPT-3.5 Turbo. More context, more possibilities at a much lower price.
I can’t wait to do evaluations myself and see if it’s worth switching my own products from GPT-3.5 Turbo to GPT-4o mini.
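Since both models sit behind the same Chat Completions API, switching is essentially a one-line change, which makes a quick A/B evaluation cheap to set up. A minimal sketch — the `build_request` helper and the placeholder prompts are my own illustration, and the actual API call (commented out) would need an OpenAI API key:

```python
def build_request(model: str, prompt: str) -> dict:
    """Identical request for either model -- only the model name differs."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # more deterministic outputs for a fairer comparison
    }

# Hypothetical evaluation prompts; replace with tasks from your own product.
prompts = ["Summarize this support ticket.", "Extract the dates from this email."]

requests = [
    build_request(model, p)
    for model in ("gpt-3.5-turbo", "gpt-4o-mini")
    for p in prompts
]

# With a configured client (needs OPENAI_API_KEY), each request would run as:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(**requests[0])
```

Collecting the replies side by side and scoring them against your own expected outputs is usually enough to decide whether the switch is safe.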
However, it’s worth noting that while the API may be cheap, it doesn’t beat the open-source nature of models like Llama 3. Open-source models allow users to fine-tune them with their own data and run them on their own hardware, which can be a big advantage for some users.
Software engineer, writer, solopreneur