Meta AI's latest and biggest open-source language model, Llama 3.1, is here.
It’s only been hours since Meta dropped Llama 3.1, which beats leading models like GPT-4o, Gemma 2, and Claude 3.5 Sonnet on selected benchmarks.
The Llama 3.1 family includes multilingual models supporting French, German, Hindi, Italian, Portuguese, Spanish, and Thai, with parameter sizes of 8 billion, 70 billion, and a whopping 405 billion. The 405B model, trained using over 16,000 Nvidia H100 GPUs, boasts a context window of up to 128K tokens.
However, as I’ve mentioned before, benchmarks don’t always reflect real-world performance. So, let me show you six free ways to experience Llama 3.1 for yourself.
Let’s get started.
Groq, known for its specialized hardware and software designed to accelerate AI inference workloads, now hosts Llama 3.1 in the Groq Playground.
You may notice that the 405B parameter model is not currently available in the playground; you can try it on Groq Chat.
One thing I really love about Groq is its speed. Their LPU (Language Processing Unit) achieves industry-leading inference speeds, such as around 250 tokens per second on the 70B parameter model and over 1,200 tokens per second on the 8B model.
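Beyond the playground, Groq also exposes an OpenAI-compatible chat completions API. Here’s a minimal sketch in JavaScript; the model ID llama-3.1-70b-versatile is an assumption worth verifying against Groq’s current model list:

const response = await fetch("https://api.groq.com/openai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer <GROQ_API_KEY>" // your Groq API key
  },
  body: JSON.stringify({
    model: "llama-3.1-70b-versatile", // assumed model ID; check Groq's docs
    messages: [{ role: "user", content: "In one sentence, what is new in Llama 3.1?" }]
  })
});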
HuggingChat is an open-source AI chatbot developed by Hugging Face, a popular platform where users can host generative AI models. To get started, visit Hugging Chat and create a free account.
Under the Settings page, activate the meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 model.
Once you close the modal window, you can start interacting with the model.
HuggingChat also offers additional tools that enhance its capabilities, such as web search and PDF support. For example, I enabled the image generation tool and tested it with the following prompt:
Prompt: Generate an image of a dog
Note that the image generation is a demonstration of the tool-calling capability: Llama 3.1 calls external tools that are hooked into HuggingChat to generate the image.
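Under the hood, this follows the familiar tool-calling pattern: the host app declares a tool schema, the model responds with a structured call, and the host executes it and feeds the result back. The sketch below is purely illustrative; the generate_image tool and its schema are made up, not HuggingChat’s actual internals:

// Hypothetical tool the host app registers with the model (illustrative only)
const tools = [{
  type: "function",
  function: {
    name: "generate_image",
    description: "Generate an image from a text prompt",
    parameters: {
      type: "object",
      properties: { prompt: { type: "string" } },
      required: ["prompt"]
    }
  }
}];

// Shape of a tool call the model might emit; the host runs the tool and returns the result
const exampleToolCall = {
  role: "assistant",
  tool_calls: [{
    type: "function",
    function: { name: "generate_image", arguments: "{\"prompt\": \"a dog\"}" }
  }]
};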
Fireworks AI is a platform for building and deploying generative AI APIs. They have a dedicated page where you can try out language models like Llama 3.1 for free.
You can adjust the parameter settings in the Options section and make an API call with your configuration.
Here’s a sample API call from the example above:
await fetch("https://api.fireworks.ai/inference/v1/chat/completions", {
method: "POST",
headers: {
"Accept": "application/json",
"Content-Type": "application/json",
"Authorization": "Bearer <API_KEY>"
},
body: JSON.stringify({
model: "accounts/fireworks/models/llama-v3p1-405b-instruct",
max_tokens: 16384,
top_p: 1,
top_k: 40,
presence_penalty: 0,
frequency_penalty: 0,
temperature: 0.6,
messages: []
})
});
Take note that you will need an API key for this to work. Check out their documentation for more details.
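To actually read the model’s reply, capture the response and pull the text out of the OpenAI-style choices array. A minimal sketch, reusing the same endpoint and assuming a valid API key:

const res = await fetch("https://api.fireworks.ai/inference/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API_KEY>"
  },
  body: JSON.stringify({
    model: "accounts/fireworks/models/llama-v3p1-405b-instruct",
    messages: [{ role: "user", content: "Say hello in one short sentence." }]
  })
});
const data = await res.json();
// The reply text lives in the first choice's message
console.log(data.choices[0].message.content);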
Unlike most of the other platforms on this list, Fireworks AI supports all three Llama 3.1 model sizes.
Cloudflare operates one of the largest networks on the Internet, and its services are widely used to improve the security and performance of websites and applications.
Recently, they launched an AI playground that allows users to explore different text-generation models. Head over to Cloudflare Playground, select the model, and start chatting away.
It’s completely free and does not even require you to create an account on the platform. However, it’s unclear how many messages you can send per day.
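If you’d rather hit the model from code, Cloudflare also serves Llama 3.1 through its Workers AI REST API. A minimal sketch, assuming the 8B model ID @cf/meta/llama-3.1-8b-instruct plus your own account ID and API token (verify the details in Cloudflare’s docs):

await fetch(
  "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/run/@cf/meta/llama-3.1-8b-instruct",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer <CLOUDFLARE_API_TOKEN>",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      messages: [{ role: "user", content: "Summarize what Llama 3.1 is in one sentence." }]
    })
  }
);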
Ollama is a tool that makes it easy to set up and run large language models (LLMs) locally on your own machine.
Download and install Ollama from here. To confirm that the installation was successful, run ollama -v in a terminal; you should see the version number.
To start using Llama 3.1, run the following command to download the 8B model and begin inference:
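ollama run llama3.1

The plain llama3.1 tag pulls the 8B model by default; the larger sizes are published under separate tags (llama3.1:70b and llama3.1:405b at the time of writing).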
Running the 405 billion parameter model would require some seriously powerful hardware, so I would suggest trying out the 8 billion model first.
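Once the model is downloaded, Ollama also exposes a local REST API on port 11434, so you can call it from code as well. A minimal sketch, assuming the 8B model from the step above:

const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1",
    messages: [{ role: "user", content: "Why is the sky blue?" }],
    stream: false // return a single JSON response instead of a stream
  })
});
const data = await res.json();
console.log(data.message.content);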
That’s about it. You can watch the YouTube video here:
Poe is hands-down one of the best ways to try the new Llama models for free. Just go to https://poe.com and create an account.
In the official bots section, look for the Llama-3.1-405B-T bot and open it to start chatting with Llama 3.1. The bot is hosted by Together.ai.
Note that Poe gives you 3,000 free points per day. Each message to this bot costs 485 points, which works out to 6 free messages per day.
I intentionally did not include Meta AI on this list because it’s the obvious option, and my aim for this story is to give my readers other ways to access Llama 3.1 for free.
I hope you find this resource helpful, and if you know other ways to try Llama 3.1 for free, let me know in the comments. Cheers!