AI news
July 13, 2024

Try Google's Gemma 2 9B and 27B For FREE

Here are five websites where you can try Google's most powerful lightweight models, Gemma 2 9B and 27B.

by Jim Clyde Monge

This article was originally posted on generativeai.pub

It’s only been a few weeks since Google released its most powerful lightweight LLM, Gemma 2. I have been experimenting with it on different platforms and thought about sharing five ways you can try Gemma 2 for free.

Before we get into the list, here’s a brief description of Gemma 2 for those who are not yet familiar with it.

What is Gemma 2?

Gemma 2 is the latest open-weights language model released by Google for researchers and developers worldwide. It comes in two variants: 9 billion and 27 billion parameters.

The 9B model was trained on approximately 8 trillion tokens, while the 27B version was trained on about 13 trillion tokens of web data, code, and math.

These lightweight models are designed to run efficiently on various hardware, including Nvidia GPUs and Google’s TPUs, making them suitable for both cloud and on-device applications.

Okay, now let’s get into the list.

1. Vertex AI Studio

Google Vertex AI Studio is a powerful tool for prototyping and customizing generative AI models within the Google Cloud ecosystem.

It provides access to Google’s state-of-the-art models, offers multimodal capabilities, and integrates seamlessly with other Google Cloud services for end-to-end machine learning workflows.

Try Gemma 2 in Vertex AI Studio
Image by Jim Clyde Monge

Pricing for generative AI services in Vertex AI varies depending on the specific foundation models and APIs used.

Image by Jim Clyde Monge

New customers can get up to $300 in free credits to try Vertex AI and other Google Cloud products.

2. Ollama

Ollama is an open-source project designed to simplify the process of working with large language models (LLMs). It provides a user-friendly platform for running, customizing, and managing various LLMs, including popular models like Llama 3, Phi 3, Mistral, and Gemma 2.

Image by Jim Clyde Monge

Download Ollama here and run the model via the command:

ollama run gemma2
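Once the model is running, Ollama also exposes a local REST API (by default on port 11434) that you can call from your own scripts. Here's a minimal sketch using only the Python standard library; the request fields (`model`, `prompt`, `stream`) follow Ollama's `/api/generate` endpoint, and the call is wrapped in a try/except so it degrades gracefully if the server isn't running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(prompt: str, model: str = "gemma2") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON response instead of streamed chunks
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_generate_request("Why is the sky blue?")

# Only reachable if a local Ollama server is running (`ollama serve`)
try:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        print(json.loads(resp.read())["response"])
except OSError:
    print("Ollama server not reachable; start it with `ollama serve`.")
```

This is handy when you want to script Gemma 2 locally instead of chatting in the terminal.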

Image by Jim Clyde Monge

3. HuggingChat

HuggingChat is an open-source AI chatbot developed by Hugging Face, a leading platform for artificial intelligence and natural language processing tools.

You can easily access HuggingChat by visiting HuggingFace.co/chat. Then set the active model to "google/gemma-2-27b-it."

Gemma 2 in HuggingChat
Image by Jim Clyde Monge

You can enable web search to complement the model’s answers with information from the internet.

However, HuggingChat may experience server load issues during high-traffic periods, which can result in slow loading times or temporary unavailability.

Additionally, there is a maximum token limit of 1,512 for the bot's responses, which can sometimes lead to incomplete answers. I would suggest using Vertex AI Studio or Ollama instead.

4. Fireworks AI

Fireworks AI is a platform that specializes in optimizing and managing machine learning models at scale, particularly focusing on generative AI for product innovation.

It hosts over 100 state-of-the-art AI models, including large language models (LLMs) and image generation models such as Llama 3, Mixtral MoE 8x7B and 8x22B, and Stable Diffusion 3.

Under the list of model cards, find Gemma 2 and start chatting with it.

Gemma 2 in Fireworks AI
Image by Jim Clyde Monge

Developers can access Fireworks AI’s models through APIs that are compatible with OpenAI’s interface, making it easy to integrate and experiment with different models.
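Because the API is OpenAI-compatible, calling it looks just like a standard chat-completions request, only pointed at Fireworks' endpoint. Below is a hedged sketch using only the Python standard library; the endpoint URL and the exact model id (`accounts/fireworks/models/gemma2-9b-it`) are assumptions you should verify against the model card, and the network call only fires if a `FIREWORKS_API_KEY` environment variable is set:

```python
import json
import os
import urllib.request

# OpenAI-compatible chat-completions endpoint (assumption: verify in Fireworks docs)
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_chat_request(prompt: str,
                       model: str = "accounts/fireworks/models/gemma2-9b-it") -> dict:
    """Build an OpenAI-style chat-completions payload for Fireworks AI."""
    # model id is an assumption; copy the exact name from the Fireworks model card
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

payload = build_chat_request("Explain Gemma 2 in one sentence.")

api_key = os.environ.get("FIREWORKS_API_KEY")
if api_key:  # skip the network call when no key is configured
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the payload shape matches OpenAI's, you can also point the official OpenAI SDK at the same endpoint via its `base_url` option instead of hand-rolling the request.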

Gemma 2 in Fireworks AI API access
Image by Jim Clyde Monge

Fireworks AI can also be accessed through VS Code extensions such as CodeGPT. Right now, Gemma 2 is not yet accessible in CodeGPT, but I'll write a story once it's out.

5. Nvidia NIM

Nvidia NIM (NVIDIA Inference Microservices) is a set of easy-to-use microservices designed to accelerate the deployment of AI models, particularly foundation models, on any cloud or data center infrastructure.

NVIDIA provides a model catalog where you can explore and try various AI models, including generative AI models. This catalog allows you to test models before deploying them in your applications.

For example, if you want to test the Gemma 2 27B model card, head over to this link and start testing the model for free.

Gemma 2 in Nvidia NIM
Image by Jim Clyde Monge

Below the chat field, you can adjust several parameters, such as the temperature, maximum output tokens, and stop/bad keywords.
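The same knobs you see in the playground UI map directly onto fields in an OpenAI-style request body, since NVIDIA also exposes these models through a hosted API. The sketch below is an assumption-laden illustration: the endpoint URL and model id (`google/gemma-2-27b-it`) should be checked against NVIDIA's catalog, and the call only runs if an `NVIDIA_API_KEY` environment variable is present:

```python
import json
import os
import urllib.request

# NVIDIA's hosted OpenAI-compatible endpoint (assumption: verify in the catalog docs)
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"

# The playground parameters expressed as request fields
payload = {
    "model": "google/gemma-2-27b-it",  # model id as listed in the catalog (assumption)
    "messages": [{"role": "user", "content": "Summarize Gemma 2 in two sentences."}],
    "temperature": 0.5,    # lower values give more deterministic output
    "max_tokens": 200,     # caps the response length
    "stop": ["\n\n"],      # stop sequences ("stop keywords" in the UI)
}

api_key = os.environ.get("NVIDIA_API_KEY")
if api_key:  # skip the network call when no key is configured
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Tweaking a value in the playground and then copying it into a payload like this is an easy way to move from experimenting to integrating.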

Gemma 2 in Nvidia NIM parameter edit
Image by Jim Clyde Monge

It’s completely free to use; you don’t even have to log in or create an account.

Final Thoughts

Despite Google’s terrible track record when it comes to language model releases, Gemma 2 is actually a decent model for its size.

Last time I tried Gemma 1.1, the results were not good; it performed worse than Llama 2. Now with Gemma 2 9B, I got results that are on par with Llama 3 8B, and sometimes even better.

If you’re working on a project that requires lightweight models, I highly recommend Gemma 2.
