Here are 5 websites to try Google's most powerful lightweight model, Gemma 2 9B and 27B.
This article is originally posted in generativeai.pub
It’s only been a few weeks since Google released its most powerful lightweight LLM, Gemma 2. I have been experimenting with it on different platforms and thought about sharing five ways you can try Gemma 2 for free.
Before we get into the list, here’s a brief description of Gemma 2 for those who are not yet familiar with it.
Gemma 2 is the latest open-weights language model released by Google for researchers and developers worldwide. It comes in two variants: 9 billion and 27 billion parameters.
The 9B model was trained on approximately 8 trillion tokens, while the 27B version was trained on about 13 trillion tokens of web data, code, and math.
These lightweight models are designed to run efficiently on various hardware, including Nvidia GPUs and Google’s TPUs, making them suitable for both cloud and on-device applications.
Okay, now let’s get into the list.
Google Vertex AI Studio is a powerful tool for prototyping and customizing generative AI models within the Google Cloud ecosystem.
It provides access to Google’s state-of-the-art models, offers multimodal capabilities, and integrates seamlessly with other Google Cloud services for end-to-end machine learning workflows.
Pricing for generative AI services in Vertex AI varies depending on the specific foundation models and APIs used.
New customers can get up to $300 in free credits to try Vertex AI and other Google Cloud products
Ollama is an open-source project designed to simplify the process of working with large language models (LLMs). It provides a user-friendly platform for running, customizing, and managing various LLMs, including popular models like Llama 3, Phi 3, Mistral, and Gemma 2.
Download Ollama here and run the model via the command:
HuggingChat is an open-source AI chatbot developed by Hugging Face, a leading platform for artificial intelligence and natural language processing tools.
You can easily access HuggingChat by visiting HuggingFace.co/chat. Then select the current model to “google/gemma-2–27b-it.”
You can enable web search to complement the model’s answers with information from the internet.
However, HuggingChat may experience server load issues during high-traffic periods, which can result in slow loading times or temporary unavailability.
Additionally, there is a maximum token limit of 1,512 for the bot’s responses, which can sometimes lead to incomplete answers. I would suggest using Vertex Studio or Ollama instead.
Fireworks AI is a platform that specializes in optimizing and managing machine learning models at scale, particularly focusing on generative AI for product innovation.
It hosts over 100 state-of-the-art AI models, including large language models (LLMs) and image generation models such as Llama 3, Mixtral MoE 8x7B and 8x22B, and Stable Diffusion 3.
Under the list of model cards, find Gemma 2 and start chatting with it.
Developers can access Fireworks AI’s models through APIs that are compatible with OpenAI’s interface, making it easy to integrate and experiment with different models.
Fireworks AI can also be accessed through VS Code Extensions such as CodeGPT. Right now, Gemma-2 is not yet accessible in CodeGPT, but I’ll write a story once it’s out.
Nvidia NIM (NVIDIA Inference Microservices) is a set of easy-to-use microservices designed to accelerate the deployment of AI models, particularly foundation models, on any cloud or data center infrastructure.
NVIDIA provides a model catalog where you can explore and try various AI models, including generative AI models. This catalog allows you to test models before deploying them in your applications.
For example, if you want to test the Gemma-2–27B model card, head over to this link and start testing the model for free.
Below the chat field, you can adjust several parameters like the temperature, tokens, and stop/bad keywords.
It’s completely free to use; you don’t even have to log in or create an account.
Despite Google’s terrible track record when it comes to language model releases, Gemma 2 is actually a decent model for its size.
Last time I tried Gemma 1.1, the results were not good. It was performing worse than Llama 2. Now with Gemma 2 9B, not only did I get results that are on par with Llama 3 8B, but sometimes even better.
If you’re working on a project that would require lightweight models, I would highly recommend using Gemma 2.
Software engineer, writer, solopreneur