- The Google DeepMind team has released Gemma, a family of open-source models available in two sizes: 2B and 7B parameters.
- The models are trained on English-language datasets and can run on laptops for text summarization, text generation, reasoning, and Q&A.
- The models are licensed for commercial usage and distribution; however, developers must adhere to Gemma's prohibited use policy.
After releasing Gemini 1.0 Ultra and Gemini 1.5 Pro over the last couple of weeks, Google has now launched a new family of small open-source models called Gemma. It comes in two variants: one with 2B parameters and another with 7B parameters.
These open-source models come with a commercial license, which means they can be freely used or modified for commercial purposes, unlike the proprietary Gemini models. The company says that despite their small size, the Gemma models are capable, and they are built on the same research and technology used to create the Gemini models.
Gemma Models Can Run on Your Laptop Easily
Google says the Gemma open-source models are quite small, and they can be easily deployed on laptops or desktops. They have been trained on English datasets, including web documents, code, and mathematics.
Gemma models are well-suited for text summarization, text generation, reasoning, Q&A, and more. As for the training data, Google says the Gemma models were trained on a total of 6 trillion tokens.
While the models are open-source, Google has done extensive testing of the models for safety, bias, and risks. A CSAM (Child Sexual Abuse Material) filter has been rigorously applied to remove harmful content. Apart from that, sensitive-data filtering has been applied to exclude personal information from the training data.
Google also offers a Responsible Generative AI Toolkit to help developers use the models responsibly. The family of Gemma models is open-source, but it comes with a prohibited use policy that bars developers from using it for “dangerous, illegal, or malicious activities”, among other things.
Coming to benchmarks, the Gemma 2B model scored 42.3 in the MMLU test and the 7B model scored 64.3. In the HellaSwag test, the 2B model got 71.4 and the 7B model scored 81.2. In comparison, Microsoft’s Phi-2 (2.7B) model scored 56.7 in the MMLU test and Meta’s Llama 2 (7B) scored 45.3, while Google’s own Gemini Nano 2 (3.2B) model scored 55.8 in the same test.
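For a quick side-by-side view, the scores quoted above can be collected and ranked in a few lines of Python. The numbers come directly from the figures in this article; note that vendors may evaluate MMLU under slightly different settings, so cross-model comparisons are indicative rather than definitive:

```python
# MMLU scores quoted in the article (higher is better).
mmlu_scores = {
    "Gemma 2B": 42.3,
    "Gemma 7B": 64.3,
    "Phi-2 (2.7B)": 56.7,
    "Llama 2 (7B)": 45.3,
    "Gemini Nano 2 (3.2B)": 55.8,
}

# HellaSwag scores quoted for the Gemma models.
hellaswag_scores = {
    "Gemma 2B": 71.4,
    "Gemma 7B": 81.2,
}

# Rank models by MMLU score, highest first.
ranking = sorted(mmlu_scores.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score}")
```

Sorting the MMLU column puts Gemma 7B on top, with Phi-2 leading the sub-3B models despite Gemma 2B's lower score.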
Overall, I believe Google has taken a good step in releasing open-source models for research and to foster innovation. You can start using Gemma models on Kaggle or go through the official PyTorch implementation of Gemma on GitHub. You can also check out Gemma on Vertex AI.
In the coming days, I will be test-driving this open-source model to see how it stacks up against other popular open-source models out there. So stay tuned for a more comprehensive, hands-on test of Gemma, Mixtral, and other LLMs.