- Microsoft has released the first model in its Phi-3 family: Phi-3 Mini, which has 3.8B parameters.
- Despite being a much smaller model, it beats Gemma 7B, Mistral 7B, and Llama 3 8B in the MMLU benchmark.
- The upcoming Phi-3 Small and Phi-3 Medium models outrank even OpenAI's GPT-3.5 model, which is impressive.
While large language models (LLMs) bring the ability to understand and perform complex tasks, smaller models are equally important because they can run locally on smartphones and PCs. And Microsoft seems to be developing impressive small language models. The latest, Phi-3 Mini, has just been introduced, and it has 3.8B parameters.
There are two other models in the Phi-3 family, Phi-3 Small (7B) and Phi-3 Medium (14B), but they have not been released yet. As for the smallest model, Phi-3 Mini, it performs better than Meta’s Llama 3 8B, Google’s Gemma 7B, and Mistral 7B in the MMLU benchmark. In fact, despite its small size, it roughly matches the performance of Mixtral 8x7B, which is remarkable.
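To put those parameter counts in perspective for local use, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. The parameter counts are the ones quoted above; the precisions (16-bit, 8-bit, 4-bit) are illustrative assumptions rather than anything Microsoft has specified here, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bits per weight.
# Parameter counts come from the article; the precisions below are
# illustrative assumptions, not official deployment settings.
PARAMS = {
    "Phi-3 Mini": 3.8e9,
    "Phi-3 Small": 7e9,
    "Phi-3 Medium": 14e9,
}
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}  # assumed precisions


def weight_memory_gb(num_params: float, bits: int) -> float:
    """Approximate storage for the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def footprint_table() -> dict:
    """Map each model to its estimated weight size at each assumed precision."""
    return {
        model: {
            name: round(weight_memory_gb(n, bits), 2)
            for name, bits in BITS_PER_WEIGHT.items()
        }
        for model, n in PARAMS.items()
    }


if __name__ == "__main__":
    for model, row in footprint_table().items():
        print(model, row)
```

Under these assumptions, Phi-3 Mini's weights work out to roughly 7.6 GB at 16-bit and about 1.9 GB at 4-bit, which hints at why a model of this size is a plausible fit for a phone.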
In HumanEval, Phi-3 Mini performs far better than Gemma 7B and Mistral 7B. It seems Microsoft has put a lot of effort into creating a powerful small model for running locally on smartphones and PCs. The best part is that the upcoming Phi-3 Small and Phi-3 Medium models beat OpenAI’s GPT-3.5 model, Mixtral 8x7B, and Llama 3 8B. That’s pretty impressive if you ask me.
Microsoft says Phi-3 achieves this performance thanks to a clean dataset consisting of heavily filtered web data and synthetic data. The model has also been checked for safety, harm, and robustness. Phi-3 seems to be the new king as far as smaller models are concerned. I am excited to test the model and check whether it beats Anthropic’s Haiku model, which is the smallest model in the Claude 3 family.
So are you excited to test out Microsoft’s new Phi-3 Mini model? Let us know in the comments below. Meanwhile, you can check out how to run Google’s Gemma model on your PC locally.