
- Microsoft has released Phi-4 reasoning AI models, which come in 14B and 3.8B parameter sizes.
- Despite their small size, Phi-4 reasoning models rival much larger models like DeepSeek R1 and o3-mini.
- Microsoft says Phi-4 reasoning models can run on Windows Copilot+ PCs, thanks to their small size.
Microsoft has launched three new AI reasoning models: Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning. These are small language models designed for edge devices like Windows PCs and mobile devices. The Phi-4-reasoning AI model has 14 billion parameters and can perform complex reasoning tasks.
The Phi-4-reasoning-plus model builds on the same base model but applies more inference-time compute, generating nearly 1.5x more tokens than Phi-4-reasoning to deliver higher accuracy. Despite being much smaller in size, Phi-4 reasoning models rival larger models such as DeepSeek R1 671B and o3-mini.
In the GPQA benchmark, the Phi-4-reasoning-plus-14B model achieves 69.3% while o3-mini scores 77.7%. Next, in the AIME 2025 test, Phi-4-reasoning-plus-14B gets 78% and o3-mini achieves 82.5%. This shows that Microsoft’s small model comes very close to flagship reasoning models that are much larger in size.
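To put those gaps in perspective, here is a quick calculation using the benchmark figures quoted above; the percentage-point differences are computed purely for illustration:

```python
# Benchmark scores (%) reported for Phi-4-reasoning-plus (14B) vs. OpenAI o3-mini.
scores = {
    "GPQA": {"phi4_plus": 69.3, "o3_mini": 77.7},
    "AIME 2025": {"phi4_plus": 78.0, "o3_mini": 82.5},
}

for bench, s in scores.items():
    gap = round(s["o3_mini"] - s["phi4_plus"], 1)
    print(f"{bench}: Phi-4-reasoning-plus trails o3-mini by {gap} points")
```

On these two tests, the 14B model lands within roughly 8.4 and 4.5 percentage points of o3-mini, which is the basis for the "comes very close" claim.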
Microsoft says Phi-4 reasoning models are trained via supervised fine-tuning “on carefully curated reasoning demonstrations from OpenAI o3-mini.” Further, Microsoft writes, “The model demonstrates that meticulous data curation and high-quality synthetic datasets allow smaller models to compete with larger counterparts.”
Apart from that, the smaller Phi-4-mini-reasoning model, with just 3.8B parameters, outperforms many 7B and 8B models. In benchmarks like AIME 24, MATH 500, and GPQA Diamond, the Phi-4-mini-reasoning-3.8B model delivers competitive scores, nearly matching o1-mini. The Phi-4-mini model has been “fine-tuned with synthetic data generated by Deepseek-R1 model.”
Microsoft’s Phi models already run locally on Windows Copilot+ PCs, leveraging the built-in NPU. It will be interesting to see how the Phi-4 reasoning models improve on-device AI performance.