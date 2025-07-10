

Elon Musk’s Grok 4 AI Models Set New Benchmark Records

Arjun Sha
elon musk launched grok 4 ai models
Image Credit: xAI via X
In Short
  • Elon Musk's xAI company has launched two new AI models called Grok 4 and Grok 4 Heavy.
  • The new Grok 4 models post record benchmark results, outranking OpenAI o3, Gemini 2.5 Pro, and Claude Opus 4.
  • Grok 4 achieved 15.9% on the novel ARC-AGI-2 benchmark, becoming the state-of-the-art reasoning AI model.

Elon Musk’s AI firm, xAI, has released its frontier Grok 4 AI models with record-breaking benchmark numbers. There are two new models, Grok 4 and Grok 4 Heavy, and both are reasoning models. Alongside them, xAI announced a new subscription plan called SuperGrok Heavy, which costs $300 per month and offers access to the Grok 4 Heavy model.

On the benchmarks xAI shared, Grok 4 outperforms the leading AI models from OpenAI, Google, and Anthropic. On GPQA, Grok 4 scored 87.5% and Grok 4 Heavy achieved 88.9%. On the AIME 2025 test, Grok 4 Heavy scored a full 100%.

grok 4 benchmarks
Image Credit: xAI via X

On the challenging Humanity’s Last Exam benchmark, Grok 4 Heavy achieved 44.4% and Grok 4 scored 38.6%, both with tool support. On the same test, Gemini 2.5 Pro scored 26.9% and OpenAI’s o3 scored 24.9% with tools. On these results, Grok 4 is currently the state-of-the-art reasoning AI model.

Most notably, on the newly launched ARC-AGI-2 benchmark, Grok 4 achieved 15.9%, the highest score to date and double that of Claude Opus 4 and OpenAI o3. On this measure, Grok 4 leads every AI model released by any lab so far. On the older ARC-AGI-1 benchmark, Grok 4 achieved 66.7%, again higher than the publicly available OpenAI o3-pro and o4-mini.

xAI says Grok 4 Heavy is the company’s largest AI model and can run multiple agents in parallel to solve a problem. Musk also said that an AI coding model will be released in August, a multimodal agent is planned for September, and a video generation model may finally arrive in October.

Overall, xAI has again proved that it’s one of the prominent AI labs training foundational AI models and stands to challenge all major AI players around the world.

