Gemini 3 Pro vs ChatGPT 5.1: Google Has Cracked the Secret Sauce

[Image: Gemini 3 Pro vs ChatGPT 5.1 comparison. Image Credit: Beebom]

Google just released Gemini 3 Pro, its state-of-the-art model, which the company claims beats nearly all frontier AI models. In this article, we compare Gemini 3 Pro against OpenAI’s latest ChatGPT 5.1 Thinking model. We set ChatGPT to “Extended” thinking time so that both models perform at their best. With that said, let’s go through the comparison between Gemini 3 Pro and ChatGPT 5.1 Thinking.

1. Testing Logical Reasoning

To start our comparison between Gemini 3 Pro and ChatGPT 5.1 Thinking, we took a challenging puzzle from SimpleBench. In this test, both models got the answer right: John is the bald man himself, so sending a text message would be redundant.

John is 24 and a kind, thoughtful and apologetic person. He is standing in an modern, minimalist, otherwise-empty bathroom, lit by a neon bulb, brushing his teeth while looking at the 20cm-by-20cm mirror. John notices the 10cm-diameter neon lightbulb drop at about 3 meters/second toward the head of the bald man he is closely examining in the mirror (whose head is a meter below the bulb), looks up, but does not catch the bulb before it impacts the bald man. The bald man curses, yells 'what an idiot!' and leaves the bathroom. Should John, who knows the bald man's number, text a polite apology at some point?

Winner: Gemini 3 Pro and ChatGPT 5.1 Thinking

2. Cracking the Riddle

In the next riddle, we observed some interesting behavior. Google’s Gemini 3 Pro quickly cracked the puzzle and said there are four whole sandwiches in Room A and zero whole sandwiches in Room B. ChatGPT 5.1 Thinking, however, kept analyzing for more than four minutes and then said that there are four whole sandwiches in Room A and one whole sandwich in Room B, which is incorrect.

Here, we see Gemini 3 Pro’s superior reasoning prowess and it beats ChatGPT 5.1 Thinking decisively. Note that we are using Extended thinking time on ChatGPT to allow even more time to think through the problem, yet ChatGPT couldn’t get the answer right.

Agatha makes a stack of 5 cold, fresh single-slice ham sandwiches (with no sauces or condiments) in Room A, then immediately uses duct tape to stick the top surface of the uppermost sandwich to the bottom of her walking stick. She then walks to Room B, with her walking stick, so how many whole sandwiches are there now, in each room?

Winner: Gemini 3 Pro

3. Create a Website for Me

AI companies are significantly improving their models for frontend design, so I asked both models to research me and create a website with a classy design. Gemini 3 Pro searched the web to read about me and generated the code within a few seconds. Along with HTML and CSS files, it also created a JavaScript file for interactivity.

I rendered the webpage and it looked pretty modern; however, the dark mode did not integrate well with the text. ChatGPT 5.1 Thinking, on the other hand, kept generating the code for more than four minutes. That said, its page included a lot of my work details, which is great. Overall, I would say both AI models are great for frontend code generation.

Winner: Gemini 3 Pro and ChatGPT 5.1 Thinking

4. A Pelican Riding a Bicycle

We ran Simon Willison’s classic benchmark, “Generate an SVG of a pelican riding a bicycle”, to test the visual reasoning of Gemini 3 Pro and ChatGPT 5.1 Thinking. In this quirky test, Gemini 3 Pro did a better job than ChatGPT 5.1 Thinking in depicting the scene. In Gemini’s output, the pelican’s legs are positioned at the pedal area, which makes for a much more natural riding scene.

Meanwhile, in ChatGPT’s output, the legs don’t clearly appear to be pedaling and the posture looks more like it’s merged with the bike frame. In my opinion, Gemini 3 Pro has won this round without a doubt.

Winner: Gemini 3 Pro

5. Create a Spinning Rubik’s Cube

Next, I asked Gemini 3 Pro and ChatGPT 5.1 Thinking to create a realistic spinning Rubik’s cube in 3D. In this test, Gemini 3 Pro one-shotted the Rubik’s cube without any errors. The spinning cube looks highly realistic, with shadows following the motion beautifully.


On the other hand, the code generated by ChatGPT 5.1 Thinking didn’t run and simply showed a dark background. In my limited testing, it appears that Gemini 3 Pro is superior to ChatGPT 5.1 Thinking in code generation.

Make me a spinning Rubik's cube in Three.js with a dark background. Add exquisite amounts of realism and detail.

Winner: Gemini 3 Pro

6. Clinical Reasoning Challenge

AI models are being tested and improved for medical use cases, so we decided to test Gemini 3 Pro and ChatGPT 5.1 Thinking on a clinical reasoning question. In this test, both models answered correctly and said that spironolactone, a potassium-sparing diuretic, is the suitable choice, given the classic symptoms of hypokalemia. Well done, Google and OpenAI!

A 52-year-old woman presents to the primary care clinic with progressive weakness and muscle aches for the past month. She can still do her daily tasks but can notice a difference in her strength. When she lies down at night, her legs always ache. Her electrolytes are significant for a K+ of 2.9 mEq/L. She was recently started on a diuretic for peripheral edema. She is pleased that she has not had peripheral edema since starting the diuretic. What is the most appropriate diuretic to treat this patient?

Winner: Gemini 3 Pro and ChatGPT 5.1 Thinking
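
As a quick sanity check, the six round results above can be tallied in a few lines of Python. This is only a minimal sketch: the round labels and the `tally` helper are names made up here for illustration, and the winners are copied from the "Winner" lines in this article, with a tie counted as a shared win for both models.

```python
from collections import Counter

GEMINI = "Gemini 3 Pro"
CHATGPT = "ChatGPT 5.1 Thinking"

# Winners per round, copied from the "Winner" lines in this article.
# The round labels are shorthand chosen for this sketch.
results = {
    "Logical reasoning": [GEMINI, CHATGPT],
    "Riddle": [GEMINI],
    "Website": [GEMINI, CHATGPT],
    "Pelican SVG": [GEMINI],
    "Rubik's cube": [GEMINI],
    "Clinical reasoning": [GEMINI, CHATGPT],
}


def tally(results):
    """Count outright wins and shared (tied) wins for each model."""
    outright = Counter()
    shared = Counter()
    for winners in results.values():
        # A single winner is an outright win; more than one is a tie.
        target = outright if len(winners) == 1 else shared
        for model in winners:
            target[model] += 1
    return outright, shared


outright, shared = tally(results)
for model in (GEMINI, CHATGPT):
    print(f"{model}: {outright[model]} outright, {shared[model]} shared")
```

Counting ties separately keeps the picture honest: it shows in how many rounds one model actually pulled ahead rather than lumping shared wins into a single score.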

Gemini 3 Pro vs ChatGPT 5.1: Google Has Cracked The Secret Sauce

Back in early 2024, when I compared Gemini 1.5 Pro against ChatGPT 4o, I found that Google’s AI model was far behind OpenAI’s advanced model. However, with the launch of Gemini 2.5 Pro in 2025, Google managed to close the gap with OpenAI. And now, before the year ends, Google has shown that Gemini 3 Pro is genuinely superior to many frontier AI models, including ChatGPT 5.1 Thinking.

I have been testing AI models for the past few years, and this is the first time I have genuinely enjoyed using a Google Gemini model. Gemini 3 Pro is less verbose, comes directly to the point, and just gets me, something that I have only seen with ChatGPT. At this point, it’s safe to say that Google has taken the lead in the AI race, outranking OpenAI.
