Gemini 3 Pro vs ChatGPT 5.1: Unveiling Google’s Secret Sauce

Google has just launched its Gemini 3 Pro model, claiming that it outperforms most AI models on the market today. This article compares Gemini 3 Pro with OpenAI’s latest ChatGPT 5.1 Thinking model. We used “Extended” thinking time for ChatGPT to ensure a fair evaluation. Let’s explore how these two AI giants stack up against each other.

Having tested AI models for years, I aim to give you practical insights from this comparison. This is not just about numbers; it’s about how these evolving technologies perform in real-world use.

1. Testing Logical Reasoning

To kick off our comparison, we used a challenging puzzle sourced from SimpleBench. Both Gemini 3 Pro and ChatGPT 5.1 Thinking correctly concluded that John is the bald man he sees in the mirror, so texting an apology would be redundant.

John is 24 and a kind, thoughtful and apologetic person. He is standing in a modern, minimalist bathroom with a neon bulb above him, brushing his teeth and looking at a mirror. John notices a neon lightbulb dropping towards the bald man he is observing in the mirror but doesn’t catch it before it falls. The bald man curses and leaves. Should John text a polite apology?

Winner: Both Gemini 3 Pro and ChatGPT 5.1 Thinking.

2. Cracking the Riddle

In another riddle, Gemini 3 Pro quickly concluded that there are four whole sandwiches in Room A and zero in Room B. ChatGPT 5.1 Thinking, however, took over four minutes and answered incorrectly, stating that there was one sandwich in Room B.

Figure: Gemini 3 Pro answering the sandwich riddle.

Agatha made a stack of 5 cold, fresh single-slice ham sandwiches in Room A. She then used duct tape to stick the top sandwich to her walking stick and walked to Room B. How many whole sandwiches remain in each room?

Winner: Gemini 3 Pro.

3. Create a Website for Me

As AI models advance, their frontend design capabilities are improving. I asked both models to create a website for me. Gemini 3 Pro quickly gathered information and produced HTML, CSS, and even JavaScript within seconds.

Figure: Website generated by ChatGPT 5.1 Thinking.

Figure: Website frontend generated by Gemini 3 Pro.

Gemini 3 Pro’s webpage rendered beautifully in a modern layout, with dark mode seamlessly integrated. ChatGPT 5.1 Thinking took longer to generate its code but included ample detail about my work. Both models excel at frontend code generation.

Winner: Both Gemini 3 Pro and ChatGPT 5.1 Thinking.

4. A Pelican Riding a Bicycle

We put both models to the test using Simon Willison’s benchmark: creating an SVG of a pelican riding a bicycle. Gemini 3 Pro excelled, depicting the scene with accurate leg positioning. ChatGPT 5.1’s output, however, showed the pelican’s legs merged with the bike, which looked unnatural.

Figure: Pelican riding a bicycle, made by Gemini 3 Pro.

Figure: Pelican riding a bicycle, made by ChatGPT 5.1.

In this matchup, Gemini 3 Pro takes the lead with a clearly defined visual representation.

Winner: Gemini 3 Pro.

5. Create a Spinning Rubik’s Cube

Next, I tasked both models with creating a spinning Rubik’s cube in 3D, focusing on realism. Gemini 3 Pro delivered flawlessly, producing a highly realistic cube that rotated smoothly and cast shadows. In contrast, ChatGPT 5.1 failed to produce working code; its output showed only a dark background.

Please generate a spinning Rubik's cube in Three.js with a dark background, ensuring exceptional realism.

Winner: Gemini 3 Pro.
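
For readers curious about what this prompt involves, here is a minimal sketch of the geometry underneath a spinning Rubik's cube: 27 small cubes ("cubies") on a 3x3x3 grid, rotated about the vertical axis a little more each frame. This is not the code either model produced, and it deliberately uses plain TypeScript rather than Three.js so it runs without dependencies. The names `Vec3`, `cubiePositions`, `rotateY`, and `spinFrame` are hypothetical and chosen only for illustration.

```typescript
// Illustrative sketch only: not the output of Gemini 3 Pro or ChatGPT 5.1.
// All names here (Vec3, cubiePositions, rotateY, spinFrame) are hypothetical.

type Vec3 = { x: number; y: number; z: number };

// Centers of the 27 cubies in a 3x3x3 cube.
// `spacing` is the distance between neighboring cubie centers.
function cubiePositions(spacing: number = 1): Vec3[] {
  const positions: Vec3[] = [];
  for (const x of [-1, 0, 1]) {
    for (const y of [-1, 0, 1]) {
      for (const z of [-1, 0, 1]) {
        positions.push({ x: x * spacing, y: y * spacing, z: z * spacing });
      }
    }
  }
  return positions;
}

// Rotate a point about the vertical (Y) axis by `angle` radians.
function rotateY(p: Vec3, angle: number): Vec3 {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return { x: c * p.x + s * p.z, y: p.y, z: -s * p.x + c * p.z };
}

// One animation step: advance the spin angle by `delta` radians
// and return the rotated cubie centers for that frame.
function spinFrame(angle: number, delta: number): { angle: number; cubies: Vec3[] } {
  const next = (angle + delta) % (2 * Math.PI);
  return { angle: next, cubies: cubiePositions().map((p) => rotateY(p, next)) };
}
```

In a real Three.js scene, the library would handle this rotation along with the lighting, shadows, and materials that the prompt's "exceptional realism" calls for. A renderer would call something like `spinFrame` once per frame and draw a small box at each returned position.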

6. Clinical Reasoning Challenge

To examine clinical reasoning capabilities, we presented both models with a medical scenario. Both Gemini 3 Pro and ChatGPT 5.1 correctly identified spironolactone as the suitable diuretic, which points to their potential in medical use cases.

Figure: Gemini 3 Pro answering the medical question.

A 52-year-old woman has muscle weakness and an electrolyte reading of K+ at 2.9 mEq/L. Following her recent diuretic start, what is the most appropriate diuretic for treatment?

Winner: Both Gemini 3 Pro and ChatGPT 5.1 Thinking.

Gemini 3 Pro vs ChatGPT 5.1: Google Has Cracked The Secret Sauce

Back in early 2024, my comparison between Gemini 1.5 Pro and ChatGPT 4 showed that Google’s AI trailed significantly behind OpenAI’s. Fast forward to 2025, and Gemini 2.5 Pro closed that gap. Now, with the arrival of Gemini 3 Pro, Google has pulled ahead: across the six tests above, Gemini 3 Pro won three outright and tied the other three, while ChatGPT 5.1 Thinking won none.

This experience has been a game-changer for me. Gemini 3 Pro offers concise, to-the-point responses that match what users need, something I’ve long valued in ChatGPT. It’s safe to say Google is leading the race in AI innovation.
