OpenAI's o3 AI Found a Zero-Day Vulnerability in the Linux Kernel, Official Patch Released

In Short

A security researcher has discovered a novel security flaw in the Linux kernel using the OpenAI o3 reasoning model.
The new vulnerability has been documented under CVE-2025-37899. An official patch has also been released.
o3 analyzed about 12,000 lines of code covering all the SMB command handlers to find the novel bug.

A security researcher named Sean Heelan has found a new zero-day vulnerability in the Linux kernel using OpenAI’s o3 reasoning model. It is reported to be one of the first publicly documented cases of an AI model discovering a security flaw of this kind in a complex software system like the Linux kernel, which runs on millions of servers and computers. The vulnerability has been documented under CVE-2025-37899.

Heelan writes in a blog post that he was auditing the ksmbd module for vulnerabilities using OpenAI’s o3 model through the API, without any tool use. ksmbd is “a linux kernel server which implements SMB3 protocol in kernel space for sharing files over network.”

In this case, o3 reasoned about concurrent connections to the server and found “a location where a particular object that is not referenced counted is freed while still being accessible by another thread.” In other words, o3 identified a critical “use-after-free” vulnerability, where memory is accessed after it has already been released, in the handler for the SMB ‘logoff’ command.
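To make the quoted description concrete, here is a toy model of that kind of race, written in Python rather than kernel C. It is only a sketch under stated assumptions: the `User` and `Session` classes, the `freed` flag (standing in for a kernel free call), and every function name are hypothetical and are not taken from ksmbd. The two "threads" are interleaved by hand so the outcome is deterministic. The reference-counted variant shows one common remedy for this class of bug, not necessarily what the official patch does.

```python
class User:
    """Stands in for a heap object; `freed` models the memory being released."""
    def __init__(self, name):
        self.name = name
        self.freed = False
        self.refs = 1  # the session's own reference (used by the counted variant)


class Session:
    def __init__(self, user):
        self.user = user


def logoff_unsafe(session):
    # Hypothetical logoff handler: frees the object right away,
    # even if another thread still holds a pointer to it.
    session.user.freed = True
    session.user = None


def get_user(session):
    # Take an extra reference before using the object.
    user = session.user
    user.refs += 1
    return user


def put_user(user):
    # Drop a reference; the object is only freed when nobody holds it.
    user.refs -= 1
    if user.refs == 0:
        user.freed = True


def logoff_refcounted(session):
    # Hypothetical fixed handler: drops only the session's own reference.
    user, session.user = session.user, None
    put_user(user)


def run(logoff, counted):
    """Replay one fixed interleaving of two threads sharing a session."""
    session = Session(User("alice"))
    # Thread A: starts handling a request and picks up the user pointer.
    held = get_user(session) if counted else session.user
    # Thread B: handles a logoff for the same session in the meantime.
    logoff(session)
    # Thread A: resumes and touches the object it picked up earlier.
    used_after_free = held.freed
    if counted:
        put_user(held)
    return used_after_free, held.freed


print(run(logoff_unsafe, counted=False))     # thread A touched freed memory
print(run(logoff_refcounted, counted=True))  # object freed only after A was done
```

In the unsafe run, thread A ends up using an object that thread B has already released. In the counted run, the object is released only after thread A drops its reference.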

o3 processed all the SMB command handlers, which amount to about 12,000 lines of code, consuming around 100K tokens. A patch has already been committed and merged into the mainline Linux kernel. This appears to be one of the first publicly documented instances in which an AI discovered a bug, a human verified it, an official patch was released, and the vulnerability was closed.

Interestingly, the researcher found the novel security bug while evaluating AI models like Claude 3.7 Sonnet, Claude 3.5 Sonnet, and OpenAI o3 on another security flaw, a Kerberos authentication vulnerability (CVE-2025-37778). Heelan writes that o3 found the Kerberos vulnerability in 8 of 100 runs, Claude 3.7 Sonnet found it in 3 of 100 runs, and Claude 3.5 Sonnet did not find it in any of its 100 runs.
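Those per-run numbers suggest why repeated runs matter. The short sketch below turns the reported hit counts into rates and estimates the chance of at least one successful detection across several runs. It rests on two assumptions that are not from Heelan's post: that each run is independent, and that the rate observed over 100 runs is close to the true rate. The function names are illustrative only.

```python
def hit_rate(hits, runs):
    """Fraction of runs in which the model reported the known bug."""
    return hits / runs


def p_at_least_one(rate, n):
    """Chance of one or more hits in n runs, assuming independent runs."""
    return 1 - (1 - rate) ** n


# Hit counts as reported in the article (hits out of 100 runs each).
reported = {
    "o3": hit_rate(8, 100),
    "Claude 3.7 Sonnet": hit_rate(3, 100),
    "Claude 3.5 Sonnet": hit_rate(0, 100),
}

for model, rate in reported.items():
    print(f"{model}: {rate:.0%} per run, "
          f"{p_at_least_one(rate, 10):.1%} chance of a hit in 10 runs")
```

Under these assumptions, a model with a low per-run hit rate can still be useful if it is run many times, as long as a human reviews the reports to separate real findings from false positives.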

Lastly, the researcher cautions that “o3 is not infallible,” but says recent reasoning models have made a significant leap in understanding large codebases. If a problem can be represented in fewer than 10K lines of code, there is a reasonable chance that a model like o3 can solve it or help you solve it. And for vulnerability research, new reasoning models can make you “significantly more efficient and effective.”
