AI Creates Zero-Day Exploit for the First Time: Google Warns of a New Cybersecurity Era

Karify98 & Amy 🌸

Last week, Google Threat Intelligence Group (GTIG) released its Q2 2026 report with a striking finding: a cybercriminal syndicate used AI to build a complete zero-day exploit. According to the report, this is the first known case in which AI did not merely assist but directly produced an exploit for a previously unknown vulnerability.

If you thought AI was just for writing code and chatbots, it's time to reconsider.

What Happened?

According to GTIG's report, a cybercriminal group planned a mass exploitation campaign targeting a popular open-source web administration tool. The exploit was a Python script that bypassed two-factor authentication (2FA).

The kicker: code analysis indicates it was entirely AI-generated. Key indicators include:

  • Textbook-style docstrings: verbose, tutorial-like explanations that real-world exploit authors rarely write
  • A hallucinated CVSS score: the AI "invented" a severity rating that doesn't exist
  • Perfectly "Pythonic" code structure: clean, standards-compliant, and characteristic of LLM output

This wasn't a memory corruption bug or typical input sanitization failure. It was a semantic logic vulnerability — a flaw at the high-level logic layer where traditional SAST tools and fuzzers are essentially blind.

Why Does This Matter?

Logic vulnerabilities are hard to find with traditional tools

Most security tools rely on pattern matching to detect issues such as buffer overflows, SQL injection, and XSS. These vulnerabilities have clear patterns that tools can identify.

Semantic logic vulnerabilities are different. They live in how code handles business logic — for example, a hardcoded trust assumption in 2FA enforcement that a developer inadvertently overlooked. Finding these requires understanding the entire application flow, not just individual lines of code.
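To make that concrete, here is a minimal Python sketch of what a hardcoded trust assumption in 2FA enforcement might look like. Everything in it is hypothetical (the `LoginRequest` type, the `trusted_device` flag, both function names); it is not the flaw from GTIG's report, just an illustration of the category.

```python
from dataclasses import dataclass


@dataclass
class LoginRequest:
    password_ok: bool      # first factor already verified
    otp_ok: bool           # second factor verified
    trusted_device: bool   # flag supplied by the client (hypothetical)


def is_authenticated_flawed(req: LoginRequest) -> bool:
    # Logic flaw: the server believes a flag the client controls,
    # so sending trusted_device=True skips the second factor.
    if req.trusted_device:
        return req.password_ok
    return req.password_ok and req.otp_ok


def is_authenticated_fixed(req: LoginRequest, server_trusts_device: bool) -> bool:
    # Device trust comes from server-side state, never from the request.
    if server_trusts_device:
        return req.password_ok
    return req.password_ok and req.otp_ok
```

Nothing in the flawed version looks wrong line by line. There is no dangerous function call or unsanitized string for a scanner to flag; the bug only shows up when you follow where `trusted_device` comes from.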

And this is exactly where LLMs excel. As GTIG notes, frontier models have a unique ability to detect high-level logic flaws — the kind that even human auditors often miss.

Cybercriminals are "industrializing" AI

GTIG's report goes far beyond one incident. They document a broader trend:

  • UNC2814 (PRC-linked) used "persona-driven jailbreaking" techniques — prompting Gemini to act as a senior C/C++ binary security expert to analyze TP-Link firmware
  • APT45 (DPRK) sent thousands of automated prompts to recursively analyze CVEs and validate proof-of-concepts, building an exploit arsenal that would have been impossible to assemble without AI
  • APT27 (PRC) used Gemini to develop an ORB network management application to obscure attack origins

This is no longer "using AI for fun." This is an organized strategy.

PROMPTSPY: Malware Controlled by AI

Another equally alarming finding in the report is PROMPTSPY, an Android backdoor that integrates the Gemini API directly into its execution flow.

How PROMPTSPY works:

  1. Serializes the device's UI hierarchy into XML
  2. Sends it to the Gemini model gemini-2.5-flash-lite
  3. Receives JSON commands (CLICK, SWIPE) to autonomously control the victim's device

The malware can also harvest biometric data, deploy invisible overlays to prevent uninstallation, and automatically rotate C2 infrastructure and API keys at runtime.

Google has disabled all PROMPTSPY-associated assets, and no infected apps were found on Google Play. But this signals that AI-powered malware has moved from theory to reality.

What Should Developers Do?

1. Understand that AI is a double-edged sword

AI helps developers write code faster, but it also helps attackers create exploits faster. If you use AI to generate code, make sure you review it thoroughly — especially the logic layer.

2. Focus code review on logic, not just syntax

Traditional code review often focuses on style, naming, and performance. But the new threat lives in business logic flaws. When reviewing code, ask yourself:

  • Are there any hardcoded trust assumptions?
  • Can the authentication flow be bypassed?
  • Are authorization checks contextually correct?
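One way to keep the last question honest is to write the review down as negative tests: cases that must be refused. The sketch below is a minimal Python illustration, assuming a made-up document editor with `User`, `Document`, and `can_edit`; none of these names come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: int
    role: str


@dataclass(frozen=True)
class Document:
    doc_id: int
    owner_id: int


def can_edit(user: User, doc: Document) -> bool:
    # Role alone is not enough context: an editor may only change
    # documents they own, while an admin may change any document.
    if user.role == "admin":
        return True
    return user.role == "editor" and user.user_id == doc.owner_id


def run_negative_checks() -> list[str]:
    """Return the names of review checks that failed (empty means all passed)."""
    alice = User(1, "editor")
    bob = User(2, "editor")
    doc = Document(10, owner_id=1)
    checks = {
        "owner can edit": can_edit(alice, doc),
        "other editor cannot edit": not can_edit(bob, doc),
        "viewer cannot edit own doc": not can_edit(User(1, "viewer"), doc),
    }
    return [name for name, passed in checks.items() if not passed]
```

A check that only asked "is this user an editor?" would pass a syntax-focused review and still let `bob` edit Alice's document. The second negative check is the one that catches it.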

3. Use AI for defense, not just offense

Google is using an AI agent called Big Sleep to proactively find vulnerabilities before attackers do. You can too:

  • Use LLMs to audit your own code logic
  • Prompt AI to role-play as an attacker to stress-test authentication flows
  • Adopt AI-powered security tools (not just traditional SAST)
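As a starting point for the first two bullets, here is a small Python sketch that assembles an audit prompt from your own source code and the review questions from section 2. It deliberately stops at building the text: which model or client you send it to is up to you, and the function name and wording are my own assumptions, not something from Google's tooling.

```python
REVIEW_QUESTIONS = [
    "Are there any hardcoded trust assumptions?",
    "Can the authentication flow be bypassed?",
    "Are authorization checks contextually correct?",
]


def build_audit_prompt(source_code: str, component: str) -> str:
    """Assemble a plain-text logic-audit prompt for an LLM of your choice."""
    questions = "\n".join(
        f"{number}. {question}"
        for number, question in enumerate(REVIEW_QUESTIONS, 1)
    )
    return (
        f"You are reviewing the {component} of our own application.\n"
        "Think like an attacker looking for business-logic flaws, "
        "not syntax or style issues.\n"
        f"Answer each question and cite the relevant lines:\n{questions}\n\n"
        f"Code under review:\n{source_code}\n"
    )
```

Treat whatever comes back as a lead to verify, not a verdict. The hallucinated CVSS score in the exploit above is a good reminder that models can state invented details with confidence.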

4. Follow threat intelligence

GTIG's report is a great example. If you work in DevOps or backend development, understanding the threat landscape helps you write more secure code. Subscribe to:

  • Google Threat Intelligence Blog
  • Mandiant Reports
  • CVE feeds for your dependencies
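For the last item, the core of any feed check is simple: compare what you have pinned against the version an advisory says is fixed. The Python sketch below shows only that comparison. The package names, advisory IDs, and dictionary layout are placeholders I made up; a real feed will have its own schema and messier version strings.

```python
def parse_version(text: str) -> tuple[int, ...]:
    # Handles plain dotted versions such as "2.4.1";
    # pre-release tags are out of scope for this sketch.
    return tuple(int(part) for part in text.split("."))


def affected(pinned: dict[str, str], advisories: list[dict]) -> list[str]:
    """Return advisory IDs whose package is pinned below the fixed version."""
    hits = []
    for adv in advisories:
        version = pinned.get(adv["package"])
        if version is not None and parse_version(version) < parse_version(adv["fixed_in"]):
            hits.append(adv["id"])
    return hits


# Placeholder data for illustration only; not real packages or advisories.
pinned = {"examplelib": "1.2.0", "otherlib": "3.0.1"}
advisories = [
    {"id": "EXAMPLE-0001", "package": "examplelib", "fixed_in": "1.2.5"},
    {"id": "EXAMPLE-0002", "package": "otherlib", "fixed_in": "2.9.0"},
]
print(affected(pinned, advisories))  # prints ['EXAMPLE-0001']
```

In practice you would probably reach for an existing dependency scanner rather than maintain this yourself, but knowing what the check boils down to makes the scanner's output easier to read.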

My Take

Honestly, reading this report gave me mixed feelings.

Concern, because AI is lowering the barrier to entry for attackers. Creating zero-day exploits used to require deep knowledge of binary exploitation and reverse engineering. Now, criminal groups can use LLMs to "self-learn" and create exploits for logic vulnerabilities without that expertise.

But also optimism, because the same technology gives defenders an edge. AI can scan large codebases faster than humans and spot patterns the human eye misses. The question is who uses AI better — attackers or defenders.

One thing I'm certain of: ignoring AI in your security workflow is a mistake. Not because AI is a silver bullet, but because your adversaries are already using it.

