Which of your AI agents are Exposed Giants?

2026-06-04

A four-quadrant risk chart rendered as glowing nodes, with a dense cluster of large nodes in the high-reach low-defense corner labeled Exposed Giants and a small bright cluster in the high-reach high-defense corner.

A June 3 study scored 100 production AI agents and found only 11 percent were both capable and well-defended. Here is the board memo for a Series B company heading into Q3. The question is not how many agents are running, but which ones could cause real damage and who owns that answer.

TLDR

A June 3 study scored 100 production AI agents and found only 11 percent are both capable and well-defended, while 40 percent pair broad power with thin defense and carry most of the risk. For a Series B board, the Q3 question shifts from how many agents are running to how many could cause real damage, and who owns that answer.

The headline your board saw

On June 3, a security research lab called Adversa AI published a study that is going to end up in a board deck this quarter. It scored 100 real, in-production AI agents on three plain questions: how exposed each agent is, how much damage it could do if something went wrong, and what defenses are genuinely in place. The number that traveled fastest, picked up by Help Net Security and SecurityWeek the same day, was this: only 11 percent of those agents are both capable and well-defended. Everything else trades power for safety in one direction or the other. By the time a statistic like that reaches a Series B board, it has usually lost the method and kept the worry. So here is what it actually says.

11% of 100 production AI agents scored as both capable and well-defended (AIRQ, June 2026)

What it actually means

The study uses a framework called the AI Risk Quadrant. It plots every agent on two axes: how much reach it has, and how much defense sits behind that reach. Cross them and you get four boxes. An agent with broad reach and thin defense is an Exposed Giant. Broad reach with matching defense is a Fortified Leader. Narrow and well-guarded is a Tight Operator. Narrow and lightly guarded is a Humble Provider.

Here is the part that matters for risk planning. Forty percent of the agents landed in the Exposed Giant quadrant, and that 40 percent holds 60 percent of the total risk in the sample. The danger is concentrated, not spread evenly: on those figures, the average Exposed Giant carries about 2.25 times the risk of the average agent outside the quadrant. Separately, ninety-eight percent of the agents carry what the researchers call the lethal trifecta: access to private data, exposure to untrusted content, and the ability to take outbound actions. Any agent with all three can, in the report’s words, be taken over by a single hostile document.

The AI Risk Quadrant, by reach and defense

Quadrant	Profile
Exposed Giant	Broad reach, thin defense (40% of agents, 60% of the risk)
Fortified Leader	Broad reach, matching defense (11%)
Tight Operator	Narrow reach, well-guarded
Humble Provider	Narrow reach, lightly guarded
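For a team that wants to try the sorting on its own fleet, here is a minimal sketch of the two-axis classification and the lethal trifecta check. The 0-100 scale, the cut-off of 50, and the agent names are assumptions for illustration. The report's actual scoring method is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical cut-offs on an assumed 0-100 scale, not the report's scoring.
REACH_THRESHOLD = 50
DEFENSE_THRESHOLD = 50


@dataclass
class Agent:
    name: str
    reach: int                       # how much the agent can touch
    defense: int                     # how much defense sits behind that reach
    private_data: bool = False       # can read private data
    untrusted_content: bool = False  # ingests content from outside
    outbound_actions: bool = False   # can act outside its own boundary


def quadrant(agent: Agent) -> str:
    """Place an agent in one of the four boxes by reach and defense."""
    broad = agent.reach >= REACH_THRESHOLD
    guarded = agent.defense >= DEFENSE_THRESHOLD
    if broad and not guarded:
        return "Exposed Giant"
    if broad and guarded:
        return "Fortified Leader"
    if guarded:
        return "Tight Operator"
    return "Humble Provider"


def has_lethal_trifecta(agent: Agent) -> bool:
    """True only when all three trifecta conditions hold at once."""
    return agent.private_data and agent.untrusted_content and agent.outbound_actions


# Hypothetical fleet, for illustration only.
fleet = [
    Agent("support-bot", reach=80, defense=20,
          private_data=True, untrusted_content=True, outbound_actions=True),
    Agent("billing-agent", reach=75, defense=85,
          private_data=True, outbound_actions=True),
    Agent("doc-summarizer", reach=20, defense=30, untrusted_content=True),
]

for agent in fleet:
    print(f"{agent.name}: {quadrant(agent)}, trifecta={has_lethal_trifecta(agent)}")
```

The useful output is not the label itself. It is the short list of agents that are both Exposed Giants and carry the trifecta.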

"Only 11% qualify for the capable and defended quadrant; tool execution accounts for 76% of blast radius; 37% of the market is audited more than defended; and 83% of claimed AI agent defenses are not publicly verifiable."

SecurityWeek, summarizing the AIRQ Q2 2026 report, June 3, 2026

Two of those numbers deserve a second read. Thirty-seven percent of agents are, as the report puts it, audited more than defended: they keep careful logs of incidents they cannot actually prevent. And 83 percent of the security claims made about these agents cannot be independently verified. That second one is not a knock on any single vendor. It is a statement about the whole market right now.

Three questions your board will ask

Strip away the framework and a Series B board will land on three questions. Good ones.

First: which of our agents would land in Exposed Giants, and who owns that list? This is an inventory question before it is a security question. Most teams cannot name every agent running against their systems, let alone rank them by blast radius. The honest first answer is often a number with a shrug attached. That is fine, as long as the next sentence is a name and a date.
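The inventory itself can start very small. The sketch below assumes a hypothetical record with four fields (agent, blast radius estimate, owner, review date) and shows the two views a board is likely to want: the list ranked by blast radius, and the entries still missing a name or a date. Field names, scores, and owners are illustrative, not taken from the study.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class InventoryEntry:
    agent: str
    blast_radius: int          # hypothetical 0-100 estimate of worst-case damage
    owner: Optional[str]       # accountable person or role, None if unowned
    review_by: Optional[date]  # date the owner has committed to, None if unset


def ranked(entries: List[InventoryEntry]) -> List[InventoryEntry]:
    """Return entries with the highest blast radius first."""
    return sorted(entries, key=lambda e: e.blast_radius, reverse=True)


def unowned(entries: List[InventoryEntry]) -> List[InventoryEntry]:
    """Return entries that are missing either a name or a date."""
    return [e for e in entries if e.owner is None or e.review_by is None]


# Hypothetical inventory, for illustration only.
inventory = [
    InventoryEntry("doc-summarizer", 15, None, None),
    InventoryEntry("support-bot", 90, "head-of-support", date(2026, 6, 18)),
    InventoryEntry("billing-agent", 70, "finance-eng-lead", None),
]

for entry in ranked(inventory):
    status = "needs owner or date" if entry in unowned(inventory) else "owned"
    print(f"{entry.blast_radius:>3}  {entry.agent:<15} {status}")
```

The same four columns work in a spreadsheet. The code only makes the ranking and the gaps explicit.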

Second: if one of them misbehaves, can we actually stop it in time? The study found that a meaningful share of agents complete irreversible actions before monitoring even activates. A kill switch written into a risk register is not a kill switch. It is a sentence. The board wants to know the mechanism has been built and tested, not just documented.
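One way to picture "built and tested" is a stop flag that every action checks before it runs, plus a drill that proves the flag works. The sketch below is an assumption about how such a mechanism might look, not a description of any platform's feature. The names KillSwitch, AgentStopped, run_action, and drill are hypothetical.

```python
import threading


class AgentStopped(Exception):
    """Raised when an action is refused because the switch is tripped."""


class KillSwitch:
    """A thread-safe stop flag shared by everything an agent can do."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def trip(self) -> None:
        self._stopped.set()

    def is_tripped(self) -> bool:
        return self._stopped.is_set()


def run_action(switch: KillSwitch, action):
    """Check the switch before the action runs, not after."""
    if switch.is_tripped():
        raise AgentStopped("kill switch tripped; action refused")
    return action()


def drill() -> list:
    """Trip the switch mid-run and report which actions actually executed."""
    switch = KillSwitch()
    executed = []
    run_action(switch, lambda: executed.append("refund-1"))
    switch.trip()
    try:
        run_action(switch, lambda: executed.append("refund-2"))
    except AgentStopped:
        pass  # expected: the second action must be refused
    return executed


print(drill())
```

The drill is the part that answers the board's question. A switch that has never been tripped in a test is still only a sentence.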

Third: do we believe what we are told about security? With 83 percent of defense claims unverifiable, the calm posture is to treat an unverified claim as a zero and ask for evidence. The report’s own advice is refreshingly practical: stop trying to filter every input, because that fight is unwinnable, and spend the budget on the parts an operator does control, which are egress, identity, and irreversible actions.
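Egress is the easiest of those three to illustrate. A minimal sketch, assuming a hypothetical allowlist of hosts an agent may call: anything not on the list is refused, whatever the input told the agent to do. The host names are placeholders, and a production control would normally sit at the network layer rather than in application code.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts an agent genuinely needs.
ALLOWED_HOSTS = {"api.internal.example", "billing.example.com"}


def egress_allowed(url: str) -> bool:
    """Allow an outbound call only when the host is an exact allowlist match."""
    host = urlparse(url).hostname  # None when the string has no network location
    return host is not None and host in ALLOWED_HOSTS


for url in [
    "https://billing.example.com/invoices",
    "https://billing.example.com.evil.example/invoices",
    "https://evil.example/upload",
]:
    print(url, "->", "allow" if egress_allowed(url) else "deny")
```

Exact matching is deliberate here. A suffix or substring check would let a look-alike host through.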

The board question is no longer how many agents are running. It's how many of them would survive a single hostile document.

Key Insight

Risk across an agent fleet is concentrated, not spread evenly. A small group of high-reach, low-defense agents carries most of the danger, which means a short ranked list does more for safety than a blanket policy applied to everything at once.

The 60-second brief

If there is one minute with the board, say this. A new study scored 100 production AI agents, and only 11 percent were both capable and well-defended. The riskiest 40 percent carry 60 percent of the danger, so the work is to find our version of that 40 percent, name an owner for each, and confirm we can shut any of them down inside our own stated response window. We are not behind because we have exposed agents. Almost everyone does. We would be behind only if we could not name them or stop them. Give me two weeks and I will bring back the list.

What to watch

The encouraging part is that the controls this study points to are boring and buildable: an agent inventory ranked by blast radius, a tested stop mechanism, and a habit of asking for proof instead of promises. None of that requires a new platform. It requires someone to own the list. Watch for the major agent platforms to start shipping these containment controls as defaults over the next two quarters, which will make a ranked inventory easier to keep current. Until then, a spreadsheet is enough. The number that should change before the next board meeting is not the count of agents. It is how many of them the team can name and stop.

Sources

Only 11% of production agents pass the AI agent security bar - Help Net Security, 2026-06-03
AI Risk Quadrant for agents: AIRQ methodology and top 100 agents scored for attack, defense, blast radius - Adversa AI, 2026-06-03
Security of 100 AI Agents Tested and Ranked: What You Need to Know - SecurityWeek, 2026-06-03
