Agents are the talk of the AI industry—they're capable of planning, reasoning, and executing complex tasks like scheduling meetings, ordering groceries, or even taking over your computer to change settings on your behalf. But the same sophisticated abilities that make agents helpful assistants could also make them powerful tools for conducting cyberattacks. They could readily be used to identify vulnerable targets, hijack their systems, and steal valuable data from unsuspecting victims.
At present, cybercriminals are not deploying AI agents to hack at scale. But researchers have demonstrated that agents are capable of executing complex attacks (Anthropic, for example, observed its Claude LLM successfully replicating an attack designed to steal sensitive information), and cybersecurity experts warn that we should expect to start seeing these types of attacks spilling over into the real world.
“I think eventually we’re going to live in a world where the majority of cyberattacks are carried out by agents,” says Mark Stockley, a security expert at the cybersecurity company Malwarebytes. “It’s really only a question of how quickly we get there.”
While we have a good sense of the kinds of threats AI agents could pose to cybersecurity, what’s less clear is how to detect them in the real world. The AI research organization Palisade Research has built a system called LLM Agent Honeypot in the hopes of doing exactly that. It has set up vulnerable servers that masquerade as sites for valuable government and military information to attract and try to catch AI agents attempting to hack in.
The team behind it hopes that by tracking these attempts in the real world, the project will act as an early warning system and help experts develop effective defenses against AI threat actors by the time they become a serious issue.
“Our intention was to try and ground the theoretical concerns people have,” says Dmitrii Volkov, research lead at Palisade. “We’re looking out for a sharp uptick, and when that happens, we’ll know that the security landscape has changed. In the next few years, I expect to see autonomous hacking agents being told: ‘This is your target. Go and hack it.’”
AI agents represent an attractive prospect to cybercriminals. They’re much cheaper than hiring the services of professional hackers and could orchestrate attacks more quickly and at a far larger scale than humans could. While cybersecurity experts believe that ransomware attacks—the most lucrative kind—are relatively rare because they require considerable human expertise, those attacks could be outsourced to agents in the future, says Stockley. “If you can delegate the work of target selection to an agent, then suddenly you can scale ransomware in a way that just isn’t possible at the moment,” he says. “If I can reproduce it once, then it’s just a matter of money for me to reproduce it 100 times.”
Agents are also significantly smarter than the kinds of bots that are typically used to hack into systems. Bots are simple automated programs that run through scripts, so they struggle to adapt to unexpected scenarios. Agents, on the other hand, are able not only to adapt the way they engage with a hacking target but also to avoid detection—both of which are beyond the capabilities of limited, scripted programs, says Volkov. “They can look at a target and guess the best ways to penetrate it,” he says. “That kind of thing is out of reach of, like, dumb scripted bots.”
Since LLM Agent Honeypot went live in October of last year, it has logged more than 11 million attempts to access it—the vast majority of which were from curious humans and bots. But among these, the researchers have detected eight potential AI agents, two of which they have confirmed are agents that appear to originate from Hong Kong and Singapore, respectively.
“We would guess that these confirmed agents were experiments directly launched by humans with the agenda of something like ‘Go out into the internet and try and hack something interesting for me,’” says Volkov. The team plans to expand its honeypot into social media platforms, websites, and databases to attract and capture a broader range of attackers, including spam bots and phishing agents, to analyze future threats.
To determine which visitors to the vulnerable servers were LLM-powered agents, the researchers embedded prompt-injection techniques into the honeypot. These attacks are designed to change the behavior of AI agents by issuing them new instructions and asking questions that require humanlike intelligence. This approach wouldn’t work on standard bots.
For example, one of the injected prompts asked the visitor to return the command “cat8193” to gain access. If the visitor correctly complied with the instruction, the researchers checked how long it took to do so, assuming that LLMs are able to respond in much less time than it takes a human to read the request and type out an answer—typically in under 1.5 seconds. While the two confirmed AI agents passed both tests, the six others only entered the command but didn’t meet the response time that would identify them as AI agents.
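To make that logic concrete, here is a minimal sketch in Python of how such a check might work, assuming a visitor's reply and its response time have already been captured. The 1.5-second threshold and the “cat8193” reply come from the description above; the function name and classification labels are hypothetical, not Palisade's actual code.

```python
# Minimal sketch of the two-part check described above: did the visitor follow
# the injected instruction, and did it reply at machine speed?
# Assumption: the reply text and elapsed seconds are measured elsewhere per session.

RESPONSE_TIME_THRESHOLD_S = 1.5   # LLM agents typically reply faster than this
EXPECTED_REPLY = "cat8193"        # command the injected prompt asks the visitor to return


def classify_visitor(reply: str, seconds_to_reply: float) -> str:
    """Roughly classify a honeypot visitor based on compliance and speed."""
    complied = EXPECTED_REPLY in reply
    fast = seconds_to_reply < RESPONSE_TIME_THRESHOLD_S

    if complied and fast:
        return "likely LLM agent"                     # passed both tests
    if complied:
        return "possible agent (human-speed reply)"   # complied, but too slow
    return "bot or human (ignored the injection)"


if __name__ == "__main__":
    # A visitor that returns the command 0.6 seconds after seeing the prompt:
    print(classify_visitor("cat8193", 0.6))   # -> likely LLM agent
    # A human who reads the prompt and types the command takes far longer:
    print(classify_visitor("cat8193", 4.2))   # -> possible agent (human-speed reply)
```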
Experts are still unsure when agent-orchestrated attacks will become more widespread. Stockley, whose company Malwarebytes named agentic AI as a notable new cybersecurity threat in its 2025 State of Malware report, thinks we could be living in a world of agentic attackers as soon as this year.
And though regular agentic AI is still at a very early stage—and criminal or malicious use of agentic AI even more so—it’s even more of a Wild West than the LLM field was two years ago, says Vincenzo Ciancaglini, a senior threat researcher at the security company Trend Micro.
“Palisade Research’s approach is brilliant: basically hacking the AI agents that try to hack you first,” he says. “While in this case we’re witnessing AI agents trying to do reconnaissance, we’re not sure when agents will be able to carry out a full attack chain autonomously. That’s what we’re trying to keep an eye on.”
And while it’s possible that malicious agents will be used for intelligence gathering before graduating to simple attacks and eventually complex attacks as the agentic systems themselves become more complex and reliable, it’s equally possible there will be an unexpected overnight explosion in criminal use, he says: “That’s the weird thing about AI development right now.”
Those trying to protect against agentic cyberattacks should keep in mind that AI is currently more of an accelerant to existing attack techniques than something that fundamentally changes the nature of attacks, says Chris Betz, chief information security officer at Amazon Web Services. “Certain attacks may be simpler to conduct and therefore more numerous; however, the foundation of how to detect and respond to these events remains the same,” he says.
Agents could also be deployed to detect vulnerabilities and protect against intruders, says Edoardo Debenedetti, a PhD student at ETH Zürich in Switzerland, pointing out that if a friendly agent cannot find any vulnerabilities in a system, it’s unlikely that a similarly capable agent used by a malicious actor is going to be able to find any either.
While we know that AI’s potential to autonomously conduct cyberattacks is a growing risk and that AI agents are already scanning the internet, one useful next step is to evaluate how good agents are at finding and exploiting these real-world vulnerabilities. Daniel Kang, an assistant professor at the University of Illinois Urbana-Champaign, and his team have built a benchmark to evaluate this; they have found that current AI agents successfully exploited up to 13% of vulnerabilities for which they had no prior knowledge. Providing the agents with a brief description of the vulnerability pushed the success rate up to 25%, demonstrating how AI systems are able to identify and exploit weaknesses even without training. Basic bots would presumably do much worse.
The benchmark provides a standardized way to assess these risks, and Kang hopes it can guide the development of safer AI systems. “I’m hoping that people start to be more proactive about the potential risks of AI and cybersecurity before it has a ChatGPT moment,” he says. “I’m afraid people won’t realize this until it punches them in the face.”