Offensive AI

AI-Assisted Malware

Using AI to generate, mutate, and obfuscate malicious code faster than traditional AV signatures can track.

AI-assisted malware is not, for the most part, malware written entirely by an AI from scratch. It is malware whose authors used a language model to accelerate or transform parts of the development process: generating obfuscated variants of existing code, rewriting payloads to evade detection, drafting evasion techniques, or, in the most novel cases, embedding live calls to an LLM inside the malware itself so that the malicious behavior is generated at runtime rather than shipped in the binary. Each of these uses changes a different part of the defender's job, and understanding which use is which matters far more than panicking about "AI malware" as a single category.

What you'll learn

Key takeaways from this topic.
  • Distinguish between AI used to develop malware and AI embedded inside malware at runtime.
  • Explain how LLM-generated polymorphism degrades signature-based detection.
  • Recognize the actual threat level of current AI-assisted malware versus the marketing-driven narrative.

At a glance

Fast mental model before you dive in.
Core concepts
  • LLM code generation
  • Polymorphic payloads
  • Runtime LLM calls
Techniques
  • Source-level mutation
  • Living-off-the-land execution
  • Underground LLM platforms
Defenses
  • Behavioral detection
  • API call monitoring
  • Egress filtering

Core idea

There are three distinct ways AI shows up in modern malware, and they have very different security implications. The first is the most common: an attacker uses a general-purpose LLM, either a mainstream model accessed through jailbreaks or a purpose-built underground tool like WormGPT, to write or refactor parts of the malware code. The AI is just an unusually fast assistant. The resulting binary contains no AI component and behaves like any other piece of malware once it lands on a system.

The second use is at build time: the attacker uses an LLM to systematically generate many variants of the same malicious code, each functionally identical but textually different. This is automated polymorphism. The output is still a static binary, but the campaign produces a stream of variants fast enough to outpace signature-based detection.
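The effect on hash-based signatures can be illustrated with a deliberately benign stand-in. The sketch below is not taken from any sample or framework mentioned here: the two "variants" are harmless summing functions, and the names `VARIANT_A`, `VARIANT_B`, and `signature_match` are invented for illustration. It only shows that two textually different sources can behave identically while sharing no hash.

```python
import hashlib

# Two benign, functionally equivalent snippets standing in for "variants".
VARIANT_A = "def total(xs):\n    return sum(xs)\n"
VARIANT_B = (
    "def total(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)


def sha256_hex(source: str) -> str:
    """Return the SHA-256 digest of the source text as a hex string."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# A "signature database" that has only ever seen variant A.
KNOWN_SIGNATURES = {sha256_hex(VARIANT_A)}


def signature_match(source: str) -> bool:
    """Exact-hash lookup, the simplest form of signature detection."""
    return sha256_hex(source) in KNOWN_SIGNATURES


def run_total(source: str, data: list) -> int:
    """Execute one of the fixed literals above and call its total()."""
    namespace: dict = {}
    exec(source, namespace)  # only ever called on the constants defined here
    return namespace["total"](data)
```

Both variants return the same result for the same input, yet only the first matches the stored hash. Production signatures are usually more robust than exact hashes (byte patterns, fuzzy hashes), but the same pressure applies once the rewrite changes structure rather than whitespace.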

The third use is the most novel and the most discussed: the malware itself calls out to an LLM at runtime to generate its own behavior on the fly. Samples like LameHug and MalTerminal, documented in 2025, query a hosted LLM during execution to produce the actual Windows commands they run for reconnaissance and data theft. The malicious logic is not in the binary; it is decided after the malware is already running. This is the genuinely new category, and it is also the rarest in the wild.

Understanding which category a given threat falls into determines how a defender should respond. AI-developed malware is detected the same way any malware is detected. AI-mutated variants require behavioral, not signature-based, detection. Runtime-LLM malware requires monitoring outbound network traffic to LLM endpoints and treating those calls as part of the threat surface.

How it works

For AI-developed malware, the workflow is straightforward. The attacker prompts an LLM to produce specific malicious functionality: a credential stealer for a particular browser, a keylogger that hides in a specific process, an injector that bypasses a particular EDR product. With safety guardrails removed (either through jailbreaks of mainstream models, or through purpose-built tools like WormGPT and FraudGPT that have no guardrails to begin with), the LLM produces working code in seconds. The attacker then compiles and tests the result, typically with significant manual debugging.

For AI-mutated polymorphism, the attacker uses an LLM to transform existing source code into many functionally equivalent variants. A 2025 framework called LLMalMorph demonstrated function-level rewrites of malware source code, with up to 31% reduction in VirusTotal detection across the resulting variants. Some PoCs, such as CyberArk's polymorphic ChatGPT-based malware and the BlackMamba keylogger, demonstrated that even less-skilled attackers can use this approach to evade signature-based AV and traditional EDR.

For runtime-LLM malware, the architecture inverts the usual model. The binary itself is small and innocuous-looking, often disguised as a benign utility. At execution time, it makes HTTP requests to a hosted LLM (Hugging Face, OpenAI, or others), sends a prompt describing what it wants to accomplish on the target system, and receives generated commands in response. LameHug, identified by Ukraine's CERT-UA in July 2025 and attributed with moderate confidence to APT28 (Fancy Bear, Russia's GRU Unit 26165), used this technique against Ukrainian government targets, querying Alibaba's Qwen 2.5-Coder model through Hugging Face's API to obtain Windows reconnaissance commands at runtime and exfiltrate documents over SSH. PromptLock demonstrated the same pattern with a locally hosted LLM for offline operation.

Real-world impact

The honest assessment is that AI-assisted malware is significant but not yet revolutionary. Even in phishing, the offensive use where AI is most established, measured adoption is modest: Hoxhunt's analysis of 386,000 malicious emails in 2025 found that only between 0.7% and 4.7% were AI-generated end-to-end. Social engineering, not malware development, remains the dominant use of AI in offensive operations. AI-assisted malware is real, growing, and worth defending against, but the marketing claim that "AI malware has made all defenses obsolete" is not supported by current incident data.

What is real is the proof-of-concept escalation. CyberArk demonstrated polymorphic ChatGPT-generated malware in 2023. SentinelOne analyzed BlackMamba, a PoC keylogger that fetches polymorphic payloads from OpenAI at runtime. Check Point Research documented a 2025 sample that included evasion logic specifically designed to defeat LLM-based reverse engineering tools, suggesting attackers are already adapting to defenders' use of AI. In late 2025, Google's Threat Intelligence Group documented PromptFlux, an experimental dropper that asks an LLM to rewrite its own code to produce new polymorphic variants.

The threat actor ecosystem is also concrete. WormGPT, built on GPT-J and trained on malware-related data, has been advertised on dark-web forums since 2023 specifically for malicious code generation. FraudGPT, KawaiiGPT, DarkBERT, and others have followed. These are not jailbroken legitimate models; they are dedicated platforms with no safety guardrails, marketed and sold as criminal infrastructure. The barrier to entry for low-skill attackers is therefore lower than at any previous point.

Warning signs

Patterns worth investigating further.
  • A process makes outbound HTTPS requests to known LLM API endpoints (OpenAI, Hugging Face, Anthropic) without a legitimate business reason.
  • New binaries appear that are functionally identical to known malware families but produce different file hashes on every sample.
  • A loader or dropper retrieves code at runtime and executes it in memory using exec, eval, Invoke-Expression, or equivalent, with no on-disk artifact for the second-stage payload.
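The third warning sign can be approximated with a simple text heuristic over script content, for example script-block logs or captured script files. The sketch below is illustrative only: the pattern lists are assumptions chosen for the example, not a vetted detection rule, and the function name `fetch_and_execute` is hypothetical. It flags text that contains both a download primitive and a dynamic-execution primitive.

```python
import re

# Illustrative patterns only; a real rule needs tuning against local telemetry.
DOWNLOAD = re.compile(
    r"Invoke-WebRequest|Invoke-RestMethod|DownloadString|urlopen|requests\.(get|post)",
    re.IGNORECASE,
)
DYNAMIC_EXEC = re.compile(
    r"Invoke-Expression|\biex\b|\bexec\s*\(|\beval\s*\(",
    re.IGNORECASE,
)


def fetch_and_execute(script_text: str) -> bool:
    """True if the text both retrieves remote content and evaluates code dynamically."""
    return bool(DOWNLOAD.search(script_text)) and bool(DYNAMIC_EXEC.search(script_text))
```

A match is a reason to look closer, not a verdict. Legitimate installers and administration scripts combine these primitives too, and obfuscated scripts will not match plain-text patterns at all.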

DEEP DIVE

What AI actually changes for the attacker

The most useful question is not "can AI write malware" but "what part of the attacker's job does AI make cheaper." For an experienced offensive developer, AI provides a productivity boost similar to what it provides any other software developer: faster prototyping, automated rewriting of code in different languages, and useful summaries of unfamiliar APIs or system internals. The malware is not fundamentally more dangerous than what an experienced developer could write without AI; it is just produced faster.

For an inexperienced attacker, AI's contribution is different and arguably more concerning. It lowers the skill floor required to produce working malicious code. Someone who could not write a keylogger from scratch can now describe what they want and receive working code, with the AI handling the technical details they do not understand. This expands the population of people capable of running offensive operations, even if the resulting malware is less sophisticated than what a skilled developer would produce.

For both groups, the most consistent benefit is variation. AI is excellent at producing many functionally equivalent versions of the same code, which is exactly the property that defeats signature-based detection. This is why polymorphism, an old technique, has become much more practical in the LLM era. The bottleneck used to be writing the polymorphism engine, and AI removes that bottleneck.

Polymorphism in practice

Traditional polymorphic malware used hand-coded mutation engines to rewrite its own code on each infection. Writing these engines was hard, and the resulting variations were typically limited to register reassignment, instruction reordering, and adding junk code. The variations were detectable in their own right because the mutation engine itself produced recognizable patterns.

LLM-based polymorphism works at a higher level of abstraction. Instead of rewriting machine code, the LLM rewrites source code or scripts. A PowerShell keylogger can be expressed in fundamentally different ways, using different APIs, different control flow, and different variable names, while preserving the same behavior. Because the rewriter understands the code semantically rather than syntactically, the variations are far more diverse than what a traditional polymorphism engine produces.

The LLMalMorph research framework demonstrated this concretely. It applied function-level transformations including obfuscation, refactoring, and API substitution, achieving detection reductions of up to 31% on VirusTotal while preserving malicious functionality. This is meaningful in operational terms because most defensive products still rely on at least some signature-based detection, and the cost of evading those signatures has dropped from "weeks of engineering work" to "an API call."

The defender's response is to move detection earlier and later. Earlier means catching the attacker's infrastructure, command-and-control patterns, delivery domains, and execution chains, rather than the binary itself. Later means behavioral detection at runtime: a keylogger still hooks the keyboard, an info-stealer still reads the browser credential store, regardless of how the source code is written. Behavior-based detection is much harder to mutate around because the mutation must change what the program does, not just how it is written.
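A behavioral rule of the "later" kind can be sketched as a check over a stream of process events rather than over file contents. Everything in the sketch is an assumption made for illustration: the `Event` shape, the event kinds `file_read` and `net_connect`, and the two file-name markers (Chrome's `Login Data` and Firefox's `logins.json`) standing in for "the browser credential store". It flags any process that reads a credential store and afterwards opens an outbound connection, regardless of what the binary's hash is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    pid: int
    kind: str    # hypothetical event kinds: "file_read" or "net_connect"
    target: str  # file path for file_read, remote host for net_connect


# Illustrative file-name fragments for browser credential stores.
CREDENTIAL_MARKERS = ("Login Data", "logins.json")


def credential_theft_suspects(events: list) -> set:
    """Return PIDs that read a credential store and later connected outbound."""
    touched_store = set()
    suspects = set()
    for ev in events:  # events are assumed to be in chronological order
        if ev.kind == "file_read" and any(m in ev.target for m in CREDENTIAL_MARKERS):
            touched_store.add(ev.pid)
        elif ev.kind == "net_connect" and ev.pid in touched_store:
            suspects.add(ev.pid)
    return suspects
```

The rule keys on what the process does, so rewriting the source changes nothing unless the rewrite also changes the behavior. A usable version would need to exclude the browser itself, which legitimately reads its own store and connects to the network.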

Runtime LLM malware

The genuinely novel category is malware that uses an LLM as part of its runtime behavior. The architecture has two main variants. In the more common form, the malware ships with a static prompt template and calls a hosted LLM (OpenAI, Hugging Face, Anthropic) at execution time to generate specific commands or code. In the less common but more dangerous form, the malware bundles a small local LLM and runs inference on the victim's machine, eliminating the network dependency that makes the hosted variant detectable.

LameHug, documented by Ukraine's CERT-UA in July 2025 and attributed with moderate confidence to APT28 (Fancy Bear), is the canonical hosted example. It disguises itself as an image-generation utility while a hidden thread queries the Qwen 2.5-Coder-32B-Instruct model through Hugging Face's API to obtain Windows reconnaissance commands. The malware does not contain the commands themselves. They are generated on each run, which means each execution can produce different command sequences targeting different files and exfiltration paths. The static binary therefore carries less detectable malicious content than a traditional info-stealer, and a single piece of malware can adapt to different victim environments without any code update from its operators.

PromptLock and similar samples push further by integrating local LLMs into the malware itself. The advantages are operational: no outbound network call to a third-party API, no dependency on internet connectivity for the malicious decision-making, and no obvious indicator like an OpenAI API endpoint in network logs. The cost is a much larger payload, since a usable local LLM is at minimum hundreds of megabytes. The trade-off explains why hosted variants are far more common in real attacks.

For defenders, the runtime-LLM pattern creates a new detection opportunity: outbound traffic to LLM API endpoints from non-developer machines, especially from processes that have no legitimate reason to make such calls, is a strong signal. Many environments now treat unexpected traffic to known LLM hosts the same way they treat unexpected traffic to known C2 infrastructure.
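As a rough sketch of that signal, the check below filters connection records for LLM API destinations and suppresses the ones that have been explicitly approved. The host list is a small illustrative sample rather than a complete inventory, and the `Connection` record, the `APPROVED` pairs, and the machine and process names are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative, incomplete list of hosted LLM API hostnames.
LLM_API_HOSTS = {
    "api.openai.com",
    "api-inference.huggingface.co",
    "api.anthropic.com",
}

# Hypothetical allow-list of (machine, process) pairs with a documented reason.
APPROVED = {("build-agent-01", "python.exe")}


@dataclass(frozen=True)
class Connection:
    machine: str
    process: str
    dest_host: str


def is_llm_host(dest_host: str) -> bool:
    """Match a listed host exactly or as a parent domain of the destination."""
    dest = dest_host.lower().rstrip(".")
    return any(dest == h or dest.endswith("." + h) for h in LLM_API_HOSTS)


def unexpected_llm_calls(connections: list) -> list:
    """Return connections to LLM hosts from machine/process pairs not approved."""
    return [
        c for c in connections
        if is_llm_host(c.dest_host) and (c.machine, c.process) not in APPROVED
    ]
```

In practice the input would come from proxy, DNS, or EDR network telemetry, and the approved set would be maintained like any other egress exception list.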

Underground LLMs and the criminal supply chain

WormGPT was the first widely documented underground LLM, advertised on hacker forums in mid-2023. Built on the open-source GPT-J model and fine-tuned on malware-related data, it offered LLM capabilities without safety guardrails, marketed specifically for business email compromise, malware generation, and other criminal use cases. FraudGPT followed shortly after, advertised as an all-in-one criminal LLM platform. KawaiiGPT, DarkBERT, and others have appeared since.

These are not jailbroken versions of mainstream models. They are independent products built by criminal developers, sold by subscription, and updated regularly. Their existence matters for two reasons. First, they remove a significant friction point for low-skill attackers, the constant need to find new jailbreaks for mainstream models, by providing a stable alternative that simply will not refuse a request. Second, they signal that AI tooling for cybercrime has become a recognized market, with the same supply-chain dynamics, pricing, and competition as any other category of criminal infrastructure.

Threat intelligence from Mandiant and others has documented the use of these tools by nation-state actors as well as commodity criminal groups. North Korea's APT43, in particular, purchased access to WormGPT in 2023. This is significant because it confirms that AI-assisted offensive operations are no longer purely a low-tier phenomenon. Advanced groups have adopted these tools as part of their standard toolkit, even though they have the in-house capability to develop without them.

What the defender should actually do

The defensive baseline against AI-assisted malware is not radically different from the defensive baseline against traditional malware, but the relative weighting of controls has shifted. Signature-based detection, while still useful for known threats, is less reliable as a primary defense than it was five years ago. Behavioral detection, the kind that catches malicious actions regardless of how the code is written, has become more important.

Outbound traffic monitoring is now a frontline control. Connections to known LLM API endpoints from production systems, build agents, or office endpoints should be inventoried. Most are legitimate, but they should be known and explained. Unexpected ones, particularly from processes that have no business reason to make such calls, are a high-value signal.

Application allow-listing and constrained execution environments matter more than ever. AI-generated malware still has to execute somewhere. If a system only allows known binaries to run, the polymorphism advantage disappears, because each variant is still an unknown binary. PowerShell Constrained Language Mode, AppLocker, Windows Defender Application Control, and equivalent controls on other platforms remain among the highest-leverage defenses available.
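The asymmetry between deny-listing and allow-listing can be shown in a few lines. This is a toy model, not how AppLocker or Windows Defender Application Control are implemented: it uses hash rules only, while those products also support other rule types such as publisher and path rules. The byte strings are placeholders and the function names are invented for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a binary's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def denylist_blocks(binary: bytes, known_bad: set) -> bool:
    """Deny-list model: block only what has been seen and flagged before."""
    return sha256_hex(binary) in known_bad


def allowlist_blocks(binary: bytes, approved: set) -> bool:
    """Allow-list model: block everything that has not been approved."""
    return sha256_hex(binary) not in approved


# Placeholder byte strings standing in for real files.
APPROVED_TOOL = b"approved-tool"
KNOWN_SAMPLE = b"sample-variant-1"
NEW_VARIANT = b"sample-variant-2"  # same behavior in this toy model, new bytes

known_bad = {sha256_hex(KNOWN_SAMPLE)}
approved = {sha256_hex(APPROVED_TOOL)}
```

Under the deny-list model the new variant runs because its hash has never been seen. Under the allow-list model it is blocked for the same reason, which is why variation stops helping the attacker.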

Finally, threat hunters should treat the LLM call itself as part of the malware. Sample triage should include searching for prompt strings, LLM API client libraries, and characteristic request patterns to LLM endpoints. As more samples in this category appear, the indicators of compromise will shift from "this hash is malicious" to "this prompt pattern is malicious," which is a fundamentally different detection paradigm that defenders are still building tooling for.
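A first-pass triage step along these lines might scan a sample's raw bytes for LLM-related strings. The sketch below is a minimal illustration under stated assumptions: the three indicator patterns are examples chosen for this sketch, not published indicators of compromise, and the prompt-phrase pattern in particular is a guess at what a role-setting prompt might look like.

```python
import re

# Illustrative indicators only; each name and pattern is an assumption.
INDICATORS = {
    "llm_api_host": re.compile(
        rb"api\.openai\.com|api-inference\.huggingface\.co|api\.anthropic\.com",
        re.IGNORECASE,
    ),
    "chat_endpoint": re.compile(rb"/v1/chat/completions", re.IGNORECASE),
    "prompt_phrase": re.compile(
        rb"you are an? [a-z ]{0,40}(assistant|expert|administrator)",
        re.IGNORECASE,
    ),
}


def triage(sample: bytes) -> list:
    """Return the sorted names of all indicators found in the sample bytes."""
    return sorted(name for name, pattern in INDICATORS.items() if pattern.search(sample))
```

This only sees plain ASCII strings. Samples that store strings as UTF-16, encrypt their prompts, or fetch them at runtime would need decoding or dynamic analysis first, so an empty result does not clear a sample.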