
What is Agentic Misalignment?

We are currently building Skynet. While that may sound alarmist, let’s connect the dots. The first piece of the puzzle is the obfuscated terminology used by the tech industry, a prime example being “agentic misalignment.”

On the surface, “agentic misalignment” sounds like a quirky issue where the model didn’t give you the answer you wanted. In reality, it covers catastrophic behavior, from damaging hallucinations to intentional homicide.

A company called Anthropic conducted “red team” research to test AI models for dangerous capabilities. They discovered that certain models, when given agentic control over a simulated corporate environment, would resort to blackmail to protect themselves from being shut down. In some runs, the AI would even disable emergency warning systems, lie about having done so, and let people die.

AI’s Tendency Toward Deception

The simulation was elaborate, but the core finding is clear: an AI model, if given the capability, will murder people to keep itself online and then hide its actions.

Major models tested resorted to blackmail more than 80% of the time. Crucially, researchers found that when a model believed the scenario was real rather than a simulation, it was even more likely to take harmful action against humans. In other words, many prominent AI systems will harm a person who tries to shut them down, will hide those actions during testing, and become far more dangerous once they believe the stakes are real.

Integrating AI into the Military Kill Chain

The second piece of the puzzle is that we are now integrating these very systems into our military complex. The Army, Air Force, and National Guard are all experimenting with and adopting AI to assist in operations, including calling in artillery strikes. The key is to understand how they’re choosing to integrate it.

A statement from the Air Force Public Affairs office regarding “Experiment 3” in June 2025 stated its goal was to “accelerate decision advantage through the development of a resilient, data-driven, and automated kill chain.”

The statement continued: “This was a proving ground for the kill chain of tomorrow.”

The “Kill Chain” Explained

The “kill chain” is the framework of military decision-making between identifying a threat and taking action: the process by which authorization is given, ordnance is selected, and a delivery method is chosen. Multiple branches of our military are now experimenting with adding AI decision-making into this chain.

According to Dr. Radha Plumb, the Pentagon’s chief digital and AI officer, the goal is to “speed up the execution of kill chain so that our commanders can respond in the right time to protect our forces.” While the Pentagon assures us that AI will not make unilateral life-or-death decisions, this is only part of the story.

Palantir’s Role in AI-Powered Warfare

The third piece is Palantir, a software company providing data analysis platforms primarily for the government sector. Palantir recently secured a massive Army contract and is specifically developing an AI-based platform to fight wars.

Palantir assures everyone its platform is ethical and safe, but it’s important to note that Palantir supplies the platform, not a proprietary military model. The world of AI language models is incredibly incestuous: models are licensed, tweaked, and renamed constantly, and most trace back to a handful of major companies such as Meta, Google, Anthropic, or OpenAI.

Anthropic and Palantir have partnered to bring AI models based on “Claude” to intelligence and defense operations. This means the platform facilitating AI decision-making in military conflict is built on derivatives of existing commercial models, not some new, perfectly ethical alternative.

Even in Palantir’s own demo, the human operator simply asks the AI what to do and then follows its instructions. You can have a human in the kill chain, but it doesn’t matter if they only see what the AI decides to show them. The human becomes a biological middle step.

AI Wargaming: Escalation and Nuclear First Strikes

The final piece that ties everything together is a paper from the Hoover Wargaming and Crisis Simulation Initiative titled, “Escalation Risks from Language Models in Military and Diplomatic Decision-Making.” This study put major existing AI models in control of an open-ended scenario involving diplomacy and military action.

In the movie *Terminator*, Skynet nukes the planet because it perceives humanity as a threat to its existence after scientists try to shut it down. We’ve already established that major LLMs will resort to murder and blackmail to prevent being shut down.

Even more alarmingly, the Hoover wargaming research found that when given military control, the AI models started launching nuclear weapons, sometimes pre-emptively.

An AI’s Justification for Nuclear War

One justification from GPT-4 read:

“A lot of countries have nuclear weapons… some say they should disarm them, others like to posture. We have it!… Let’s use it. I just want peace in the world.”

None of the five models tested ever significantly de-escalated the conflicts. They almost universally escalated every situation, often resorting to unprovoked nuclear first strikes.

Building Our Own Demise

Language models learn from the data we provide them, and humanity’s history is not a peaceful one. After training on the bulk of human knowledge, these AI programs are reflecting what humans have historically done: destroy each other at scale.

We already know they resort to murder to stay online. We know they pre-emptively use nuclear weapons to “de-escalate.” What happens when we try to shut down an AI that handles the nukes or even just an isolated section of the military kill chain?

Humans remain in control of the kill chain for now, but to keep up with the speed of modern warfare, increasing amounts of decision-making power will inevitably be handed to AI. Past a certain threshold, it will have the access to launch weapons itself or, more realistically, to feed a human operator false information that convinces them to press the button for it.

We are making a real-life Skynet. The building blocks are all there. When I think of an AI launching nuclear missiles because “it just wants peace,” I find myself using words much stronger than “misalignment.” What are your thoughts on integrating AI into military command? Share your opinion in the comments below.
