Preface to The Modern Art of War

As large language models improve at mimicking some human traits, we are finding them useful in many aspects of life, including national defense and waging war. A few believe that we can now trust GenAI to autonomously plan and wage war. When I first learned that governments were demanding the right to use GenAI for any lawful purpose, I scratched my head. Do they have access to a GenAI that does not hallucinate? When can I get one? But no, they want to use the same chatbots that have been frustrating the world with hallucinations and bravado, delivered fluently and confidently.

So, I went to my chatbot and asked: What are the top 10 reasons that we should not allow GenAI autonomous control of our weapons systems? Here is what my ChatGPT told me:

  1. Battlefields are chaotic, deceptive, and full of edge cases.

AI can do well in constrained environments, but combat is an open-world mess: bad data, spoofing, civilians mixed with combatants, damaged sensors, changing intent, broken comms, and adversaries actively trying to fool the system. NIST warns that AI risks in real-world operational settings can differ sharply from lab results, and that AI systems can be opaque and hard to measure reliably; the ICRC says autonomous weapons raise serious challenges for compliance with international humanitarian law in armed conflict.

  2. The laws of war require human judgment, not just pattern matching.

Distinction, proportionality, and precautions in attack are not merely “does this look like a tank?” questions. They involve context, intent, expected collateral harm, and legal responsibility. The ICRC is explicit: humans must fulfill those obligations; it is not the weapon system that complies with international humanitarian law, but the humans using it.

  3. When the machine kills the wrong person, accountability turns into fog.

Who is responsible—the commander, operator, model developer, integrator, sensor vendor, or the team that fine-tuned the system six months ago? NIST treats accountability and transparency as core requirements for trustworthy AI, and the ICRC warns that delegating life-and-death decisions to machines diminishes human moral agency. That is not a bug you want in a kill chain.

  4. Adversaries will hack it, spoof it, poison it, and bait it.

NIST’s adversarial machine learning guidance exists for a reason: AI systems can be manipulated through evasion, poisoning, and other attacks. The DoD’s own autonomy directive (DoDD 3000.09, Autonomy in Weapon Systems) requires systems to resist interference by unauthorized parties and treats cybersecurity as a core concern. If your battlefield brain can be tricked, jammed, or steered, you have built a very expensive enemy asset.

  5. Testing will never cover the weirdness of real war.

The DoD directive requires rigorous verification, validation, and realistic testing specifically to reduce failures and unintended engagements. NIST also notes that some AI harms are hard to foresee and that severe or catastrophic risks may require stopping deployment until risks are managed. Translation: even the grown-ups with uniforms and acronyms do not believe “we tested it a lot” is enough.

  6. Machine-speed warfare can accelerate accidental escalation.

The ICRC explicitly flags autonomous weapons as creating dangers of conflict escalation, and UN reporting has tied autonomous weapons to broader concerns about arms races and international security. When systems are allowed to sense, decide, and act at machine speed, humans get less time to notice error, deception, or a spiral into “well, now both sides are auto-firing.” That is how civilization ends up doing something profoundly stupid very quickly.

  7. Commander’s intent is not reducible to a score threshold.

In war, the correct action is often not to fire, because of political signaling, hostage risk, uncertainty, intelligence gaps, treaty constraints, or second-order effects. The DoD requires systems that allow commanders and operators to exercise “appropriate levels of human judgment” over the use of force and to understand the system’s capabilities and limitations under realistic conditions. That requirement exists because “target detected = engage” is toddler logic with missiles.

  8. Full autonomy lowers the barrier to mass violence.

The ICRC has warned that trends including swarm technologies and AI integration into targeting risk eroding existing limits on the use of force. Once targeting and engagement scale like software, atrocities can scale like software too. A bad human with a rifle is terrible; a bad human with a thousand cheap autonomous systems is a different species of nightmare.

  9. There is a real ethical line here, not just a technical one.

The ICRC says autonomous weapons raise fundamental ethical concerns because delegating life-and-death decisions to machines diminishes both the moral agency of users and the human dignity of those targeted. The UN Secretary-General has likewise argued that human control over the use of force is essential. Even if the machine were tactically competent, there is still a profound question of what kinds of decisions a society should refuse to automate.

  10. Even the institutions building military autonomy do not trust full, human-out-of-the-loop autonomy.

The DoD’s current policy requires legal review, senior approval in certain cases, realistic testing, cybersecurity, and “appropriate levels of human judgment.” UN processes in 2025 also continued to emphasize human judgment and control in discussions of autonomous weapons. That is the biggest reality check of all: the people closest to the machinery are not behaving as though “just let GenAI run the battlefield” is remotely mature policy.

The bottom line, according to my ChatGPT: Use AI to assist the battlefield, not to own it. AI can help with logistics, maintenance, sensor triage, simulation, route planning, defensive alerts, and decision support. But removing humans from lethal decisions combines the fragility of software, the opacity of modern AI, the creativity of adversaries, and the moral seriousness of war into one spicy little disaster burrito. The current direction of major policy frameworks is not “trust it completely,” but “manage risk aggressively, preserve human judgment, and stop deployment when risk becomes unacceptable.”

Of course, ChatGPT's response is like all other LLM chatbot answers: it was mathematically generated from its model, with a bit of randomness to keep things interesting (the sketch below shows the idea). So it is possible that GenAI is up to the job. But I have to ask: if you were the commander-in-chief and asked your commanders whether they were up to the job, and they gave you 10 solid reasons they are not yet ready, what would you do?
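For the curious, here is a minimal sketch of that randomness, assuming a toy three-token vocabulary and made-up scores. Real chatbots sample each next token from model-assigned probabilities in roughly this way, typically with a temperature knob controlling how adventurous the draw is:

```python
import math
import random

def sample_next_token(logits, temperature=0.8):
    """Draw one token from a model's raw scores (logits).

    Dividing by temperature and applying softmax turns the scores into
    probabilities; random.choices then samples from that distribution,
    which is why the same prompt can produce different answers.
    """
    scaled = [score / temperature for score in logits.values()]
    max_s = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - max_s) for s in scaled]
    return random.choices(list(logits.keys()), weights=weights, k=1)[0]

# Hypothetical scores a model might assign to candidate next tokens.
logits = {"attack": 2.1, "hold": 1.9, "retreat": 0.3}
print(sample_next_token(logits))  # usually "attack", but not always
```

Lower the temperature toward zero and the top-scoring token nearly always wins; raise it and the less likely tokens get their turn, which is why asking the same question twice can produce two different answers.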

Since it seems we might ignore the advice of the large language models and their creators, I thought we should understand what these models think about war. In other words, what does the current state-of-the-art GenAI think about The Art of War?