
What role could AI play in war planning? (Graphic by Breaking Defense, original images via Getty and DVIDS)

WASHINGTON — Chatbots can now invent new recipes (with mixed success), plan vacations, or write a budget-conscious grocery list. So what’s stopping them from summarizing secret intelligence or drafting detailed military operations orders?

Nothing, in theory, said AI experts from the independent Special Competitive Studies Project. The Defense Department should definitely explore those possibilities, SCSP argues, lest China or some other unscrupulous competitor get there first. In practice, however, the project’s analysts emphasized in interviews with Breaking Defense, it’ll take a lot of careful prep work, as laid out in a recently released SCSP study.

And, they warned, you’ll always want at least one well-trained human checking the AI’s plan before you act on it, let alone wire the AI directly to a swarm of lethal drones.

“Right now you can go on ChatGPT and say, you know, ‘Build for me a schedule for my kids’ lunch boxes for like the next five days,’” said Ylber Bajraktari, a veteran Defense Department staffer now serving as a senior advisor to SCSP. With a little more programming, he added, “it could connect to Instacart or whatever [and] can order all of those instantaneously, and that will get shipped to you.”

“The technology is there,” Bajraktari said. “The question is plugging those in.”

Plugging in intelligence databases and military command systems, of course, will take a higher standard of engineering than automating menus.
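For a sense of what that plumbing looks like at the consumer level, here is a minimal sketch of the tool-calling pattern Bajraktari describes. The tool names, the dispatch function, and the simulated model output below are all illustrative stand-ins — not a real grocery (or military) integration. The model is prompted to answer in structured JSON naming a tool and its arguments, and a thin layer of ordinary code executes the call:

```python
import json

# Illustrative tool registry: in a real deployment each entry would wrap an
# actual service API (a grocery-ordering endpoint, a delivery scheduler, etc.).
TOOLS = {
    "order_groceries": lambda items: f"Ordered: {', '.join(items)}",
    "schedule_delivery": lambda date: f"Delivery booked for {date}",
}

def handle_model_output(raw: str) -> str:
    """Dispatch a structured 'tool call' emitted by a language model.

    The model is prompted to answer with JSON like:
      {"tool": "order_groceries", "args": {"items": ["bread", "apples"]}}
    """
    call = json.loads(raw)
    tool = TOOLS[call["tool"]]   # look up the requested capability
    return tool(**call["args"])  # execute it with the model's arguments

# Simulated model output -- a production system would get this string
# back from an LLM API rather than hard-coding it.
print(handle_model_output(
    '{"tool": "order_groceries", "args": {"items": ["bread", "apples", "milk"]}}'
))
```

The model never touches the outside world directly; it only emits a request, and human-written code decides what actually gets executed — which is exactly where the safeguards discussed below attach.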

“Specific military tasks [require] a much greater degree of specificity than, you know, creating a great Chicken Parmesan,” said Justin Lynch, a former Army officer and Hill staffer who heads SCSP’s defense studies. “[It] would require better tuning for the LLM [Large Language Model]. It needs to be something that’s more specifically created” for military purposes, he explained: trained on such data as military intelligence, official doctrine, and other verified sources, rather than scraping text from Reddit posts and other public websites as OpenAI did for ChatGPT.

“That is an active area for experimentation right now,” he told Breaking Defense.
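As a rough sketch of what that “better tuning” experimentation can look like, the snippet below fine-tunes a small open model on a curated text file using the open-source Hugging Face libraries. Everything here is an assumption for illustration — the model choice, the hypothetical doctrine_corpus.txt file standing in for vetted sources, and the training settings — not anything SCSP or the Pentagon has specified:

```python
# A minimal domain fine-tuning sketch with Hugging Face transformers.
# "doctrine_corpus.txt" is a hypothetical curated file standing in for
# verified sources (doctrine, vetted reports) -- not scraped web text.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Load the curated corpus and convert text to token IDs.
dataset = load_dataset("text", data_files={"train": "doctrine_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    # Causal LM objective: predict the next token (no masked-LM masking).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```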

If the generative AI could generate a complete military operational plan, could it then also execute that plan, perhaps by transmitting orders directly to a swarm of combat drones? That’s technologically conceivable but a terrible idea, Lynch said, and it’s not at all what SCSP is recommending.

“Everything we’re talking about is very firmly within 3000.09 [PDF],” he emphasized, citing the recently updated DoD directive that mandates (with major loopholes) human control of lethal force.

“We strictly limited ourselves to what I would characterize as sort of a ‘cognitive copilot,’” Bajraktari said. “It’s all about like enhancing human decision-making, planning, information intake, information processing, rather than execution of the missions… The technology is not there [for] automated execution.”

You obviously don’t want the Skynet problem from the Terminator movies, where a rogue military AI decides humanity itself is the enemy and launches nuclear missiles. More subtly, Lynch said, you don’t want “the Thomas Becket problem,” where an agent (human or AI) misunderstands instructions and does something counterproductive while sincerely trying to help.

That’s a reference to the 1170 murder of Saint Thomas Becket, Archbishop of Canterbury, in his cathedral by knights loyal to King Henry II. The king, fed up with years of political wrangling between church and state, is said to have exclaimed “Who will rid me of this troublesome priest?” — only to have some well-armed henchmen take him a little too literally. Now remember AI can be more literal-minded than any human and imagine someone making a poorly worded request to “eliminate” or “remove” a problem.

You always want a knowledgeable human double-checking the AI-generated plan to catch any errors, the SCSP experts emphasized, just as any human writer needs an editor. [Editor’s Note: Correct.] While that slows the pace somewhat compared to pure automation, they acknowledged, it takes a lot less time for a human to check an existing draft than to generate it from scratch, so it should still save significant time over today’s manual and labor-intensive staff processes.

In outline, the SCSP experts envision three tiers of military and intelligence uses for generative AI, from the simplest near-term applications to the most ambitious ones requiring years of careful prep work and experimentation:

  1. Content generation: SCSP starts with the kind of work that chatbots already do, only custom-tailored for tasks like summarizing secret intelligence reports or military logistics inventories. The more accurate and specific the datasets the AI is trained on, the more accurate and actionable its answers, in contrast to the “jack of all trades, master of none” approach adopted by ChatGPT and other consumer AIs. Most military users would want especially strong safeguards in place against AI “hallucinations” that output falsehoods; but, CIA alumnus and SCSP intelligence expert Chip Usher said, the Agency could use AI’s predilection for fiction to create detailed fake backgrounds for its agents, potentially including years of convincing social-media posts.
  2. Automated orchestration: At the next level, instead of just drawing on a single dataset, however large, the AI could “call up information from a large set of databases or tools,” Lynch explained. “I’ve spent a lot of time working in Tactical Operations Centers, and usually you have access to hundreds of databases and tools, and time to become familiar with a much smaller number of them…. A large language model can do automated orchestration of [those] and then pull up the most relevant data sets, the most relevant analytic tools for you.” (A toy sketch of this pattern appears just after this list.)
  3. Agentic AI: This is where the AI might go from recommending tasks to executing them, albeit under strict controls. The SCSP experts envision the generative AI tasking subordinate AIs to gather information to fill in gaps in its data, for example, or to organize supply convoys — not to launch autonomous lethal swarms.
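Here is that toy sketch of the orchestration tier, under loose assumptions: the databases below are invented, and the call_llm function is a crude keyword-matching stand-in for a real language-model API. The point is the pattern — the model picks the relevant source out of many, and the surrounding code does the actual retrieval:

```python
# Toy "automated orchestration": a model chooses which of many data
# sources answers a request, then ordinary code queries that source.
# Database contents and the call_llm stub are illustrative, not real systems.

DATABASES = {
    "logistics": {"fuel_on_hand": "12,000 gal", "trucks_ready": 38},
    "weather": {"forecast": "rain, visibility 2 km"},
    "personnel": {"present_for_duty": 412},
}

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call; here, crude keyword matching."""
    for name in DATABASES:
        if name in prompt.lower():
            return name
    return "logistics"  # fallback default

def orchestrate(request: str) -> dict:
    # Ask the model which source is most relevant, then pull from it.
    prompt = f"Available databases: {list(DATABASES)}. Which answers: '{request}'?"
    source = call_llm(prompt)
    return {source: DATABASES[source]}

print(orchestrate("How many trucks does logistics have ready?"))
```

In a real Tactical Operations Center deployment, the lookup table would span the hundreds of databases and tools Lynch mentions, and call_llm would be an actual tuned model rather than string matching — but the division of labor is the same.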

The higher levels in this framework build on generative AI’s ability to take a broad, plain-English request — from “plan my kids’ meals for next week” to “come up with a logistics plan for my battalion” — and turn it into a detailed checklist of specific tasks. The AI doesn’t actually understand the tasks, Lynch emphasized, but it has seen enough human-written checklists to generate its own, similar ones when asked.

Likewise, the AI doesn’t really “understand” what its human masters are asking it to do. It has no awareness of external reality, only of statistical correlations among words and parts of words (“tokens” in the jargon). But LLMs have gotten really good at turning human-language input into machine-readable 1s and 0s and then turning their output into intelligible words. That allows them to act as intermediaries between humans — even untrained humans — and ever-more-sophisticated algorithms.
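To make the “tokens” point concrete, here is what a tokenizer actually hands the model, using OpenAI’s open-source tiktoken library (the example sentence is ours, not from the SCSP report):

```python
# A language model never sees words, only integer IDs for sub-word "tokens".
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5/GPT-4-era models
ids = enc.encode("Come up with a logistics plan for my battalion")
print(ids)                             # a list of integers, one per token
print([enc.decode([i]) for i in ids])  # the text fragment behind each ID
```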

“What’s different with ChatGPT, suddenly you have this interface, [where] using the English language, you can ask it questions,” the former director of the Pentagon’s Joint AI Center, retired three-star general Michael Groen, told Breaking Defense this past spring. “It democratizes AI [for] large communities of people.”

“Many of the tools we are seeing emerge have natural language interfaces that dramatically increase accessibility and lower barriers to operating in certain sectors without years of intense training or education,” SCSP’s Bajraktari said. The downside is that it’s easier than ever for a few ill-intentioned individuals to mass-produce misinformation, malware, or even lethal weapons like homemade bombs or poison gas. The upside is a tremendous potential for people without intensive technical training in AI to take advantage of powerful new AI technologies.

In fact, between the rise of AI and the rise of China, the SCSP report argues, humanity is facing a combination of technological change and geopolitical instability with an unnerving “resemblance to the era before World War I.” That’s less than reassuring, but with some expert (human and maybe non-human) advice, perhaps the world can navigate the turmoil better this time.