Immediate Response 14: Regional security through NATO partnerships

Host nation Slovenian soldiers navigate the complex digital environment that makes up the vast majority of the simulated training during Immediate Response 14, a NATO partnership-building exercise. (Staff Sgt. Caleb Barrieau / Joint Multinational Readiness Center)

NATO SUMMIT 2024 — Three years ago, when NATO leaders met in Brussels and issued the alliance’s first strategy for artificial intelligence, Russian President Vladimir Putin hadn’t yet tried to blitzkrieg Kyiv and OpenAI’s ChatGPT hadn’t yet taken the internet by storm. Today, in a very different world, as NATO leaders meet in Washington to discuss the threat from Moscow, they’ve also issued a revised AI strategy that makes the new phenomenon of generative AI a top priority.

The full document has not yet been released, and the official summary made available today includes few details. One nuance worth noting, however, is how relatively optimistic the alliance seems to be about genAI.

Yes, there is a weighty paragraph which warns that “disinformation … and AI-enabled information operations might affect the outcome of elections, sow division and confusion across the Alliance, demobilize and demoralize societies and militaries in times of conflict as well as lower trust in institutions and authorities of importance to the Alliance.” This is definitely a more detailed and dire discussion of disinformation than the comparable but shorter, less urgent passage in the 2021 summary.

Yet the 2024 revision puts even more emphasis on urging NATO nations to embrace the upside of generative algorithms. “These technologies can generate complex text, computer code, and realistic images and audio, at near-limitless volume, that are increasingly indistinguishable from human-produced content. It is vital for NATO to use these technologies, where applicable, as soon as possible.”

Just how NATO should employ genAI is left unsaid, but there are plenty of precedents in the US, which is well ahead of Europe on this front. The Defense Department launched its top-level effort to get a handle on genAI, Task Force Lima, almost a year ago. The Intelligence Community started using a secure Microsoft chatbot in May, the Air Force deployed its NIPRGPT tool in June, and the Army aims to roll out its own secure genAI pilot this month.

Large Language Models can usefully replace or at least augment human workers in dealing with masses of mind-numbing detail, said Linden Blue, CEO of General Atomics, builder of the famed Predator and Reaper drones. “There are so many things like, for example, writing […] operating manuals, [which is] extremely laborious and takes hours and hours of humans looking through data,” Blue told a Chamber of Commerce forum for industry Tuesday on the sidelines of the NATO conference.

At the high end, he went on, AI tools are in the works to analyze signals intelligence, radar signals, and other “extremely data-intensive” intel: “I think it will be fielded and coming as a near term thing,” he said.

In the near term, US defense and intelligence agencies have put their emphasis on using Large Language Models to churn through vast amounts of text — contracts, requests for proposals, logistics data, even military intelligence reports — to produce digestible summaries or, in the most ambitious cases, suggest possible plans.


They’ve also discouraged their personnel from using publicly available chatbots for work purposes, citing risks ranging from AI’s deceptively plausible “hallucinations,” to deliberately “poisoned” data, to data-hungry developers sucking up sensitive user data to train new algorithms. It’s this distrust of “wild” chatbots that’s driven the Pentagon to develop its own customized and, hopefully, more secure models in “cages,” as the chief of Task Force Lima, Navy Capt. M. Xavier Lugo, likes to put it.

Another major emphasis for Pentagon AI efforts in general, not just with genAI, has been for careful testing to ensure the algorithms perform as expected. Realistic testing is hard enough for hardware, but it’s even trickier with machine-learning algorithms that modify themselves as they ingest new data, often in ways opaque even to their original programmers.

The new NATO strategy picks up this idea and expands the 2021 version’s brief suggestions about testing into something much more detailed and directive. The official to-do list in the 2024 revision calls on NATO to set up “key elements of an Alliance-wide AI Testing, Evaluation, Verification & Validation (TEV&V) landscape able to support the adoption of responsible AI.”

What’s more, the strategy continues, “These elements will utilize the network of DIANA affiliated Test Centres.” (Emphasis added). By contrast, the 2021 version – which predates the official launch of DIANA – merely states, “Under the forthcoming Defence Innovation Accelerator for the North Atlantic (DIANA), national AI test centres could support NATO’s AI ambition.” (Emphasis added).

“Allied testing facilities need to be able to determine if AI applications can be used safely and in accordance with NATO’s PRUs [Principles of Responsible Use],” the strategy also demands. These six NATO principles, adopted in 2021, strongly reflect the US Defense Department’s five AI ethics principles adopted the year before.

More recently the US Defense and State Departments have led an international push to systematize and institutionalize best practices. That’s a call to which NATO allies have been responsive, as shown by the continued emphasis on “responsible AI” in this revision of the AI strategy.

Breaking Defense’s Carley Welch contributed to this report.