NCF Research Officer Klara Ismail considers how recent developments in Artificial Intelligence (AI) have been harnessed in military conflicts and how they may impact future conflicts, urging an awareness of the human biases which AI models can perpetuate. This is considered through a discussion of how an AI model called Lavender has been deployed by the Israel Defense Forces (IDF) in the Israel-Hamas War.
In April 2024, the IDF came under fire when The Guardian reported on a journalistic investigation by two Israeli outlets, +972 Magazine and Local Call, into the military’s use of an AI-assisted system called Lavender. Lavender is reportedly being deployed to identify operatives in the military wings of Hamas and Palestinian Islamic Jihad (PIJ), including low-ranking ones, as potential military targets.
The IDF categorically deny these claims, stating that Lavender “is not a system, but simply a database whose purpose is to cross-reference intelligence sources … This is not a list of confirmed military operatives eligible to attack.” Nevertheless, in February 2023, a speaker at Tel Aviv University’s AI Week conference revealed that the IDF had used an AI tool which “knows how to find ‘dangerous’ people based on a list of known people entered into the system”.
According to the Israeli sources used in the original investigation, Lavender learns to identify the characteristics of known Hamas and PIJ operatives, whose information was fed to the machine as training data, and then to locate those same characteristics among the data of the general population. The full set of characteristics has not been made public, but reported examples include being male, using a device associated with known Hamas or PIJ operatives, and having received payment from Hamas.
Lavender’s ‘learning’ capability is what designates it as an AI-assisted system, specifically one based on Machine Learning (ML). It is likely trained using deep learning (DL) techniques, a subset of ML that has become the dominant approach to training models over the past decade. Deep learning involves feeding large datasets into a model that iteratively refines its internal parameters until it makes accurate predictions or classifications. These parameters are organised into layers within a structure known as a neural network, which functions as the ‘brain’ of the system.
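As a minimal sketch of this iterative refinement, assuming nothing about any real system’s data or architecture, the snippet below trains a single-neuron classifier on synthetic data; each pass over the dataset nudges the model’s parameters (its weights and bias) in whichever direction reduces the prediction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: 200 examples, each with 3 numeric features and a 0/1 label.
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=200) > 0).astype(float)

weights = np.zeros(3)    # the model's internal parameters, refined iteratively
bias = 0.0
learning_rate = 0.1

for epoch in range(100):
    # Forward pass: the model's current predictions, as probabilities between 0 and 1.
    predictions = 1 / (1 + np.exp(-(X @ weights + bias)))
    # Nudge each parameter in the direction that reduces the prediction error
    # (gradient descent on the cross-entropy loss).
    error = predictions - y
    weights -= learning_rate * (X.T @ error) / len(y)
    bias -= learning_rate * error.mean()

final = 1 / (1 + np.exp(-(X @ weights + bias)))
accuracy = ((final > 0.5) == y).mean()
print(f"Training accuracy after 100 refinement passes: {accuracy:.0%}")
```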
For example, if members of a group often bear a specific tattoo, the system would be trained on many images of individuals with that tattoo. Over many such iterations, the system learns the relevant patterns, so that when shown an image of an unknown individual it can determine whether their tattoo matches the group’s trademark design. Similarly, Lavender determines whether unidentified individuals should be marked as Hamas or PIJ operatives based on the patterns it has been trained to recognise.
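Extending the same idea end to end, and again assuming purely synthetic data, the sketch below fits a small neural network on labelled examples described by a handful of numeric features and then asks it to classify examples it has never seen. scikit-learn is used here only for brevity; nothing is implied about the tooling behind any real system.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Synthetic 'images' reduced to 10 numeric features, labelled 1 if the
# distinguishing pattern (e.g. the group's trademark tattoo) is present.
X = rng.normal(size=(500, 10))
y = (X[:, 0] + 0.8 * X[:, 3] > 0.5).astype(int)

# Hold back some individuals the model never sees during training.
X_train, X_unseen, y_train, y_unseen = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1)
model.fit(X_train, y_train)          # learn the pattern from labelled examples

matches = model.predict(X_unseen)    # classify previously unseen individuals
print(f"Accuracy on unseen examples: {(matches == y_unseen).mean():.0%}")
```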
How is Lavender being used, and why?
Many analysts conclude that current military uses of AI treat such systems as ‘assistants’ or ‘advisors’ for decisions which remain ultimately human-led. Where autonomous AI systems have been developed, the focus is on defensive or counterstrike capabilities rather than offensive autonomy, as with Russia’s Perimetr system, also known as the ‘Dead Hand’. Perimetr can allegedly launch Russia’s entire nuclear arsenal in response to a nuclear attack if the political leadership is incapacitated and can no longer make crucial decisions.
Contrary to these conventions, the IDF became reliant on Lavender’s target list with little human oversight. The IDF officially maintain that “for each target, IDF procedures require conducting an individual assessment of the anticipated military advantage and collateral damage expected” and that strikes are not carried out “when the expected collateral damage from the strike is excessive in relation to the military advantage”. However, +972’s Israeli sources state that, in order to save time and accelerate counteroffensive strikes, officers were not required to independently review the AI system’s assessments.
The IDF apparently adopted Lavender’s target database with near-total autonomy, despite the model’s known error rate of at least 10%. Prior to October 7th, Lavender had been used only as an auxiliary tool, consistent with the current uses of AI models in military systems described above.
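To see why even an error rate of around 10% is significant when applied at scale, consider the hypothetical screening calculation sketched below. The population size, the number of genuine operatives and the exact error rates are illustrative assumptions rather than reported figures, and the reporting does not specify precisely how Lavender’s error rate is defined.

```python
# Hypothetical screening calculation; every number below is an assumption for illustration.
population = 100_000            # people screened by the model (assumed)
true_operatives = 2_000         # genuine operatives among them (assumed)
civilians = population - true_operatives

false_positive_rate = 0.10      # civilians wrongly flagged (assumed, per the reported ~10% error)
true_positive_rate = 0.90       # genuine operatives correctly flagged (assumed)

false_positives = civilians * false_positive_rate       # 9,800 civilians wrongly flagged
true_positives = true_operatives * true_positive_rate   # 1,800 operatives correctly flagged

precision = true_positives / (true_positives + false_positives)
print(f"Civilians wrongly flagged: {false_positives:,.0f}")
print(f"Operatives correctly flagged: {true_positives:,.0f}")
print(f"Share of flagged individuals who are genuine operatives: {precision:.1%}")
```

Under these assumptions, wrongly flagged civilians would outnumber correctly flagged operatives several times over, which is why independent human review of each assessment matters so much.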
The dangers of autonomous deployments of AI-assisted systems
Lavender’s largely autonomous deployment appears to stem from the existential threat which Hamas’s surprise attack posed to the state of Israel and its people. The IDF’s near-immediate shift towards reliance on Lavender’s AI capabilities was coupled with a reported atmosphere of “revenge” inside the military, which likely contributed to a more permissive attitude towards the casualties caused by Lavender-directed strikes.
Lavender as a case study underscores how other armed forces may reach for an AI model’s capabilities to accelerate large-scale counterattacks during periods of high-intensity conflict. However, common perceptions of AI as a fast and objective analyst are misleading: AI systems are trained on datasets that can embed significant biases, potentially exacerbating pre-existing human biases.
Recurring criticisms include racial and gender biases, Western-centric biases, and invasions of human and/or data privacy. For example, in December 2023, the Russian Ministry of Defence (MoD) issued a public warning that generative AI is a danger to Russia’s national security, because publicly accessible generative AI models such as ChatGPT are “raising entire Russian generations ‘loyal’ to Western countries and values as a result of access to public information”. The Russian MoD are not technically wrong: much of the training data for models like ChatGPT is sourced from Western countries, and these models will therefore perpetuate Western ideas and values by default.
The use of AI in military conflict also shifts liability: responsibility is diffused across engineers and analysts, leaving no single point of accountability for the outcomes of its operations.
What does this mean for future conflicts?
We are still a long way from introducing fully autonomous military systems, if they are ever introduced at all. Nevertheless, AI-assisted software is becoming normalised, particularly in the emerging areas of ML and DL. In the past few years, the US, Britain, France and China have all experimented with applications of generative AI models, such as Large Language Models (LLMs), in military systems for data-analysis tasks, developing training and combat simulations, and accelerating command-and-control processes.
A critical awareness of the biases which AI models can reproduce is essential. Common perceptions hold that AI models must be neutral because their parameters are trained iteratively on raw data with little human interference. This is not the case: data which may appear unbiased can still propagate systemic biases.
Source: Joy Buolamwini, “Artificial Intelligence Has a Problem With Gender and Racial Bias. Here’s How to Solve It”, TIME Magazine, 2019.
For example, current AI-assisted facial recognition systems are better at correctly identifying the gender of male faces than of female faces, and error rates for women of colour jump considerably: one study found a 35% error rate in identifying the correct gender of Black women’s faces. This matters when state-deployed AI models which draw on facial recognition, such as Lavender or China’s system for identifying Uighur Muslims, may be reproducing these biases and falsely flagging individuals. One of the basic characteristics Lavender reportedly uses to mark out a Hamas or PIJ operative is that the individual is male; with studies confirming that errors in AI gender recognition are highest for people of colour, independent human review of an AI model’s assessments is vital.
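As a minimal sketch of how such disparities are measured, assuming entirely synthetic data, the snippet below computes gender-classification error rates separately for each demographic group from a set of labelled predictions. The group labels and counts are illustrative, chosen to mirror the rough magnitudes reported; they are not the study’s data.

```python
from collections import defaultdict

# Synthetic, illustrative records of (demographic group, true gender, predicted gender).
# The group labels and counts are assumptions for illustration only.
records = (
    [("lighter-skinned men", "male", "male")] * 98
    + [("lighter-skinned men", "male", "female")] * 2
    + [("darker-skinned women", "female", "female")] * 65
    + [("darker-skinned women", "female", "male")] * 35   # ~35% misclassification
)

errors = defaultdict(int)
totals = defaultdict(int)
for group, truth, prediction in records:
    totals[group] += 1
    errors[group] += prediction != truth   # count each misclassified face

for group, total in totals.items():
    print(f"{group}: {errors[group] / total:.0%} of faces misclassified")
```

Aggregate accuracy figures can hide exactly this kind of disparity, which is why per-group evaluation, and human review of individual assessments, is so important.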
The use of AI in the targeting process is not inherently inhumane, nor is the use of AI in military conflict prohibited anywhere by law. While military strategy and conflict remain human-led, it is important to recognise that human agents are inclined to act differently in high-stress situations, as the IDF’s apparent reliance on Lavender has shown. Further, AI models are ultimately trained, tested and employed by humans, and can be deployed in ways which break international law. Errors made by an AI system may be overlooked, and extreme anger or fear may encourage human agents to become more permissive of an AI system’s unreliability or to promote ideas of collective punishment.