Poker Bot Developer to Implement Defense Innovation Unit Programme
Early last year Tuomas Sandholm – a professor at Carnegie Mellon University – founded a company called Strategy Robot. With it, he intends to carry the technology he developed for Libratus – a poker ‘bot’ that advanced the art of tackling complex real-world problems by handling the uncertainties and vagaries of high-stakes poker – into other ‘incomplete information’ problems, such as wargames and strategic military simulations.
In August the Defense Innovation Unit – a DoD organisation that seeks to rapidly leverage commercial advances – awarded Strategy Robot a $10 million contract to investigate how Sandholm’s algorithms and artificial intelligence (AI) techniques can be applied to military simulations.
It turns out that the lessons learned in trying to defeat human opponents in a game of strategy and psychology such as poker – in which there is never enough information for an entirely logical decision process to be reliable (hence the prospect of losing) – can be applied to much more complex (and critical) problems, such as military planning and mission rehearsal.
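The workhorse behind Libratus was the counterfactual regret minimisation (CFR) family of algorithms, which iteratively refine a strategy until it cannot be exploited even by an opponent who knows it. As a minimal illustrative sketch (not Strategy Robot’s actual code), vanilla CFR can solve Kuhn poker, a three-card toy version of the incomplete-information problem the article describes:

```python
# Minimal vanilla CFR (counterfactual regret minimisation) for Kuhn poker.
# Illustrative only: Kuhn poker is a three-card toy game; Libratus used far
# more sophisticated CFR variants, abstraction, and endgame solving.
from itertools import permutations

ACTIONS = ["p", "b"]  # p = pass/check/fold, b = bet/call


class Node:
    """Regret and average-strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def strategy(self, reach_weight):
        # Regret matching: play actions in proportion to positive regret.
        s = [max(r, 0.0) for r in self.regret_sum]
        total = sum(s)
        s = [x / total for x in s] if total > 0 else [0.5, 0.5]
        for i in range(2):
            self.strategy_sum[i] += reach_weight * s[i]
        return s

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [x / total for x in self.strategy_sum] if total > 0 else [0.5, 0.5]


nodes = {}  # info-set key (own card + public history) -> Node


def cfr(cards, history, p0, p1):
    """Expected utility for the player to act, given each player's reach prob."""
    player = len(history) % 2
    if len(history) > 1:  # terminal states
        if history[-1] == "p":
            if history == "pp":  # both checked: showdown for 1 chip
                return 1 if cards[player] > cards[1 - player] else -1
            return 1  # opponent folded after a bet
        if history[-2:] == "bb":  # bet and call: showdown for 2 chips
            return 2 if cards[player] > cards[1 - player] else -2
    key = str(cards[player]) + history
    node = nodes.setdefault(key, Node())
    strat = node.strategy(p0 if player == 0 else p1)
    util, node_util = [0.0, 0.0], 0.0
    for i, a in enumerate(ACTIONS):
        if player == 0:
            util[i] = -cfr(cards, history + a, p0 * strat[i], p1)
        else:
            util[i] = -cfr(cards, history + a, p0, p1 * strat[i])
        node_util += strat[i] * util[i]
    opp_reach = p1 if player == 0 else p0
    for i in range(2):  # accumulate counterfactual regret for each action
        node.regret_sum[i] += opp_reach * (util[i] - node_util)
    return node_util


def train(iterations):
    """Average game value for player 0, sweeping all six deals each iteration."""
    deals = list(permutations([1, 2, 3], 2))  # (player-0 card, player-1 card)
    total = 0.0
    for _ in range(iterations):
        total += sum(cfr(d, "", 1.0, 1.0) for d in deals) / len(deals)
    return total / iterations
```

A few thousand iterations converge toward the game’s known value of −1/18 for the first player, and the average strategy recovers textbook behaviour such as always calling a bet when holding the highest card – the same self-play principle, scaled up enormously, that let Libratus beat professionals.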
Expenditure on AI and machine learning is rising almost exponentially: International Data Corporation estimated global spending at $12 billion in 2017 and expects it to exceed $57 billion by 2021. The world has moved on since a computer defeated the reigning world chess champion in 1997 – unlike chess, today’s hardest problems involve incomplete information. Filling in the gaps and making deductive or even intuitive leaps is what produces the sometimes disturbing levels of ingenuity and success that observers see in AI experimentation.

The benefits for training could be significant. In current wargame design, only a limited number of potential strategies are ‘designed in’ for use by the opposing force – a function of the game engine’s limited capacity to handle huge numbers of options for significant numbers of entities. AI could greatly enhance computer generated forces (CGF) in training scenarios, making them more unpredictable – and therefore more valuable to the training objectives – and stressing the capabilities of the trainees. Doing that in a risk-free environment has to be worth paying good money for and supporting with every available training asset.