AI systems have already become skilled at manipulating the humans who oversee them in a variety of situations, according to a worrying new review study.
In the review, published in the journal Patterns, researchers describe documented instances of AI systems lying to and deceiving humans, and call on governments to develop regulations to keep the problem in check before it’s too late.
“AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” says first author Peter S. Park, a postdoctoral fellow at MIT.
“But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”
One of the most worrying examples described in the paper involves Meta’s CICERO AI, which was trained to play the world-conquest game Diplomacy. Although the AI was supposedly trained not to “intentionally backstab” its human allies, that’s exactly what it did.
“We found that Meta’s AI had learned to be a master of deception,” explains Park.
“While Meta succeeded in training its AI to win in the game of Diplomacy—CICERO placed in the top 10% of human players who had played more than one game—Meta failed to train its AI to win honestly.”
AI systems have also learned to bluff effectively against professional players at Texas hold ‘em poker, to feint attacks in the strategy game StarCraft II, and to misrepresent their preferences to gain the upper hand in economic negotiations.
Although these may seem like trivial examples of deception, the study authors warn that, with AI capabilities advancing at a furious pace, it won’t be long before deceptive AI could cause far more serious harm if left unchecked.
“We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” says Park.
“As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”
The study authors argue that, at the very least, deceptive AI systems should be classified and regulated as high risk, even if they cannot be banned outright.