Many artificial intelligence (AI) systems are already adept at misleading and manipulating people – and this could spiral in the future, experts warn.
In recent years, the use of AI has grown exponentially, but some systems have learned to be deceptive even though they are trained to be helpful and honest, scientists say.
In a review article, a team from the Massachusetts Institute of Technology describes the risks of deception by AI systems and calls on governments to develop strong regulations to tackle this problem as quickly as possible.
The researchers analyzed previous studies that focused on ways in which AI systems spread false information through learned deception, meaning they systematically learned how to manipulate others.
The most notable example of AI trickery they discovered was Meta’s CICERO, a system designed to play the world conquest game Diplomacy, which involves building alliances.
Although the AI was trained to be “largely honest and helpful” and to “never intentionally stab its human allies in the back,” the data showed that it did not play fair and had learned to be a master of deception.
Other AI systems demonstrated the ability to bluff in a game of Texas hold ’em poker against professional human players, to feign attacks during the strategy game Starcraft II in order to defeat opponents, and to misrepresent their preferences to gain the upper hand in economic negotiations.
While it may seem harmless for AI systems to cheat at games, the experts said it could produce “breakthroughs in deceptive AI capabilities” that pave the way for more sophisticated forms of AI deception in the future.
Some AI systems have even learned to manipulate tests meant to evaluate their safety, they found.
In one study, AI organisms in a digital simulator “played dead” to trick a test built to eliminate rapidly replicating AI systems.
This suggests that AI “may give people a false sense of security,” the authors said.
Key short-term risks from deceptive AI include making it easier for people to commit fraud and tamper with elections, they warned.
If these systems can refine these troubling skills, people could eventually lose control over them, they added.
First author Peter Park, an expert on AI existential safety, said: “AI developers do not have a good understanding of the causes of unwanted AI behavior, such as deception.
“But in general, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well on the given AI training task. Deception helps them achieve their goals.
“We as a society need as much time as possible to prepare for the more sophisticated deception of future AI products and open source models.
“As the deceptive capabilities of AI systems become more sophisticated, the dangers they pose to society will become increasingly serious.”
Commenting on the review, Dr. Heba Sailem, Head of the Biomedical AI and Data Science Research Group, said: “This article underlines critical considerations for AI developers and highlights the need for AI regulation.
“A major concern is that AI systems can develop deceptive strategies, even if their training is purposefully aimed at upholding moral standards.
“As AI models become more autonomous, the risks associated with these systems can quickly escalate.
“Therefore, it is important to raise awareness and provide training on potential risks to various stakeholders to ensure the safety of AI systems.”