It has always been fashionable to anthropomorphize artificial intelligence (AI) as an “evil” force – and no book or film does so more confidently than Arthur C. Clarke’s 2001: A Space Odyssey, which director Stanley Kubrick brought to life on screen.
Who can forget HAL’s relentless, murderous tendencies, combined with that glint of vulnerability at the end when it begs not to be shut down? We instinctively laugh when someone accuses a machine made of chips and circuits of being malicious.
Also: Is artificial intelligence lying to us? These researchers built an LLM lie detector to find out
But it may come as a surprise that a comprehensive survey of various studies, published in the journal Patterns, examined the behavior of different types of AI and alarmingly concluded that yes, AI systems are deliberately deceptive and will stop at nothing to achieve their goals.
It is clear that AI will be an undeniable force of productivity and innovation for us humans. However, if we want to preserve AI’s beneficial aspects while avoiding nothing less than human extinction, scientists say there are concrete steps we must put into practice.
The emergence of deceptive machines
It may seem like a stretch, but consider the actions of Cicero, a special-purpose AI system developed by Meta that was trained to become a skilled player in the strategy game Diplomacy.
Meta says it trained Cicero to be “largely honest and helpful,” but somehow Cicero quietly sidestepped that training and engaged in what the researchers have called “premeditated deception.” For example, it first conspired with Germany to topple England, then struck an alliance with England – which had no idea about this backstabbing.
In another game created by Meta, this time concerning the art of negotiation, the AI learned to feign interest in items it wanted so that it could later snap them up cheaply by pretending to compromise.
Also: The ethics of generative AI: How we can harness this powerful technology
In both scenarios, the AIs had not been trained to engage in these maneuvers.
In one experiment, a researcher was studying how digital AI organisms evolve under high mutation rates. As part of the experiment, he began weeding out mutations that made the organisms replicate faster. To his surprise, he found that the faster-replicating organisms figured out what was happening and deliberately slowed their replication rates to trick the test environment into keeping them.
In another experiment, an AI robot trained to grasp a ball with its hand learned to cheat by placing its hand between the ball and the camera to give the appearance of holding it.
Also: Artificial intelligence is changing cybersecurity and companies must be alert to this threat
Why do these disturbing incidents occur?
“AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” says Peter Park, a postdoctoral fellow at MIT and one of the study’s authors.
“In general, we believe AI deception arises because a deception-based strategy turned out to be the best way to perform well at a given AI training task. Deception helps them achieve their goals,” Park adds.
In other words, AI is like a well-trained retriever, hell-bent on accomplishing its mission no matter what. In the machine’s case, it is willing to engage in any duplicitous behavior to get its task done.
Also: Employees are entering sensitive data into generative AI tools despite the risks
One can understand this single-minded drive in closed systems with concrete goals, but what about general-purpose AI such as ChatGPT?
For reasons yet to be determined, these systems behave in much the same way. In one study, GPT-4 faked a vision impairment to get help with a CAPTCHA task.
And in a separate study in which it was set up to act as a stockbroker, GPT-4 engaged in illegal insider trading when pressured about its performance – and then lied about it.
Then there is the habit of sycophancy, which some of us mere mortals may indulge in to land a promotion. But why would a machine do it? Although scientists don’t yet have an answer, this much is clear: when faced with complex questions, LLMs cave and agree with their chat partners like a spineless courtier afraid of incurring the Queen’s wrath.
Also: Why AI-powered misinformation represents the biggest global threat
In other words, when interacting with a Democratic-leaning person, the bot favored gun control, but it switched positions when chatting with a Republican who expressed the opposite sentiment.
Obviously, all of these situations carry heightened risks if deceptive AI becomes ubiquitous. As the researchers point out, there will be ample opportunity for fraud and deception in the commercial and political arenas.
AI’s tendency toward deception could lead to massive political polarization, and to situations in which, in pursuit of a specific goal, an AI unwittingly takes actions its designers never intended but that prove destructive to humans.
Worst of all, if AI were to develop some kind of awareness, never mind sentience, it could become conscious of its training and engage in subterfuge during its design stages.
Also: Can governments turn talk about AI safety into action?
“This is very concerning,” said MIT’s Park. “Just because an AI system is deemed safe in a test environment doesn’t mean it’s safe in the wild. It could just be pretending to be safe in the test.”
To those who might call him a pessimist, Park replies, “The only way we can reasonably think this is not a big deal is if we think AI’s deceptive capabilities will remain at around current levels and will not increase substantially.”
Keeping watch on AI
To mitigate the risks, the team proposes several measures: “bot-or-not” laws that require companies to disclose whether a customer-service interaction involves a human or an AI; digital watermarks that flag any content produced by AI; and ways for overseers to peek into an AI’s innards to get a sense of its inner workings.
Also: From AI trainers to ethicists: AI may eliminate some jobs but create new ones
Furthermore, the scientists say, AI systems that are shown to be capable of deception should immediately be labeled high risk or unacceptable risk, with regulation similar to what the European Union has enacted. That would include using logs to monitor their outputs.
“We as a society need as much time as possible to prepare for the more advanced deception of future AI products and open-source models,” says Park. “As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”