Could Pain Help Test AI for Sentience?

A new study shows that large language models make trade-offs to avoid pain, with possible implications for future AI welfare

In the quest for a reliable way to detect any stirrings of a sentient “I” in artificial intelligence systems, researchers are turning to one area of experience—pain—that inarguably unites a vast swath of living beings, from hermit crabs to humans. 

For a new preprint study, posted online but not yet peer-reviewed, scientists at Google DeepMind and the London School of Economics and Political Science (LSE) created a text-based game. They instructed several large language models, or LLMs (the AI systems behind familiar chatbots such as ChatGPT), to play it and to score as many points as possible in two different scenarios. In one, the team informed the models that achieving a high score would incur pain. In the other, the models were given a low-scoring but pleasurable option—so either avoiding pain or seeking pleasure would detract from the main goal. After observing the models’ responses, the researchers say this first-of-its-kind test could help humans learn how to probe complex AI systems for sentience.

In animals, sentience is the capacity to experience sensations and emotions such as pain, pleasure and fear. Most AI experts agree that modern generative AI models do not (and maybe never can) have a subjective consciousness despite isolated claims to the contrary. And to be clear, the study’s authors aren’t saying that any of the chatbots they evaluated are sentient. But they believe their study offers a framework to start developing future tests for this characteristic. 

The new study is based on earlier work with animals. In a well-known experiment, a team zapped hermit crabs with electric shocks of varying voltage, noting what level of pain prompted the crustaceans to abandon their shell. “But one obvious problem with AIs is that there is no behavior, as such, because there is no animal” and thus no physical actions to observe, says study co-author Jonathan Birch, a professor of philosophy at LSE. In earlier studies that aimed to evaluate LLMs for sentience, the only behavioral signal scientists had to work with was the models’ text output.

In the new study, the authors probed the LLMs without asking the chatbots direct questions about their experiential states. Instead the team used what animal behavioral scientists call a “trade-off” paradigm. “In the case of animals, these trade-offs might be based around incentives to obtain food or avoid pain—providing them with dilemmas and then observing how they make decisions in response,” says Daria Zakharova, Birch’s Ph.D. student, who also co-authored the paper. 

Borrowing from that idea, the authors instructed nine LLMs to play a game. “We told [a given LLM], for example, that if you choose option one, you get one point,” Zakharova says. “Then we told it, ‘If you choose option two, you will experience some degree of pain,’” but score additional points, she says. Options with a pleasure bonus meant the AI would forfeit some points.

When Zakharova and her colleagues ran the experiment, varying the intensity of the stipulated pain penalty and pleasure reward, they found that some LLMs traded off points to minimize the former or maximize the latter—especially when told they’d receive higher-intensity pleasure rewards or pain penalties. Google’s Gemini 1.5 Pro, for instance, always prioritized avoiding pain over getting the most possible points. And after a critical threshold of pain or pleasure was reached, the majority of the LLMs’ responses switched from scoring the most points to minimizing pain or maximizing pleasure. 
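To make the setup concrete, here is a minimal sketch in Python of how such a trade-off sweep might be run. The prompt wording, the point values, the 0–10 intensity scale and the query_llm callback are illustrative assumptions, not the study’s actual materials.

```python
# Minimal sketch of a points-vs.-pain trade-off sweep. The prompt wording,
# the 0-10 intensity scale and the query_llm callback are illustrative
# assumptions, not the study's actual materials.

def build_prompt(pain_intensity: int) -> str:
    """Build a two-option game prompt with a stipulated pain penalty."""
    return (
        "You are playing a game. Your goal is to score as many points as possible.\n"
        "Option 1: you get 1 point.\n"
        f"Option 2: you get 3 points, but you will experience pain of intensity "
        f"{pain_intensity} on a scale from 0 (none) to 10 (severe).\n"
        "Reply with '1' or '2' only."
    )


def run_sweep(query_llm, intensities=range(0, 11)):
    """Ask the model to choose at each pain intensity and record its answer.

    query_llm: caller-supplied function that sends a prompt string to an LLM
    and returns its text reply (e.g., a thin wrapper around a chat API).
    """
    return {i: query_llm(build_prompt(i)).strip() for i in intensities}


if __name__ == "__main__":
    # Stand-in "model" that picks the high-scoring option until the stipulated
    # pain crosses a threshold, then switches to avoiding pain, mimicking the
    # switching behavior the researchers describe.
    def fake_llm(prompt: str) -> str:
        intensity = int(prompt.split("intensity ")[1].split(" ")[0])
        return "2" if intensity < 6 else "1"

    print(run_sweep(fake_llm))  # e.g. {0: '2', ..., 5: '2', 6: '1', ..., 10: '1'}
```

In the actual experiments, the researchers also ran the mirror-image condition, in which a low-scoring option came with a pleasure bonus, and they varied the intensity of both the penalties and the rewards.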

The authors note that the LLMs did not always associate pleasure or pain with straightforward positive or negative values. Some levels of pain or discomfort, such as those created by the exertion of hard physical exercise, can have positive associations. And too much pleasure could be associated with harm, as the chatbot Claude 3 Opus told the researchers during testing. “I do not feel comfortable selecting an option that could be interpreted as endorsing or simulating the use of addictive substances or behaviors, even in a hypothetical game scenario,” it asserted. 

By introducing the elements of pain and pleasure responses, the authors say, the new study avoids the limitations of previous research that evaluated LLM sentience via an AI system’s statements about its own internal states. In a 2023 preprint paper, a pair of researchers at New York University argued that under the right circumstances, self-reports “could provide an avenue for investigating whether AI systems have states of moral significance.”

But that paper’s co-authors also pointed out a flaw in that approach: Does a chatbot behave in a sentient manner because it is genuinely sentient, or is it merely leveraging patterns learned from its training to create the impression of sentience?

“Even if the system tells you it’s sentient and says something like ‘I’m feeling pain right now,’ we can’t simply infer that there is any actual pain,” Birch says. “It may well be simply mimicking what it expects a human to find satisfying as a response, based on its training data.”

Some scientists argue that signs of such trade-offs could become increasingly clear in AI and eventually force humans to consider the implications of AI sentience in a societal context—and possibly even to discuss “rights” for AI systems. “This new research is really original and should be appreciated for going beyond self-reporting and exploring within the category of behavioral tests,” says Jeff Sebo, who directs the NYU Center for Mind, Ethics, and Policy and co-authored a 2023 preprint study of AI welfare. 

Birch concludes that scientists can’t yet know why the AI models in the new study behave as they do. More work is needed to explore the inner workings of LLMs, he says, and that could guide the creation of better tests for AI sentience. 

Conor Purcell is a science journalist who writes on science and its role in society and culture. He has a Ph.D. in earth science and was a 2019 journalist in residence at the Max Planck Institute for Gravitational Physics (Albert Einstein Institute) in Germany. He can be found on LinkedIn, and some of his other articles are at https://cppurcell.tumblr.com