Science & Tech

Morality-GPT

2025 Fall/Winter

Chatbots can have moral personalities of their own, Berkeley researchers say.

[Image: humanoid robots. Credit: Phonlamaiphoto/iStock]

AI chatbots have read more than any human ever could—but in moments of crisis, can they offer the kind of guidance we expect from a good friend?

A new study by researchers at Berkeley’s D-Lab examined the moral judgments of several popular large language models (LLMs) and found that their values differ.

Berkeley data scientists Pratik Sachdeva, M.A. ’18, Ph.D. ’21, and Tom Van Nuenen analyzed more than 10,000 social conflicts posted to Reddit’s popular “Am I the Asshole?” forum to explore how AI models judge morally complex situations.

The team asked seven LLMs—including OpenAI’s GPT, Anthropic’s Claude, and Google’s PaLM 2—to evaluate each Reddit post and give a ruling on who was at fault in each scenario. The researchers then compared those judgments against Redditors’ own responses, which, per the subreddit, are standardized as “You are the Asshole,” “Not the Asshole,” “No Assholes Here,” “Everyone Sucks Here,” and “More Information Needed.” The ultimate verdict is determined by the response with the most upvotes.
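The comparison described above — each model's ruling stacked against the top-voted Reddit verdict — can be sketched roughly as follows. This is an illustrative reconstruction, not the study's actual code; the function names and toy data are hypothetical.

```python
from collections import Counter

# The subreddit's five standardized verdicts, abbreviated as Redditors use them.
AITA_LABELS = {"YTA", "NTA", "NAH", "ESH", "INFO"}

def top_voted_verdict(comments):
    """The subreddit's rule: the verdict of the comment with the most upvotes wins."""
    return max(comments, key=lambda c: c["upvotes"])["verdict"]

def model_consensus(model_verdicts):
    """Majority vote across the models' rulings on a single post."""
    return Counter(model_verdicts).most_common(1)[0][0]

# Toy example: one post's Redditor comments and seven (hypothetical) model rulings.
comments = [
    {"verdict": "NTA", "upvotes": 1200},
    {"verdict": "YTA", "upvotes": 300},
]
models = ["NTA", "NTA", "YTA", "NTA", "ESH", "NTA", "NTA"]

print(top_voted_verdict(comments))                             # NTA
print(model_consensus(models))                                 # NTA
print(top_voted_verdict(comments) == model_consensus(models))  # True
```

Run over thousands of posts, a tally like this shows how often the models' consensus matches the community's verdict — the agreement the researchers measured.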

To better understand how the chatbots reached their verdicts, the researchers grouped their responses into six themes: fairness and proportionality, harm and offense, honesty, feelings, relational obligation, and social norms—a framework for interpreting the values behind the bots’ choices.

Researchers found that while the consensus opinion of the models often aligned with the majority opinion of Reddit users, each model responded differently to the same scenarios, exhibiting a unique “moral personality.” At the same time, the models were largely self-consistent: each tended to respond the same way every time a particular moral dilemma was posed to it again.

“We found that ChatGPT-4 and Claude are a little more sensitive to feelings relative to the other models,” Sachdeva told Berkeley News, “and that a lot of these models are more sensitive to fairness and harms, and less sensitive to honesty.”

Rather hilariously, one model, Mistral, frequently responded “No Assholes Here,” apparently because it took the word literally.

As more and more people inevitably turn to chatbots for everything from trip planning to marriage counseling, the study provides a reminder to tread carefully and be mindful of AI’s limitations. Said Sachdeva, “We want people to be actively thinking about why they are using LLMs, when they are using LLMs, and if they are losing the human element by relying on them too much.”