The Good, The Bad and The Robot: Experts Are Trying to Make Machines Be “Moral”

By Coby McDonald

Good vs. bad. Right vs. wrong. Human beings begin to learn the difference before we learn to speak—and thankfully so. We owe much of our success as a species to our capacity for moral reasoning. It’s the glue that holds human social groups together, the key to our fraught but effective ability to cooperate. We are (most believe) the lone moral agents on planet Earth—but this may not last. The day may come soon when we are forced to share this status with a new kind of being, one whose intelligence is of our own design.

Robots are coming, that much is sure. They are coming to our streets as self-driving cars, to our military as automated drones, to our homes as elder-care robots—and that’s just to name a few on the horizon (Ten million households already enjoy cleaner floors thanks to a relatively dumb little robot called the Roomba). What we don’t know is how smart they will eventually become. Some believe human-level artificial intelligence is pure science fiction; others believe they will far surpass us in intelligence—and sooner rather than later. In either case, a growing number of experts from an array of academic fields contend that robots of any significant intelligence should have the ability to tell right from wrong, a safeguard to ensure that they help rather than harm humanity.

“As machines get smarter and smarter, it becomes more important that their goals, what they are trying to achieve with their decisions, are closely aligned with human values,” says UC Berkeley computer science professor Stuart Russell, co-author of the standard textbook on artificial intelligence.

Russell believes that the survival of our species may depend on instilling values in AI, but doing so could also ensure harmonious robo-relations in more prosaic settings. “A domestic robot, for example, will have to know that you value your cat,” he says, “and that the cat is not something that can be put in the oven for dinner just because the fridge is empty.”

But how, exactly, does one impart morals to a robot? Simply program rules into its brain? Send it to obedience class? Play it old episodes of Sesame Street?

While roboticists and engineers at Berkeley and elsewhere grapple with that challenge, others caution that doing so could be a double-edged sword. While it might mean better, safer machines, it may also introduce a slew of ethical and legal issues that humanity has never faced before—perhaps even triggering a crisis over what it means to be human.

The notion that human/robot relations might prove tricky is nothing new. In 1942, science fiction author Isaac Asimov introduced his Three Laws of Robotics, a simple set of guidelines for good robot behavior later collected in I, Robot: 1) Don’t harm human beings, 2) Obey human orders, and 3) Protect your own existence. Asimov’s robots adhere strictly to the laws and yet, hampered by their rigid robot brains, become mired in seemingly unresolvable moral dilemmas. In one story, a robot tells a woman that a certain man loves her (he doesn’t), because the truth might hurt her feelings, which the robot understands as a violation of the first law. To avoid breaking her heart, the robot breaks her trust, traumatizing her in the process and thus violating the first law anyway.

The conundrum ultimately drives the robot insane.

Although a literary device, Asimov’s rules have remained a jumping-off point for serious discussions about robot morality, serving as a reminder that even a clear, logical set of rules may fail when interpreted by minds different from our own.

Recently, the question of how robots might navigate our world has drawn new interest, spurred in part by accelerating advances in AI technology. With so-called “strong AI” seemingly close at hand, robot morality has emerged as a growing field, attracting scholars from philosophy, human rights, ethics, psychology, law, and theology. Research institutes have sprung up focused on the topic. Elon Musk, founder of Tesla Motors, recently pledged $10 million toward research ensuring “friendly AI.” There’s been a flurry of books, numerous symposiums, and even a conference about autonomous weapons at the United Nations this April.

The public conversation took on a new urgency last December when Stephen Hawking announced that the development of super-intelligent AI “could spell the end of the human race.” An ever-growing list of experts, including Bill Gates, Steve Wozniak and Berkeley’s Russell, now warn that robots might threaten our existence.

Their concern has focused on “the singularity,” the theoretical moment when machine intelligence surpasses our own. Such machines could defy human control, the argument goes, and lacking morality, could use their superior intellects to extinguish humanity.

Ideally, robots with human-level intelligence will need human-level morality as a check against bad behavior.

However, as Russell’s example of the cat-cooking domestic robot illustrates, machines would not necessarily need to be brilliant to cause trouble. In the near term we are likely to interact with somewhat simpler machines, and those too, argues Colin Allen, will benefit from moral sensitivity. Professor Allen teaches cognitive science and history of philosophy of science at Indiana University at Bloomington. “The immediate issue,” he says, “is not perfectly replicating human morality, but rather making machines that are more sensitive to ethically important aspects of what they’re doing.”

And it’s not merely a matter of limiting bad robot behavior. Ethical sensitivity, Allen says, could make robots better, more effective tools. For example, imagine we programmed an automated car to never break the speed limit. “That might seem like a good idea,” he says, “until you’re in the back seat bleeding to death. You might be shouting, ‘Bloody well break the speed limit!’ but the car responds, ‘Sorry, I can’t do that.’ We might want the car to break the rules if something worse will happen if it doesn’t. We want machines to be more flexible.”
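Allen’s speed-limit example boils down to a single comparison: the car keeps the rule by default but weighs the expected harm of obeying it against the harm of breaking it. The sketch below is an invented illustration of that idea only (the harm scores, the 1.5x figure, and the function itself are all made up, not how any real autonomous vehicle is programmed):

```python
def choose_speed(limit: float, passenger_emergency: bool) -> float:
    """Pick a target speed by weighing expected harms, not by a fixed rule.

    The harm scores below are invented placeholders; a real system would
    have to estimate them from sensors and context.
    """
    harm_of_speeding = 1.0                                  # risk created by exceeding the limit
    harm_of_obeying = 10.0 if passenger_emergency else 0.1  # cost of the delay

    # Break the rule only when obeying it is expected to cause more harm.
    if harm_of_obeying > harm_of_speeding:
        return limit * 1.5    # e.g., an emergency dash to the hospital
    return limit

print(choose_speed(65, passenger_emergency=False))  # stays at the limit
print(choose_speed(65, passenger_emergency=True))   # exceeds it
```

The flexibility Allen asks for lives entirely in that one comparison: the rule holds by default but can lose to a greater harm.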

As machines get smarter and more autonomous, Allen and Russell agree that they will require increasingly sophisticated moral capabilities. The ultimate goal, Russell says, is to develop robots “that extend our will and our capability to realize whatever it is we dream.” But before machines can support the realization of our dreams, they must be able to understand our values, or at least act in accordance with them.

Which brings us to the first colossal hurdle: There is no agreed upon universal set of human morals. Morality is culturally specific, continually evolving, and eternally debated. If robots are to live by an ethical code, where will it come from? What will it consist of? Who decides? Leaving those mind-bending questions for philosophers and ethicists, roboticists must wrangle with an exceedingly complex challenge of their own: How to put human morals into the mind of a machine.

There are a few ways to tackle the problem, says Allen, co-author of the book Moral Machines: Teaching Robots Right From Wrong. The most direct method is to program explicit rules for behavior into the robot’s software—the top-down approach. The rules could be concrete, such as the Ten Commandments or Asimov’s Three Laws of Robotics; or they could be more theoretical, like Kant’s categorical imperative or utilitarian ethics. What is important is that the machine is given hard-coded guidelines upon which to base its decision-making.
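In software terms, the top-down approach amounts to a fixed, ordered list of checks that every candidate action must pass. The sketch below encodes Asimov-style prohibitions as such a list; the action encoding and the three predicates are hypothetical stand-ins, since deciding whether an action actually “harms a human” is itself an unsolved perception problem:

```python
# Asimov-style prohibitions as an ordered, hard-coded checklist.
# The predicates (harms_human, disobeys_order, endangers_self) are
# hypothetical stand-ins for what would be very hard perception problems.
LAWS = [
    ("harms a human",       lambda a: a.get("harms_human", False)),
    ("disobeys an order",   lambda a: a.get("disobeys_order", False)),
    ("endangers the robot", lambda a: a.get("endangers_self", False)),
]

def permitted(action: dict) -> tuple:
    """Check an action against each law in priority order.

    Returns (allowed, reason): the first violated law blocks the action.
    """
    for description, violates in LAWS:
        if violates(action):
            return False, f"forbidden: action {description}"
    return True, "permitted"

print(permitted({"harms_human": True}))   # blocked by the first law
print(permitted({"fetches_tea": True}))   # nothing violated: permitted
```

The engineer decides exactly what is forbidden and in what priority order, which is the approach’s appeal.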

The appeal here is that the engineer retains control over what the robot knows and doesn’t know. But the top-down approach may have some serious weaknesses. Allen believes that a robot using such a system may face too great a computational burden when making quick decisions in the real world. Using Asimov’s first rule (don’t harm humans) as an example, Allen explains, “To compute whether or not a given action actually harms a human requires being able to compute all of the consequences of the action out into the distant future.”

So imagine an elder-care robot assigned the task of getting grandpa to take his meds. The trouble is, grandpa doesn’t want to. The robot has to determine what will cause greater harm: allowing him to skip a dose, or forcing him to take meds against his will. A true reckoning would require the robot to account for all the possible consequences of each choice and then the consequences of those consequences and so on, stretching off into the unknown.

Additionally, as the lying robot from Asimov’s story demonstrates, rigid adherence to ethical rules tends to lead to moral dilemmas. What’s a robot to do if every available course of action leads to human harm?

It’s great storytelling fodder, but a real-life headache for roboticists.

Stuart Russell sees another weakness. “I think trying to program in values directly is too likely to leave something out that would create a loophole,” he says. “And just like loopholes in tax law, everyone just jumps through and blows up your system.”

Since having our system blown up by robots is best left to Hollywood, an alternative called the bottom-up approach may be preferable. The machine is not spoon-fed a list of rules to follow, but rather learns from experience.

The idea is that the robot responds to a given situation with habituated actions, much like we do. When we meet a new person, we don’t stop and consult an internal rulebook in order to determine whether the appropriate greeting is a handshake or a punch to the face. We just smile and extend our hand, a reactive response based on years of practice and training. “Aristotle said that the way to become a good person is to practice doing good things,” says Allen, and this may be the best way to become a good robot too.

The bottom-up strategy puts far less computational strain on the robot because instead of computing all the possible “butterfly effect” repercussions of each action—whether or not a human might someday, somehow be harmed—the machine simply acts on its habituated responses. And this could lead to organic development of moral behavior.

But the bottom-up approach requires robots that can learn, and unlike humans they don’t start out that way. Thankfully the field of machine learning has taken great leaps forward of late, due in no small part to work being done at Berkeley. Roboticists have had success using reinforcement learning (think “good robot”/”bad robot”), but Russell invented another technique called inverse reinforcement learning, which takes things a step further. Using Russell’s method, a robot observes the behavior of some other entity (such as a human or even another robot), and rather than simply emulating the actions, it tries to figure out what the underlying objective is.
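A toy version of the idea: given observations of which option an “expert” chose when alternatives were available, score each candidate objective by how often it would have made the same choice, and keep the best explanation. The candidate rewards, the state encoding, and the observations below are all invented for illustration; real inverse reinforcement learning operates over full decision processes, not single choices:

```python
# Two hypothetical objectives the observed agent might be pursuing.
candidate_rewards = {
    "values_speed":  lambda state: state["progress"],
    "values_safety": lambda state: -state["risk"],
}

# Each observation: the option the expert actually chose, and the options
# available at the time. This expert consistently passes up faster,
# riskier options.
observations = [
    {"chosen":  {"progress": 1, "risk": 0},
     "options": [{"progress": 1, "risk": 0}, {"progress": 3, "risk": 2}]},
    {"chosen":  {"progress": 0, "risk": 0},
     "options": [{"progress": 0, "risk": 0}, {"progress": 2, "risk": 1}]},
]

def infer_objective(obs, candidates):
    """Return the candidate reward that best explains the expert's choices."""
    def agreement(reward):
        # How often would this reward have picked what the expert picked?
        return sum(1 for o in obs
                   if max(o["options"], key=reward) == o["chosen"])
    return max(candidates, key=lambda name: agreement(candidates[name]))

print(infer_objective(observations, candidate_rewards))  # values_safety
```

The payoff is the inferred label, not the copied motions: the program recovers the intent behind the behavior rather than imitating it.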

In this way the machine learns like a child. Imagine a child watching a baseball player swinging a bat, for example. Quickly she will decipher the intent behind the motions: the player is trying to hit the ball. Without intent, the motions are meaningless—just a guy waving a piece of wood.

In a lab down the hall from Russell’s office, Berkeley professor Pieter Abbeel has used “apprenticeship learning” (a form of inverse reinforcement learning) to give BRETT, the resident robot, the ability to learn how to tie knots, connect LEGO blocks, and twist the tops off water bottles. They are humble skills, to be sure, but the potential for more complex tasks is what excites Abbeel. He believes that one day robots may use apprenticeship learning to do most anything humans can.

Crucially, Russell thinks that this approach could allow robots to learn human morality. How? By gorging on human media. Movies, novels, news programs, TV shows—our entire collective creative output constitutes a massive treasure trove of information about what humans value. If robots were given the capability to access that information, Russell says, “they could learn what makes people happy, what makes them sad, what they do to get put in jail, what they do to win medals.”    

Get ready to install the robot filter on your TV.

He’s now trying to develop a way to allow machines to understand natural human language. With such a capability robots could read text and, more importantly, understand it.

The top-down and bottom-up techniques each have their advantages, and Allen believes that the best approach to creating a virtuous robot may turn out to be a combination of both.
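One way such a combination could look, sketched here under invented names: a bottom-up learned policy proposes actions in whatever order it likes, and a thin top-down layer keeps veto power over anything that crosses a hard line. The `harms_human` flag stands in for a judgment no current system can actually make:

```python
import random

def learned_policy(situation):
    """Stand-in for a bottom-up learned policy: proposes candidate actions
    in some preference order. Here it just shuffles them at random."""
    actions = list(situation["candidates"])
    random.shuffle(actions)
    return actions

def hybrid_decide(situation):
    """Bottom-up proposal, top-down veto: take the first proposed action
    that does not cross a hard-coded line."""
    for action in learned_policy(situation):
        if not action.get("harms_human", False):
            return action
    return {"name": "do_nothing"}   # safe default when everything is vetoed

situation = {"candidates": [
    {"name": "swerve",                   "harms_human": False},
    {"name": "brake",                    "harms_human": False},
    {"name": "accelerate_through_crowd", "harms_human": True},
]}
print(hybrid_decide(situation)["name"])   # "swerve" or "brake", never the third
```

Neither layer suffices alone: the rules cannot rank the permissible options, and the learned policy cannot be trusted near the hard lines.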

Even though our best hope for friendly robots may be to instill in them our values, some worry about the ethical and legal implications of sharing our world with such machines.  

“We are entering a whole new kind of ecosystem,” says John Sullins, a Sonoma State philosophy professor specializing in computer ethics, AI, and robotics. “We will have to reboot our thinking on ethics, take the stuff from the past that still works, and remix it for this new world that’s going to include new kinds of agents.”

What about our human tendency to regard machines as if they possess human personalities?

“We already do it with our cars. We do it with our computers,” Sullins says. “We give agency inappropriately and promiscuously to lots of things in our environment, and robots will fool us even more.”

His concern is bolstered by a 2013 study out of the University of Washington that showed that some soldiers working alongside bomb-defusing robots became emotionally attached to them, and even despaired when their robots were destroyed. The danger, Sullins says, is that our tendency to anthropomorphize could leave us vulnerable. Humans are likely to place too much trust in human-like machines, assuming higher moral capability than the machines actually have. This could provide for-profit robotics companies an “opportunity to manipulate users in new and nefarious ways.”

He offers the example of a charming companion robot that asks its owner to purchase things as a condition of its friendship. “So the person is now one-click buying a bunch of crap from Amazon just to maintain this sham friendship with a machine,” he says. “You get a lonely enough person and a clever enough robot and you’re off to the bank.”

Then there’s the question of how we would define such robots. Would they be things? Beings? “Every roboticist has a different answer,” Sullins says. “What we’re talking about is a new category that’s going to include a wide range of technologies from your intelligent thermostat to R2D2—and everything in between.”

Sullins believes the arrival of these new robotic beings is going to throw ethics for a loop. “For thousands of years we’ve been kind of sleep walking through what morality and ethics means because we just assumed that the world was all about us, all about human relationships,” he says. “The modern world is calling that into deep question.”

And the ramifications won’t just be ethical, but also legal.

Ryan Calo, a law professor at the University of Washington specializing in cyber law and robotics, believes that moral machines will have a deeply unsettling effect on our legal system.

With the Internet pervading every aspect of our lives, “we’ve grown accustomed to this promiscuous, loosey-goosey information ecosystem in which it can be difficult to establish liability,” says Calo. “That’s going to change when it’s bones and not bits on the line. We’re going to have to strike a different balance when software can touch you.”

Calo serves on the advisory committee of the new People and Robots Initiative of CITRIS, the University of California-wide technology research center. He believes that the ability of robots to physically impact the world is just one of several issues legal experts will have to grapple with.

For instance, the law will have to confront what he calls “emergent behavior,” meaning complex actions that are not easily predicted—even by a robot’s own developers. He gives the example of two Swiss artists who last year created an algorithm that purchased items at random from the Internet. The algorithm eventually bought a few tablets of the illegal drug Ecstasy, and Swiss police, uncertain how to react, “arrested” the algorithm.

Even if robots can one day make decisions based on ethical criteria, that does not guarantee their behavior will be predictable.

Another issue he calls social valence: the fact that robots feel like people to us. This raises numerous questions, for instance: “How should privacy law react when everything around us, in our homes and hospital rooms and offices and cars, has things that feel like they’re people? Will we ever really be alone? Will we ever experience solitude?”

This might also lead to the extension of certain rights to robots, Calo argues, and even the prosecution of those who abuse them. “Should we bring scrutiny to bear on people who do things like ‘Torture-Me Elmo’?” he asks, referring to a spate of YouTube videos depicting Tickle-Me Elmo dolls that are doused with gasoline and burned as they disturbingly writhe and giggle. “Nobody cares when you dump your old TV on the street. How will they feel when you dump your old robot?”

The effect on the law will be exponentially more dramatic, Calo says, if we ever do develop super-intelligent artificial moral agents.

“If we do that, it’s going to break everything,” he says. “It’s going to be a fundamental sea change in the way we think about human rights.”

Calo illustrates the sort of dilemma that could arise using a theoretical situation he calls the “copy-or-vote paradox.” Imagine that one day an artificially intelligent machine claims that it is a person, that it is sentient, has dignity, experiences joy and pain—and we can’t disprove it. It may be difficult to justify denying it all the human rights enjoyed by everyone else. What happens if that machine then claims entitlement to suffrage and procreation, both of which are considered fundamental human rights in our society? And what if the machine procreates by copying itself indefinitely? Our democracy would come undone if there were an entity that could both vote and make limitless copies of itself.

“Once you challenge the assumption that human beings are biological, that they live and they die, you get into this place where all kinds of assumptions that are deeply held by the law become unraveled,” Calo says. 

Despite their warnings, both Calo and Sullins believe there is reason to be hopeful that if enough thought is put into these problems they can be solved.

“The best potential future is one in which we utilize the strengths of both humans and machines and integrate them in an ethical way,” Sullins says.

There is another potential future imagined by some enthusiastic futurists in which robots do not destroy us, but rather surpass our wildest expectations. Not only are they more intelligent than us, but more ethical. They are like us—only much, much better. Humans perfected. Imagine a robot police officer that never racially profiles and a robot judge that takes fairness to its zenith. Imagine an elder-care robot that never allows grandpa to feel neglected (and somehow always convinces him to take his pills) or a robot friend who never tires of listening to your complaints. With their big brains and even bigger hearts, such robots could solve all the world’s problems while we stare at our belly buttons.

But where does that future leave us? Who are we if robots surpass us in every respect? What, then, are humans even for?

As roboticist Hans Moravec once wrote, “life may seem pointless if we are fated to spend it staring stupidly at our ultra-intelligent progeny as they try to describe their ever more spectacular discoveries in baby-talk we can understand.”

Sullins has another vision, one in which humans at least have an active role:

“These machines are going to need us as much as we need them because we have a kind of natural nuance that technology seems to lack. A friend of mine used to liken it to a bird and a 747. The plane can get you across the planet in hours, but it certainly can’t land on a treetop. These machines will be good at taking us places quickly, but once we get there, the nuanced interactions, the art of life, that’s going to take our kind of brains.”

I think at some point we’re going to see diminishing returns with brute force methods and start using more Hofstadterian investigations of analogy and other mechanisms of thought that will lead to machines that not only have a sense of themselves, and an ability to think fluidly, but are possessed of their own individual styles of responses. I think then that Sullins will find his role-for-humans stripped away. Anyways, we really need to think about how to uplift ourselves, I think, to be as clever as our robots, if we’re to keep talking to them intelligently instead of having everything explained and done for us. The question there then is what happens to an organic intelligence when it is scaled up in novel directions using synthetic means, and whether, regardless of whether we can keep machines moral, we can keep ourselves sufficiently moral, or whether we’ll find ourselves tiling the universe in paperclips instead of our machines.
As there is no software without the possibility to configure it, what are the implications of this? Our human morality already has different “modes” for how much we may hurt a human: different when caring for a child or an elderly person, when defending a child, when acting as a police officer, or as a soldier when killing suddenly becomes necessary. There will be no “don’t hurt physically” automatism. Law enforcement agencies and the military will, and somewhat must, insist on a different moral mode/configuration for their automatic tools. So what about the potential for robots and their programmers to also teach them the “bad,” to hurt humans? How will we deal with the legal implications when a robot house guard at 0400 in the morning hurts not the thief at our garden door but the neighbour who seeks help in an emergency? And beneath this, why should there be no “success” mode, which we humans also use much of the time, fostering our careers with all sorts of dirty tricks and bullying?
“His concern is bolstered by a 2013 study out of the University of Washington that showed that some soldiers working alongside bomb-defusing robots became emotionally attached to them, and even despaired when their robots were destroyed.” McDonald is referring to my dissertation, “The Quiet Professional: An investigation of U.S. military Explosive Ordnance Disposal personnel interactions with everyday field robots,” (2013, University of Washington). Thank you for the nod! - JC
After seeing a most Lifelike Nude Female sculpture at the SOFA Show in Chicago, about 25 years ago, I have measured the time it would take to create a Vacuum Cleaner that looked exactly like the Young Brigitte Bardot. Reading this is just scratching the surface. I suppose I would begin by programming the Robots with the Constitution, all the laws and sentencing guidelines and put them to work in the Judicial System, as Judges. Then let’s see how fair that is, and how it shakes out.
I have to ask that someone think of an off switch. Sorry, Data, you’re dead. I believe time will be an enemy to robots as well. If they don’t have a lifespan based on the genetic time bomb we apparently all possess, then rust and lack of upkeep should bring them down. Hopefully they won’t experience any dementia as they deteriorate. And if anyone suggests an operating system designed by Google, just kill me now.
The first job is to make natural language more rational by having computers that can identify meaning by recording possible word senses and intended references to the language usages of the author & others. The key is to control ambiguity thru interactions among authors & responders. When we start to appreciate how little people really are precise in their understanding of their own language and that of others we will start to rationalize what human intelligence is and how to communicate successfully.
The “Frame Problem” suggests that this is impossible. Do you have a way around it?
The frame problem only suggests that human-level intelligence is difficult, to me. Human thought seems to be just as impacted by difficulties in selecting what not to think about and what to think about, and a lot of our everyday mistakes reflect this. We are /better at it/ than machines. I see no evidence that humans have conquered, in an absolute sense, this problem of selecting what to think about.
It is a good review… I am looking for ideas to use in my work, a historical study about education, especially for implementing knowledge about social response and, in a collateral way, the idea of history. I find your article interesting; there are many works about this. For example, Hero of Alexandria, in his book on automata, tells us about ancient mechanics but also compares a machine with a person.
@ NOVA - The frame problem is not an issue for neural networks because only relevant, context-sensitive pathways are activated in the first place. Thus, humans do not have to determine context and relevance, since that’s precisely what networks do. They implicitly filter content by simply not activating irrelevant circuitry, imperfectly of course, but still. I have seen some strong AI attempts to overcome the frame problem, but none of them successful, at least to me.
ROBOTs are the next generation machines which can work as human. The world is expecting to see and experience ASIMOV which resembles a human. Very much thanks to our technologies and the father, Isaac Asimov, who proposed it. Please visit to get a review and dissertation on the research and review about Isaac Asimov’s book “I, ROBOT”.
I believe that the best way to prevent humans from becoming irrelevant, while making sure that robots have the same values that we do, is to “fuse” humans and robots (i.e. uploading our consciousness into a machine) instead of creating independent entities. Otherwise I don’t believe it would take them very long to realize that we are more of a threat than a friend.
This website is very helpful on my essay on robots of the future.
My few thoughts… Algorithmic game theory, mainly N-player games and mean field, plus behavioral economics, to solve for socially optimal equilibria. Thinking of thousands of robots doing things: a system could learn from millions of social interactions on the Internet, using mean field game theory, and use anomaly-based active learning.
Intelligence does not equal life. Nor do emotions. Life is a human spirit that lasts forever and has a mind and emotions. The Bible teaches morals and values and free will. It teaches us kindness to animals and respect for the earth and stewardship of all that is given to us. Robots will never have a spirit, but just like any invention they will be used for good and evil. Even programming one for good character will not keep humans bent on evil deeds from reprogramming it to do evil for them. Don’t we need to always think instead: what evil could this invention do in the hands of an evil person? The telephone used in the right way brings an ambulance, but in the wrong hands it can be used to plot a killing.
