Do “preference algorithms” really know you?
It’s 1983. A young woman looks in a store window and sees a new book from an author she likes. She wanders in and leafs through it. A store clerk gives his opinion. Another customer leans over with her recommendation. The woman buys the book and leaves without a trace.
Fast forward to 1999. This time, the woman reads a magazine article about a new book from a favorite author. She then goes online to read other reviews. She debates whether she’ll have time to go to the bookstore to buy it, and decides instead to click the convenient link someone posted and buy the book at Amazon.com.
Several years later, she has become so comfortable with ordering books—and music and boots—on the Internet that quite a few companies have taken note of what she likes.
In exchange for the convenience of having goods delivered to her door, this shopper has bartered her personal information. Soon, the stores and organizations she frequents online know enough about her predilection for a certain kind of book—and song, and boot—that they can recommend something to her and tell her how to buy it. (The bookstore has long since closed.)
Finding what we want, at least online, is now just this easy thanks to what are called “recommender systems.” More and more companies, from Hulu for television, Spotify for music, and Match.com for dates, to Amazon for everything, use algorithms to predict what we are looking for. As their use becomes more prevalent, our computers seem to know what we want before we know it ourselves.
Where the first phase of the Internet was about access to a vast sea of information, and Web 2.0 was about the growth of social media, this new personalized version of the Internet—occasionally called Web 3.0—aims to help users navigate that ever-rising sea of data. Recommender systems not only offer us navigation tools, they promise us calm seas and a happy landfall, with all the hard work of narrowing our choices done for us.
Marketers see recommender systems’ potential to gauge consumer tastes and more cheaply sell goods and services. But by identifying what a customer likes and offering up more of the same, the sellers run the risk of defining taste in a very limited way. And for the consumer there is the uncomfortable question: Is it my personal tastes or is it the Internet taste engines that are driving my buying decisions?
Google has for years used similar technology to post individualized ads based on your previous search habits, and all sorts of organizations now use personal information to help forecast users’ taste. When Pandora launched in 2005, its method of picking tunes based on current preferences caught on with music aficionados. Other companies figured out that they could aggregate all kinds of personal data to predict what people might want.
Most “taste algorithms” fall into two categories: content-based and unconstrained. To develop Pandora’s content-based algorithm, a real live staff person has listened to each song in its database and analyzed it for certain components in order to put it into a category, such as “songs featuring an acoustic guitar” or “songs that shift from minor key to major key.” If a user says she likes a certain song or musician, Pandora will point her to other songs with similar characteristics, which doesn’t necessarily broaden her musical horizons.
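For readers who want to see the idea in miniature, here is a rough sketch of content-based matching in Python. It is not Pandora's actual method: the catalog, the tag names, and the helper functions `jaccard` and `recommend_similar` are all invented for illustration, and the overlap score used (Jaccard similarity) is simply one common way to compare sets of tags.

```python
# Hypothetical catalog: each song carries tags assigned by a human listener.
CATALOG = {
    "song_a": {"acoustic guitar", "minor to major", "female vocal"},
    "song_b": {"acoustic guitar", "minor to major"},
    "song_c": {"acoustic guitar", "female vocal", "slow tempo"},
    "song_d": {"distorted guitar", "fast tempo"},
}


def jaccard(a, b):
    """Share of tags two songs have in common (0.0 = none, 1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_similar(liked, catalog, top_n=2):
    """Return up to top_n songs whose tags overlap most with the liked song."""
    liked_tags = catalog[liked]
    scored = [
        (jaccard(liked_tags, tags), title)
        for title, tags in catalog.items()
        if title != liked
    ]
    # Highest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Songs with no shared tags are never recommended.
    return [title for score, title in scored[:top_n] if score > 0]


print(recommend_similar("song_a", CATALOG))
```

Because a song with no shared tags can never surface, the sketch also shows why this approach tends to keep a listener inside familiar territory.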
“Essentially you get into a smaller and smaller pool that shares those specific requirements,” said Clayton R. Critcher, assistant professor of marketing at Berkeley’s Haas School of Business, who studies recommender systems.
In contrast to Pandora’s technique, Amazon and Netflix use unconstrained algorithms, which create recommendations based on the preferences of other users with similar taste, an approach known as “collaborative filtering.” If you liked A Room with a View, Netflix won’t send you only to costume dramas featuring English people acting foolish while falling in love in Italy. It will send you to other movies enjoyed by a certain number of people who loved A Room with a View, even the occasional action movie.
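The "people who liked this also liked that" logic can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the algorithm Netflix or Amazon runs: the viewers, their lists of liked films, the function name `also_liked`, and the `min_fans` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical data: the films each viewer has said they liked.
LIKES = {
    "ana": {"A Room with a View", "Howards End", "Die Hard"},
    "ben": {"A Room with a View", "Die Hard"},
    "cy": {"A Room with a View", "Howards End"},
    "dee": {"Alien", "Die Hard"},
}


def also_liked(title, likes, min_fans=2):
    """Films liked by at least min_fans of the people who liked `title`."""
    counts = Counter()
    for liked in likes.values():
        if title in liked:
            # Tally every other film this fan liked.
            counts.update(liked - {title})
    # Most widely shared first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [film for film, fans in ranked if fans >= min_fans]


print(also_liked("A Room with a View", LIKES))
```

Nothing in the code knows what a costume drama or an action movie is. The action film appears only because enough fans of A Room with a View happened to like it too, which is also why such a system can be accurate without being able to explain itself.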
In 2006, Netflix famously offered a $1 million prize to the team that was able to improve its recommender algorithm by 10 percent. Unconstrained taste engines like Netflix’s tend to be the most accurate, according to Critcher, “even if they can’t come up with the exact reason the predictions are there.”
But not all prediction engines are that effective at nudging people outside their comfort zone. Eugenio Tacchini, a Ph.D. candidate specializing in recommender systems at the University of Milan, Italy, thinks most of the systems rely too much on what people have liked in the past. “They usually don’t take into consideration that taste evolves with time. In particular they don’t usually stimulate a user’s latent interest. A user can be very focused on heavy metal, but he may have a potential for interest in jazz,” said Tacchini.
Tacchini is working on a system that he hopes will broaden people’s appreciation of music to unexpected genres or styles that may appear to be dramatically outside of their taste. Now a visiting scholar at Berkeley’s Laboratory for Automation Science and Engineering, he is working under its director Ken Goldberg to develop a collaborative filtering system for music, using data from Last.fm, a London-based music service similar to Pandora. Tacchini has so far classified 3,000 musicians into categories he calls “musical worlds”; his goal is to categorize 30,000 artists in all.
He has grouped music into 36 musical worlds, using classifications more refined than those you’d find in a typical record store. For instance, he has two separate hip-hop groupings. “It seemed like a duplication, but if you look inside the list you realize that some of them are much more indie and the others are more mainstream. Using collaborative filtering, we found that they were different because they had different audiences,” he said. With that knowledge, his system may be able to point people who have other musical tastes in common with the indie hip-hop fans to that particular subgenre of music, even if they never were interested in mainstream hip-hop.
If Tacchini’s work suggests that a taste engine can cross genres, what about products? Is there some commonality that suggests that just because we like apples, we may be more likely to also buy, say, red boots? Yes, says the New York firm Hunch.
Hunch brings its own algorithms and consumer research together with social media recommendations in what it calls a “taste graph” covering a wide range of interest areas and people. Thanks to relationships with Facebook and Twitter, among others, Hunch can predict the taste preferences of 500 million people across 200 million items, said Hunch VP of business development, Shaival Shah ’97.
Hunch gathers its data in different ways. Users who log on to Hunch.com are asked to demonstrate their choices in everything from politics to personal grooming by answering a number of questions, such as “Are you a Mac or PC person?” They can also rate and recommend individual consumer products, and the average user answers more than 100 questions, said Shah. But Hunch gets even more data when its users connect to Hunch through Facebook and authorize the company to gather their Facebook “Likes,” as well as those of their Friends.
In return for providing data to Hunch, users receive uncannily accurate predictions for new things they might enjoy, from restaurants to well-designed websites. I’ve received weekly emails from Hunch that often list books I just finished reading or household products I’ve toyed with purchasing. There have also been plenty of surprising items I’ve been happy to learn about, such as Bacsac, a French maker of fabric planters that seem to be used by people who wear oversize sunglasses.
“When you look at only your movie preferences, you can only get so much prediction. Things like fashion interests or book preferences provide a broader perspective,” said Shah.
Offering a random example, he said, “People who tend to love leather also tend to love movies with a dramatic finish that also have a romantic story.” As for the PC people, they read CNN.com, while Mac-identified folks opt for The Huffington Post.
More than 300 companies use Hunch’s data, including Showtime television and the medical research fundraiser Stand Up To Cancer. Shah suggests other possible applications for Hunch in addition to offering clients market research, such as helping stores determine what products to stock based on the demographic and psychographic information that Hunch gathers from a brand’s Facebook fan page or Twitter handle.
He also sees other ways the system can help consumers who have shared their tastes for fashion, art, and even music with Hunch. “Let’s say you go onto a furniture website. You want to buy a piece of furniture, but you’ve never ‘Liked’ a piece of furniture on Facebook,” said Shah. He foresees that Hunch will be able to guide you to items in the store that you’ll like based on your demonstrated taste in other areas.
And we give out all this data about ourselves, our friends, our family in exchange for a more personalized web experience. But the question remains whether these taste engines are reflecting our taste or driving it. “There’s always this tension noted in initial preferences and advice from other people,” Critcher said. Where salespeople once nudged us into a purchase, the online reviews have moved in, he added. Now we have recommendation engines doing the nudging. “We don’t know how this will morph.”
To determine how much these systems are affecting consumer buying habits, Critcher is conducting a study that looks across 40 product categories, from apartments to jewelry. He is hoping to discern when consumers go with their initial hunch, or when they follow predictions from recommender systems or ratings from other users.
“One hunch we have,” he said, is that consumers will trust a prediction when purchasing inexpensive goods. But if the person is shopping for high-end products, he is more likely to make the decision on his own. That may change, however, as people grow to have more trust in predictions from companies.
“One thing that hurts these predictions is how impersonal they are,” said Critcher. Anecdotes from friends are still considered more valuable, he added. “More depersonalized systems that draw on tons of data have the potential to have the best predictions. But it may take social media to make those more appealing.”
Most predictor engines have been shown to be pretty accurate at determining what you might buy. But as Critcher points out, you might buy something because you were told you would like it. The only way companies can confirm that the recommendation was accurate is when consumers go back and rate the products.
Berkeley Professor Ken Goldberg would like to see a more expansive approach to recommendations, starting with the way users’ opinions and tastes are collected. He is critical, for example, of the prevalent use of star ratings.
“When you’re rating or evaluating something like a book or a movie or a song, you’re doing something that’s a matter of taste. I think it’s not easily pigeonholed into a series of boxes,” said Goldberg. “Matters of taste are almost physiological. It’s literally taste—part of your digestive system. Or we talk about a gut reaction to a song.”
Goldberg’s lab released its first collaborative filtering algorithm in 1999, called Eigentaste, named for the mathematical term eigenvector. He has created a system for jokes, called Jester, and one for charitable causes, called Donation Dashboard. His latest is being used by the Department of State to gather opinions on political and social issues from around the world, in an application called Opinion Space. “We realized we could generalize the idea of rating books and movies to rating almost anything—that it’s possible to rate a statement,” said Goldberg.
Opinion Space, using a type of algorithm similar to that used by Netflix, tries to gather information in an arguably more nuanced way. Someone taking the State Department survey can see his viewpoints represented in a graphical display that looks like a galaxy, with similar opinions clustered together like large stars. He moves his cursor along a slider to express his opinion and can see where it falls in the spectrum. “Making it more physical is a much more natural way of responding and conveying your taste,” said Goldberg, who also has pilots with General Motors, Humana, and Unilever.
Goldberg sees an opportunity for research in finding better ways to predict people’s tastes based on their moods. He uses food as one example. “Some days you’re really up for something super gourmet like French Laundry; other days you want comfort food. You don’t just want one thing,” he said. Until an algorithm can capture that, “it blurs you into this sort of singular personality that doesn’t really match you at all,” said Goldberg.
For now, recommendations on the Internet may continue to create a creeping suspicion that you are not the unique person, with distinctive tastes and ideas, that you thought you were. Goldberg hopes to provide an appealing alternative with gut-level approaches like Opinion Space.
“In all these previous systems, there’s this collective wisdom of crowds, but it’s buried in the system,” he said. “It’s a beautiful spectrum. Instead of being concerned that you get typecast or that you’ve lost individuality, I feel kind of the opposite. It shows an incredibly diverse range.”