The $1 million contest for the Netflix Prize ended Sunday with two teams in the computing version of a photo finish. The outcome, it seems, remains in some doubt.
The competition, begun in 2006, really shifted into high gear in the last month, after an international team of statisticians, machine learning experts and computer engineers declared that they had come up with algorithms that could improve the movie recommendations made by Netflix’s internal software, Cinematch, by at least 10 percent.
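For context, the contest scored entries by root-mean-square error (RMSE) on a held-out set of ratings, so "10 percent" means a 10 percent reduction in RMSE relative to Cinematch. A minimal sketch of that arithmetic (the 0.9525 baseline is Netflix's widely reported test-set figure, an assumption here rather than a number from this post):

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual star ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def pct_improvement(candidate, baseline):
    """Percent reduction in RMSE relative to a baseline system."""
    return 100.0 * (baseline - candidate) / baseline

cinematch = 0.9525           # Cinematch's reported RMSE (assumed figure)
threshold = cinematch * 0.9  # an entry had to score at or below this to qualify
```

On this scale, shaving even a few thousandths off the RMSE is hard-won, which is why a 30-day sprint at the end mattered so much.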
The announcement by that team, BellKor’s Pragmatic Chaos, set off a 30-day race, under the contest rules, for other teams to try to best them. Entries have flowed fast and furious ever since. And just minutes before the Sunday deadline, a team composed of other former teams, appropriately called the Ensemble, nudged ahead of BellKor’s on the public Web leaderboard.
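Both finalist teams were themselves mergers of earlier teams, and both publicly described their approach as blending the predictions of many individual models. A minimal sketch of such a linear blend (the models and weights here are purely illustrative, not any team's actual method):

```python
def blend(model_predictions, weights):
    """Combine several models' predicted ratings for one (user, movie) pair
    as a weighted average. Weights are assumed to sum to 1."""
    return sum(w * p for w, p in zip(weights, model_predictions))

# Two hypothetical models disagree on a rating; the blend splits the
# difference according to weights typically fit on a held-out probe set.
combined = blend([3.2, 4.0], [0.6, 0.4])
```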
So the Ensemble won, right? Not necessarily. In an e-mail message Sunday night, Chris Volinsky, a scientist at AT&T Research and a leader of the BellKor team, said: “Our team is in first place as we were contacted by Netflix to validate our entry.” And in an online forum, another member of the BellKor team, Yehuda Koren, a researcher for Yahoo in Israel, said his team had “a better test score than the Ensemble,” despite what the rival team submitted for the leaderboard.
So is BellKor the winner? Certainly not yet, according to a Netflix spokesman, Steve Swasey. “There is no winner,” he said.
A winner, Mr. Swasey said, will probably not be announced until sometime in September at an event hosted by Reed Hastings, Netflix’s chief executive. The movie rental company is not holding off for maximum P.R. effect, Mr. Swasey said, but because the winner has not yet been determined.
The Web leaderboard, he explained, is based on what the teams submitted. Next, Netflix’s in-house researchers and outside experts have to validate the teams’ submissions, poring over the submitted code, design documents and other materials. “This is really complex stuff,” Mr. Swasey said.
A leading member of the Ensemble, Domonkos Tikk, a Hungarian computer scientist, did not sound too hopeful. “We didn’t get any notification from Netflix,” Mr. Tikk said in a phone interview from Hungary. “So I think the chances that we won are very slight. It was a nice try.”
17 Comments
Without any doubt whatsoever the losing team will sue.
— Stephen Andrus

I don’t think customers really care…I know I am perfectly capable of deciding what movies I want to see. While Netflix is certainly helpful with its recommendations, I don’t think a 10% improvement is really going to affect the quality of my movie-rental experience that much.
— C

This is journalistic reporting at its worst. The author assumes that the reader is already fully aware of the Netflix contest by giving out just the barest of details about it - only focusing on the outcome. Oh wait - I take it back…the author provides links to stories that were already written. LINKS! Since when is a link to another story considered a journalistic tool for a publication such as the New York Times? Lazy, lazy!!
— Abbie

as long as netflix is not offering a meaningful porno selection, i can’t say how this algorithm will improve MY experience
— Mikey Likes It

That million dollars would have been better spent sending Hollywood’s studio chiefs back to film school.
— James

I think that it would impress Netflix’s advertisers more than their customers.
— alboyjr

Now I know how Netflix is using the Blu-Ray surcharge they are gouging me with.
— Alan

The Netflix contest has certainly focused the effort of algorithm engineers, but has it really achieved an improvement in user recommendation experience? The underlying assumption in the contest is that the algorithm can be used and scaled by Netflix to predict what their users will like for movies which they haven’t seen–usually considered the most interesting problem for recommendation systems. But based on the methodologies of the contest and the teams, it’s not clear that this will be the case. The danger of Netflix’s narrow definition of what constitutes “recommendations” is that you end up with a decent system for predicting how a user will rate a movie on a 1-5 scale.
Many open questions remain, such as:
How is the ability to predict viewer ratings (movies already seen) different from the problem of movie discovery (movies not yet seen)?
Will a 10% increase in Cinematch actually translate into noticeable product improvements for Netflix subscribers?
Does the Netflix contest ask a question with direct relevance to Netflix’s overall business goals and financial bottom-line?
Are 1-5 star ratings objective? Are there better ways of soliciting input from viewers that would yield more meaningful data with less tedium?
/Michael Papish
— michael papish, CEO, MediaUnbound
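The 1-to-5 rating-prediction task described above can be made concrete with a toy baseline predictor: global mean rating plus per-user and per-movie offsets, clamped to the star scale. This is a generic illustration of the scored task, not any contestant's actual algorithm:

```python
from collections import defaultdict

def fit_bias_model(ratings):
    """ratings: list of (user, movie, stars) triples.
    Returns predict(user, movie) -> estimated stars in [1, 5]."""
    mean = sum(r for _, _, r in ratings) / len(ratings)
    u_dev, m_dev = defaultdict(list), defaultdict(list)
    for u, m, r in ratings:
        u_dev[u].append(r - mean)
        m_dev[m].append(r - mean)
    u_bias = {u: sum(d) / len(d) for u, d in u_dev.items()}
    m_bias = {m: sum(d) / len(d) for m, d in m_dev.items()}

    def predict(user, movie):
        est = mean + u_bias.get(user, 0.0) + m_bias.get(movie, 0.0)
        return min(5.0, max(1.0, est))  # clamp to the 1-5 star scale

    return predict
```

Even a crude model like this predicts ratings for movies a user has never rated, which is exactly the gap between “rating prediction” and genuine “discovery” that the comment raises.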
Haha. (abbie) Well, actually the readers do know what they’re talking about because we follow tech news on this biz/tech blog. Also, links are the ubiquitous currency of the internet; they seem to be the appropriate tool for a blog post. Blogs aren’t meant to be long Sunday magazine articles.
— tamonline

I hope Netflix puts the algorithm to work as soon as possible. If I have to check Not Interested on one more Katt Williams video, I’m going to lose it.
— Satorical

The claim by a commenter that this is “journalistic reporting at its worst” is ridiculous.
This is a blog entry, not an article. Different standards apply, as blogs are by convention briefer and more informal than articles.
For an ideal comparison, just read the companion article to this blog, written by the same author:
http://www.nytimes.com/2009/07/28/technology/internet/28netflix.html

— Jian
On their forum, Netflix states that both teams met the minimum requirement to win the prize. The actual performance difference on the “test” set is probably around 0.01% - statistically insignificant. This is a TIE!
— Insider

They should do this for dating sites to match people and save billions of dollars from unnecessary divorces and broken hearts and negatively affected children.
Who cares about movie recommendations?
— Miss Understood

I’m encouraged by the exceptional quality of work that the teams put in, no doubt fanned not only by fame; the $1 million in spoils certainly doesn’t hurt either.
We’ve heard so many times before about the power (or wisdom or whatever) of the crowd. What we mostly see is simply silly, silly stuff (e.g. star gazing and celebrity chasing on Twitter, highly opinionated but otherwise useless blogs). That’s why this Netflix competition is such a breath of fresh air.
To top it off, Netflix doesn’t skimp like Google does. After all, Netflix will very likely use the winner’s algorithm. Remember what Google did, telling illustration artists to submit their work to be used by Google as wallpaper for free? Now, THAT is what I call cheap.
— tiddle

They should do the same for Congress, to see if we improve! : )
— Chris

Mike of [generic media company] wrote:
“Does the Netflix contest ask a question with direct relevance to Netflix’s overall business goals and financial bottom-line?”
Yes, because the system is not only about recommending better choices, it’s about *not* recommending relatively worse choices. A customer gets half-way through a movie that doesn’t really interest him, decides he’ll get back to the rest when he can, ultimately sends the DVD back, unfinished. But that may be two weeks later, during which time that copy is otherwise unavailable, and Netflix still has to eat the postage.
The good recs are still good recs. The real payoff comes with reducing the bad recs. That customer above would have been just as happy with 9 movies as with 10, given the 10th was (for him) a clunker. (Or yeah, an improved rec for the 10th movie would make him happier.)
— Yasha

THE LOSER WILL WIN — more than $1M. Much like the winner of American Idol is tied into a so-so contract, the winner of this contest must make all their algorithms publicly available. They can license them to non-netflix firms but that assumes they can turn them into patents or other protected IP. The “loser” does not have to reveal their “secrets” (although there is certainly significant overlap w/ the winning algorithms) and pawn them off as “trade secrets” or even keep them in-house and certainly get a few major contracts.
In the end I won’t be surprised if Netflix gives both teams $1M (but will the second team need to reveal its codes?)
Paul Hamilton
— pvh