In his latest New Yorker article, Malcolm Gladwell profiles a group of shady entrepreneurs who claim to have devised an algorithm that can predict which movies will become blockbusters. They simply "interpret" the script, breaking it down into a discrete list of variables, and then plug those variables into their mainframe. A few hours later, a prediction pops out. Voila.
Does such a program have a chance of working? I'm doubtful, and not only because Gladwell never reveals the company's overall success rate. Instead, we learn about a few of their "uncanny" successes: they correctly predicted that "The Interpreter" would bomb, and that "Lethal Weapon" would be huge. Hollywood releases over 500 movies a year: it would be shocking if an algorithm weren't occasionally exactly right.
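To make that base-rate point concrete, here's a toy simulation (all the numbers are made up) of how many "uncanny" hits an algorithm with zero real skill racks up just by flagging scripts at random:

```python
import random

random.seed(42)

N_MOVIES = 500   # roughly how many movies Hollywood releases a year
HIT_RATE = 0.2   # assume ~20% of releases turn out to be hits (invented figure)
N_FLAGGED = 50   # the "algorithm" flags 50 scripts as future blockbusters

# An algorithm with no skill at all: it flags scripts at random.
is_hit = [random.random() < HIT_RATE for _ in range(N_MOVIES)]
flagged = random.sample(range(N_MOVIES), N_FLAGGED)

lucky_calls = sum(is_hit[i] for i in flagged)
print(f"Pure chance still 'predicted' {lucky_calls} blockbusters correctly")
```

Even blind guessing gets a handful of correct calls every year, which is why a few anecdotes tell us almost nothing without the overall hit rate.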
And I'm not doubtful because similar algorithms haven't been useful elsewhere. For example, back in the '70s, Dr. Lee Goldman put together a very simple decision tree that helped ER doctors diagnose heart attacks. (You'd think a heart attack would be easy to diagnose, but it isn't: up to 8% of heart attack patients are sent home by their doctor. An even larger percentage of patients are treated for heart attacks that never happened.) When doctors are left to make their own decisions, they are right about 80% of the time. When they follow Goldman's simple algorithm, they are right more than 95% of the time.
Why are doctors so bad at diagnosing heart attacks? Because they take too much information into account. As Daniel Kahneman and Amos Tversky first observed in 1974, in their seminal paper "Judgment under Uncertainty: Heuristics and Biases," irrelevant information can often lead to decision-making mistakes. This is why Goldman's heart attack algorithm worked so well: instead of taking all the "relevant" factors into account (patient history, amount of pain, etc.), the algorithm looked at just four factors: the ECG, blood pressure, fluid in the lungs, and unstable angina. That was it. Decisions dramatically improved when excess information was excluded.
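For the curious, here's roughly what a rule that small looks like as code. To be clear, this is a toy sketch built only from the four factors named above: the thresholds and triage levels are invented for illustration, not Goldman's actual clinical criteria.

```python
# A toy sketch of a Goldman-style decision rule, using only the four
# factors mentioned above. Thresholds and triage levels are invented
# for illustration; this is not the real clinical rule.

def triage(ecg_shows_ischemia: bool, systolic_bp: int,
           fluid_in_lungs: bool, unstable_angina: bool) -> str:
    """Coarse triage decision based on four inputs and nothing else."""
    risk_factors = sum([systolic_bp < 100, fluid_in_lungs, unstable_angina])
    if ecg_shows_ischemia and risk_factors >= 2:
        return "intensive care"
    if ecg_shows_ischemia or risk_factors >= 1:
        return "monitored bed"
    return "send home with follow-up"

# Patient history, amount of pain, age, weight: deliberately ignored.
print(triage(ecg_shows_ischemia=True, systolic_bp=92,
             fluid_in_lungs=False, unstable_angina=True))
```

The whole point is what the function *doesn't* ask for.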
As Gladwell notes in his article, similar algorithms have been used to profitable effect in dog racing. Hedge funds use them to predict currency swings. Billy Beane mined statistics to build his baseball roster. The list goes on and on.

So why don't I think these algorithms - Gladwell calls them artificial neural networks - will reliably predict the success of Hollywood movies? The big obstacle comes during the step Gladwell glosses over: the act of interpretation. When you are measuring someone's blood pressure, the information is obvious. No hermeneutics required. A movie script - or any equivalent artwork - isn't like that. For one thing, language is slippery and imprecise. People don't agree on whether a character is believable or whether a plot is compelling. However hard this Hollywood algorithm tries to get around these subjective obstacles, its interpretation of a script will never yield data as accurate as the numbers produced by a few simple medical tests. In fact, every area where these algorithms have proven effective - baseball, financial markets, dog racing - offers an enormous amount of accurate statistical information that requires no interpretation. Nobody has to decipher a hitter's on-base percentage.
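To see why the quality of the inputs matters so much, here's a toy model of the measurement problem. The assumptions are all mine: a one-dimensional "quality" signal, Gaussian noise standing in for disagreement between readers, and success defined as quality above zero.

```python
import random

random.seed(0)

def accuracy(measurement_noise: float, n: int = 100_000) -> float:
    """Fraction of correct hit/flop calls when the true quality signal
    is observed through measurement noise of the given size."""
    correct = 0
    for _ in range(n):
        quality = random.gauss(0, 1)                    # true, hidden quality
        observed = quality + random.gauss(0, measurement_noise)
        correct += (quality > 0) == (observed > 0)      # predict a hit if observed > 0
    return correct / n

print(f"objective inputs (low noise):    {accuracy(0.1):.0%} correct")
print(f"interpreted inputs (high noise): {accuracy(2.0):.0%} correct")
```

The predictor itself never changes; only the inputs get noisier, and accuracy collapses toward a coin flip. That's the blood-pressure-versus-script distinction in miniature.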
So I'm skeptical that movies will ever become quantifiable. Predicting the success of blockbusters is notoriously difficult, and I'm guessing that a computer won't save us. Too much information is bad. But so is flawed information, especially when it's all you've got.
P.S. That said, Gladwell's article is gorgeously written, and absolutely worth a read.