Why a 12-year-old forecasting paper has stood the test of time

A. 

We used the machine learning of 12 or 13 years ago, which has since been greatly surpassed, so I think that has been less of a lasting contribution. But if you have a machine learning model, you need to be able to validate it — in a supervised-learning problem, say, to figure out how good the labeling process is. But how do you evaluate an algorithm that gives warnings about the future?

We were supposed to give warnings that, for example, these political parties would increase their vote share the most in the coming election in this district. Suppose we're partly correct: what kind of score should we give such a warning? Maybe one of those parties heavily advanced its vote share, but the other did not.
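One simple way to formalize "partly correct" is a partial-credit score: the fraction of a warning's claims that are borne out by events. The attributes and values below are illustrative, not the scoring rubric from the paper.

```python
def warning_quality(predicted, actual):
    """Fraction of a warning's claims borne out by events (partial credit).

    `predicted` and `actual` map each party to whether its vote share
    rose in the district; the keys and semantics are illustrative.
    """
    claims = list(predicted)
    correct = sum(predicted[p] == actual.get(p) for p in claims)
    return correct / len(claims)

# The warning called one party's surge correctly but not the other's,
# so it earns half credit.
score = warning_quality(
    predicted={"party_a": True, "party_b": True},
    actual={"party_a": True, "party_b": False},
)
# score == 0.5
```

A real rubric would also weight how close the warning was in time and place, but the idea of grading each claim separately and averaging is the same.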

One insight that I think has stood the test of time is to not look at individual warnings but to look at a whole temporal sequence of warnings and another temporal sequence of actual events and try to match these up in as coherent a manner as possible. This is called a bipartite matching, and you find a maximum-weight matching between warnings and events so that you get a coherent idea that this event can be most attributed to that warning and vice versa. In order to respect temporal considerations, you would ideally like the matching to be non-crossing.
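Because both sequences are ordered in time, the non-crossing maximum-weight matching can be found with a classic sequence-alignment dynamic program: a matching is non-crossing exactly when matched index pairs increase together. Here is a minimal sketch, with match scores invented for illustration (the paper's actual optimization may differ):

```python
def noncrossing_matching(quality):
    """Maximum total score of a non-crossing matching between a
    time-ordered list of warnings and a time-ordered list of events.

    quality[i][j] is the match score between warning i and event j
    (0 means "do not match"); scores here are purely illustrative.
    dp[i][j] = best total score using the first i warnings and the
    first j events; each step either leaves one side unmatched or
    pairs warning i with event j, which can never cross an earlier pair.
    """
    n = len(quality)
    m = len(quality[0]) if quality else 0
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j],                               # skip warning i
                dp[i][j - 1],                               # skip event j
                dp[i - 1][j - 1] + quality[i - 1][j - 1],   # match i with j
            )
    return dp[n][m]

# Warnings w1..w3 vs. events e1..e3, with made-up scores:
q = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.0, 0.7, 0.6]]
best = noncrossing_matching(q)
# Best non-crossing choice pairs w1-e1, w2-e2, w3-e3: 0.9 + 0.8 + 0.6 = 2.3
```

Without the non-crossing constraint, a general maximum-weight bipartite matching (e.g. the Hungarian algorithm) would be used instead; the DP above is the temporally constrained variant.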

Three approaches to matching warnings (w1–w7) and events (e1–e7): weighted matching; maximum-weight bipartite matching; and non-crossing maximum-weight bipartite matching. Figure adapted from “‘Beating the news’ with EMBERS: Forecasting civil unrest using open-source indicators”.

A second contribution that I think has been useful is issuing not just one alarm but multiple alarms, because we have complementary algorithms with different skills. But you don't want a cacophony of alarms. You want to somehow fuse them, and our fusion methodology uses some simple Bayesian ideas about our prior belief about the different strengths of our algorithms, the costs of using them, and so on. And of course, you keep updating those beliefs according to Bayes's rule.
