Findings

How to read the future

Telling the future with Twitter

Gregory Nemec

Even as the world becomes more dependent on Big Data, Big Data is becoming more elusive, at least when it comes to social media and other forms of online information. The explosive growth of online data is making full-scale analysis increasingly difficult; meanwhile, privacy concerns are restricting access to information. But if you choose carefully, says Professor Nicholas A. Christakis ’84, you can find small data sets that will provide reliable information on trends—before they’ve become trends.

Working with an international team, Christakis, the Sol Goldman Family Professor of Social and Natural Science and codirector of the Yale Institute for Network Science, analyzed six months of Twitter data recorded in 2009. The researchers picked a small set of Twitter users at random as the control group. To form the study group, they randomly chose Twitter users from among the followers of the control group. Randomly chosen “friends,” Christakis explains, are more popular than randomly chosen individuals. “These popular people have more connections in a system than people at random,” says Christakis. “We discovered that they, like canaries in a coal mine, could serve as sensors who could provide early warning signals of what would be popular on Twitter well in advance of when trends became obvious.”
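The sampling idea can be sketched in a few lines of Python. This is only an illustration on an invented toy network, not the researchers' code: the graph, the group size, and the function names `sample_sensors` and `mean_degree` are all assumptions, and the network is treated as undirected for simplicity, whereas Twitter's follow relation is directed.

```python
import random
from statistics import mean


def sample_sensors(graph, control, rng):
    """Pick one randomly chosen neighbour ("friend") of each control user."""
    return [rng.choice(sorted(graph[user])) for user in control if graph[user]]


def mean_degree(graph, users):
    """Average number of connections among the given users."""
    return mean(len(graph[user]) for user in users)


# Invented toy network: one hub (user 0) linked to nine users
# who have no other connections.
graph = {0: set(range(1, 10))}
for leaf in range(1, 10):
    graph[leaf] = {0}

rng = random.Random(0)
control = rng.sample(sorted(graph), 4)         # randomly chosen individuals
sensors = sample_sensors(graph, control, rng)  # randomly chosen friends

print("control mean degree:", mean_degree(graph, control))
print("sensor mean degree: ", mean_degree(graph, sensors))
```

In this toy network almost every random pick is a poorly connected user, but a random friend of such a user is always the hub, so the sensor group ends up better connected than the control group.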

The team then chose 200 hashtags and monitored the number that went viral during the period under study. They expected that the “sensor” group might have started using these hashtags a few hours before the control group; instead, the sensor group’s usage predicted viral outbreaks an average of nine days ahead of the control group’s. In some cases the lead time was dramatic. The model predicted the ascendancy of “#Obamacare” a full two months before it became a Twitter phenomenon.
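One simple way to express a lead time like this is to compare when each group first used a hashtag. The sketch below is a hedged illustration, not the measure the team necessarily used: the function name `lead_time_days` is hypothetical and the day values are invented toy numbers.

```python
from statistics import mean


def lead_time_days(sensor_first_use, control_first_use):
    """Mean first-use day of the control group minus that of the sensor group.

    A positive value means the sensor group adopted the hashtag earlier.
    """
    return mean(control_first_use) - mean(sensor_first_use)


# Invented toy data: the day on which each user first used a hashtag.
sensor_days = [2, 4, 6]
control_days = [5, 7, 9]

print("lead time in days:", lead_time_days(sensor_days, control_days))
```

Averaging such lead times over many hashtags would give a single summary figure of the kind reported above.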

The technique, Christakis says, may be powerful enough to provide “improved detection of such important social contagion phenomena as epidemics, movements in the stock market, consumer behavior, and political movements.”
