We have been following innovative uses of social media in the biotech and healthcare industries here on the blog. Recently, a comprehensive paper was published in PLoS ONE outlining the use of “infoveillance” tools on the web to track the public response to the H1N1 epidemic. Dr. Gunther Eysenbach and Cynthia Chew, both researchers at Toronto’s Centre for Global eHealth Innovation, mined and archived over 2 million Twitter posts between May 1 and December 31, 2009. After carrying out an in-depth analysis of these “tweets”, they validated Twitter as an effective medium for capturing real-time content, sentiment, and public attention trends. Infoveillance methods include the mining, aggregation, and categorization of online text, and together they form the toolkit for the new study of “infodemiology”. In the paper, Eysenbach points out that Twitter is particularly amenable to textual mining and analysis because of the concise nature of the tweets users share with their followers.
The concept of infodemiology began to crop up in the early 2000s, when several researchers started exploring the idea of using Internet health-related searches to provide epidemiological data that could inform public health. In 2004, Eysenbach became interested in tracking flu-related searches using online syndromic surveillance. Historically, syndromic surveillance systems have relied on data from patient encounters with health professionals. But what if we could track the health concerns of citizens before they ever see a physician? It turns out we can. The study of infodemiology on the World Wide Web has the potential to provide automatic, continuous, and virtually real-time snapshots of public opinion and behavioural trends. In the context of public health, this means capturing public health concerns at their earliest stages and even predicting major influenza pandemics weeks before they happen.
In 2006, Eysenbach published findings from a rather clever experiment he had completed over the 2004/2005 flu season. In order to track the number of people in Canada searching for either “flu” or “flu symptoms”, he created an ad “campaign” through Google’s keyword-triggered advertising program. So that the ads served a plausible purpose, the flu keywords led searchers to an advertisement that, once clicked, linked them to a generic patient-education website. Eysenbach then gathered FluWatch data, including influenza cases, positive lab test results, and influenza-like illness reported by sentinel physicians (“ILI-SPR”) around the country, and correlated these disease surveillance metrics with his Google advertisement data. Incredibly, Eysenbach found that clicks on the flu advertisement he had created correlated more strongly and in a more timely fashion (statistically significant on both counts) with influenza cases and positive lab test results than did the ILI-SPR data. In a nutshell, his online experiment was more accurate at predicting rises in influenza cases than was the nationwide sentinel physician program.
Three years later, in 2009, a Google-funded research paper was published in Nature describing an influenza surveillance system that piggybacks on the popularity of certain Google search queries. The model underlying the surveillance system was generated by processing hundreds of billions of individual Google searches stored in web search logs. (Despite Eysenbach’s earlier work being cited in the Google paper, most journalists failed to recognize that Google’s idea wasn’t entirely new.) Today, Google Flu Trends can be accessed online to follow worldwide estimates of influenza-like activity. Google may have to go back to the drawing board and refine its model, however, as a new study published this past summer casts some doubt on the system’s accuracy. Researchers at the University of Washington evaluated Google Flu Trends against the gold standard of positive influenza virus infection, and its accuracy came up short: about 25% short.
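The Nature paper’s model, at its heart, linearly relates the log-odds of the influenza-like-illness physician-visit percentage to the log-odds of the flu-related query fraction. A toy sketch of that functional form, with made-up coefficients rather than the paper’s fitted values, might look like this:

```python
from math import exp, log

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return log(p / (1 - p))

def inv_logit(x):
    """Inverse of logit: maps any real x back to (0, 1)."""
    return 1 / (1 + exp(-x))

# Illustrative coefficients only -- not the fitted values from the paper.
beta0, beta1 = 1.5, 1.0

def estimate_ili(query_fraction):
    """Toy estimate of the ILI visit percentage from the fraction of
    searches that are flu-related, via the logit-linear model form."""
    return inv_logit(beta0 + beta1 * logit(query_fraction))
```

The appeal of this form is its simplicity: a single query-fraction signal, transformed to log-odds, drives the whole estimate, which is also why the model can drift when search behaviour changes faster than the coefficients are refit.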
The new H1N1 Twitter study reaffirms Eysenbach’s status as a visionary in the field of infodemiology. With “Web 2.0” upon us, and tsunamis of user-generated content flooding the web, the Internet “has made measurable what was previously immeasurable”, in Eysenbach’s words. What we could not measure 10 years ago, owing to the (comparatively) static nature of the Internet, is now readily measurable with infoveillance tools. In the context of H1N1, Eysenbach says:
“H1N1 marks the first instance in which a global pandemic has occurred in the age of Web 2.0 and presents a unique opportunity to investigate the potential role of these technologies in public health emergencies.”
To analyze tweet content in the H1N1 study, Chew and Eysenbach used an open-source infoveillance system known as Infovigil (Eysenbach’s own creation) that automatically and continuously dissects textual information from Twitter. They created a “codebook” with three primary variables: 1) tweet content, 2) mode of expression, and 3) type of link posted, if any. Each of these categories had several subcategories, allowing for good separation of different tweet “types”. The study produced some interesting findings. Over its duration, the relative proportion of tweets using “H1N1” increased from 8.8% to 40.5%, indicating that the public gradually adopted the WHO-recommended terminology in place of “swine flu”. With respect to tweet content, personal accounts of H1N1 increased over time while humorous content declined, suggesting that the public’s perception of the subject grew more serious. Public attention spiked at certain points, most notably following the WHO’s pandemic level 6 announcement on June 11, 2009, which gave rise to a large surge in tweets. Only 4.5% of tweets were identified as misinformation.
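The terminology-tracking part of such an analysis can be sketched very simply: tag each tweet against a small set of keyword rules and report the relative proportion of each term. The rules below are illustrative, not Infovigil’s actual codebook:

```python
import re

# Toy keyword rules in the spirit of the study's codebook
# (illustrative only -- the real codebook was far richer).
TERMS = {
    "h1n1": re.compile(r"\bh1n1\b", re.I),
    "swine flu": re.compile(r"\bswine\s+flu\b", re.I),
}

def term_proportions(tweets):
    """Fraction of term matches attributable to each tracked term."""
    counts = {term: 0 for term in TERMS}
    for tweet in tweets:
        for term, pattern in TERMS.items():
            if pattern.search(tweet):
                counts[term] += 1
    total = sum(counts.values()) or 1  # avoid dividing by zero
    return {term: n / total for term, n in counts.items()}

# Invented sample tweets, purely for illustration.
tweets = [
    "Just got my H1N1 shot",
    "swine flu is everywhere",
    "WHO raises H1N1 pandemic level",
]
props = term_proportions(tweets)
```

Computing these proportions week by week over the study period is what yields trend lines like the 8.8%-to-40.5% shift in “H1N1” usage the authors report; the real system also had to handle retweets, sarcasm, and tweets matching multiple categories.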
Overall, the study is a nice proof of concept and demonstrates that Twitter is a rich source of public opinion for health authorities. In the future, infoveillance can be used not only to capture sentiment, experiences, and behavioural trends, but, importantly, to track misinformation and identify the informational needs of the public. More studies of this kind should elucidate the value social media will hold for knowledge translation research and help refine the precision and accuracy of infoveillance tools for future infodemiology studies.