AI Predicts Wuhan Epidemic: How Can Startups Conquer AI-Based Epidemic Forecasting?
-
In 2020, the COVID-19 outbreak, which caused widespread panic, was actually predicted in advance by a startup using an AI monitoring platform. Many may wonder: How was this prediction made, and how can the underlying technology be further utilized? This article offers some insights and reflections.
Predicting the unknown has always been a coveted human ability.
From ancient Chinese divination methods like the I Ching and the Tang Dynasty's "Tui Bei Tu" to Western astrology and the medieval popularity of tarot cards, humanity has long sought ways to foresee the future.
More recently, the global frenzy and commercial hype surrounding the Mayan prophecy of the "2012 apocalypse" remain fresh in our memories.
Today, the era of "seeking answers from the supernatural rather than the people" has passed. We have become adept at making deterministic, empirical, and even probabilistic predictions about the physical world and socio-economic trends. However, when it comes to highly complex, multi-variable, and large-scale predictions—such as those described by the "butterfly effect"—are humans still powerless?
The answer is no.
Recently, the outbreak of the novel coronavirus in Wuhan, China, drew intense attention from the World Health Organization and global health agencies. Among the reports, Wired magazine highlighted how a Canadian company, BlueDot, used its AI monitoring platform to predict and announce the infectious disease outbreak in Wuhan ahead of others, garnering widespread media coverage in China.
This seems like the kind of achievement we most desire in "predicting the future"—leveraging big data and AI inference, humans appear to decipher "divine will," uncovering causal patterns hidden in chaos to potentially avert disasters before they strike.
Today, we explore how AI has progressed toward "divine foresight" by examining its role in epidemic prediction.
Using AI to predict infectious diseases is not BlueDot's exclusive innovation. In fact, as early as 2008, Google, now a powerhouse in AI, made a less successful attempt.
In 2008, Google launched a system to predict flu trends—Google Flu Trends (GFT).
GFT gained fame weeks before the 2009 H1N1 outbreak in the U.S., when Google engineers published a paper in Nature. Using vast amounts of search data, they successfully predicted the spread of H1N1 across the country.
To model flu trends by region, Google processed billions of search queries and tested roughly 450 million different models to build a flu prediction index. The resulting index showed a 97% correlation with official data from the U.S. Centers for Disease Control and Prevention (CDC), while running about two weeks ahead of the CDC's reports.
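To make that mechanism concrete, here is a minimal Python sketch of the general approach, assuming two hypothetical CSV files of weekly query volumes and CDC influenza-like-illness (ILI) rates: score candidate queries by correlation with the CDC series, then fit a simple linear model on the best-scoring ones. It illustrates the idea only and is not Google's actual GFT pipeline.

```python
# Illustrative sketch, not Google's GFT pipeline. File names, columns, and the
# query cutoff below are assumptions made for the example.
import numpy as np
import pandas as pd

queries = pd.read_csv("weekly_query_volumes.csv", index_col="week")   # one column per candidate query
ili = pd.read_csv("cdc_ili_rates.csv", index_col="week")["ili_rate"]  # CDC ILI rate per week

# Score every candidate query by how well its weekly volume tracks the CDC signal.
scores = queries.apply(lambda col: col.corr(ili))

# Keep the best-correlated queries (the cutoff is an illustrative choice)
# and fit a simple linear model on them.
top_queries = scores.sort_values(ascending=False).head(45).index
X = np.column_stack([np.ones(len(queries)), queries[top_queries].to_numpy()])
coef, *_ = np.linalg.lstsq(X, ili.to_numpy(), rcond=None)

def estimate_ili(latest_volumes: np.ndarray) -> float:
    """Estimate the current ILI rate from this week's volumes of the selected
    queries, available well before official surveillance reports."""
    return float(np.concatenate([[1.0], latest_volumes]) @ coef)
```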
In epidemics, time is life, and speed is wealth. If GFT could maintain such "precognitive" abilities, it would undoubtedly give society a head start in controlling outbreaks. However, the prophecy didn’t last long. In 2014, GFT made headlines again—but this time for its poor performance.
Researchers published an article in Science titled "The Parable of Google Flu: Traps in Big Data Analysis," pointing out that GFT failed to predict the non-seasonal A-H1N1 flu in 2009. From August 2011 to August 2013, GFT overestimated flu incidence rates in 100 out of 108 weeks compared to CDC reports. By how much?
In the 2011-2012 season, GFT's estimates ran more than 1.5 times the CDC figures. By the 2012-2013 season, they were more than double.
(Chart from "The Parable of Google Flu: Traps in Big Data Analysis," Science, 2014)
Although GFT adjusted its algorithm in 2013 and blamed media coverage for distorting search behavior, its 2013-2014 estimates were still 1.3 times the CDC figures. The systemic errors identified earlier persisted: the "boy who cried wolf" problem remained.
What factors did GFT overlook, leading to its downfall?
Researchers identified several issues with GFT’s big data analysis:
- Big Data Arrogance: Google engineers assumed that search query data could fully replace traditional data collection (sampling statistics), ignoring the need for complementary methods. GFT treated search data as perfectly correlated with the actual flu-affected population, overlooking that sheer volume doesn't guarantee accuracy or comprehensiveness. This led to failures in predicting new data patterns after 2009.
- Lack of Expert Input: GFT didn't incorporate professional health data or expert insights, nor did it clean or filter the search data, so its inflated estimates went uncorrected.
- Search Engine Dynamics: After 2011, Google introduced "related search suggestions," which artificially boosted certain queries. For example, searching "sore throat" might prompt recommendations like "sore throat and fever" or "how to treat a sore throat," leading users to click out of curiosity. This distorted the accuracy of GFT's data.
- Media Influence: Media coverage of flu outbreaks increased related searches, further skewing GFT's predictions. This created a feedback loop in which predictions influenced behavior, which in turn reinforced the predictions, a kind of "prediction-interference paradox."
- Correlation vs. Causation: GFT focused on statistical correlations between search terms and flu spread without understanding the causal links. For instance, a spike in "flu" searches might stem from a movie release, not an actual outbreak.
Despite calls for transparency, Google never disclosed GFT’s algorithm, raising doubts about reproducibility and commercial motives. Researchers advocated combining big data with traditional statistics for more accurate behavioral studies.
Google ignored these suggestions, and GFT was shut down in 2015, although Google continued to provide its search data to the CDC and research institutions.
Notably, Google was already investing in AI at the time, acquiring DeepMind in 2014, but chose not to apply it to GFT. Around the same period, BlueDot, the company making headlines today, was getting started.
BlueDot, founded by infectious disease specialist Kamran Khan, operates an automated epidemic surveillance system. It tracks outbreaks of more than 100 infectious diseases by analyzing roughly 100,000 articles in 65 languages every day; this targeted data collection helps it identify potential epidemic outbreaks and their spread patterns.
BlueDot employs natural language processing (NLP) and machine learning (ML) to train its "automated disease surveillance platform," enabling it to filter out irrelevant "noise" in the data. For instance, the system can differentiate between an actual anthrax outbreak in Mongolia and a reunion of the heavy metal band "Anthrax" formed in 1981.
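BlueDot has not published its models, but a minimal sketch of this kind of noise filtering, assuming a toy, hand-labeled training set, could look like the following text classifier that separates outbreak reports from unrelated mentions of a disease name:

```python
# Illustrative sketch only: BlueDot's actual system is proprietary. This shows
# one generic way to separate outbreak signals from noise (e.g. "anthrax" the
# infection vs. "Anthrax" the band) with a small TF-IDF + logistic regression
# classifier. The training examples and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Health officials confirm anthrax outbreak among livestock herders",
    "Hospital reports cluster of patients with suspected anthrax exposure",
    "Veterinary agency quarantines farm after anthrax deaths in cattle",
    "Anthrax announces reunion tour with original thrash metal lineup",
    "Fans celebrate anniversary of Anthrax's classic 1987 album",
    "Anthrax guitarist discusses new record in radio interview",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = relevant outbreak signal, 0 = noise

classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(train_texts, labels)

headline = "Officials investigate possible anthrax case in rural district"
print(classifier.predict([headline]))        # [1] -> keep for further review
print(classifier.predict_proba([headline]))  # confidence scores for both classes
```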
Unlike Google Flu Trends, which often mistook flu-related searches for actual cases and therefore overestimated incidence, BlueDot excels at discerning which data matter. During the COVID-19 outbreak, it identified the epidemic's origin by scanning foreign-language news reports, animal and plant disease networks, and official announcements, while deliberately avoiding social media because of its noisy data.
For predicting transmission routes, BlueDot relies on global flight ticket data to track infected individuals' movements. In early January, it accurately predicted COVID-19's spread from Wuhan to Beijing, Bangkok, Seoul, and Taipei within days. This wasn't BlueDot's first success: in 2016, its AI model predicted the Zika virus's arrival in Florida six months in advance.
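A toy illustration of this idea (not BlueDot's actual model, and with invented passenger counts) is to rank destinations by the volume of outbound air travel from the outbreak's origin city:

```python
# Toy illustration: rank destinations most at risk of importing cases by the
# volume of outbound air travel from the origin. Passenger counts are made up.
from collections import Counter

ORIGIN = "Wuhan"
monthly_itineraries = {          # (origin, destination) -> passenger itineraries per month
    ("Wuhan", "Beijing"): 89_000,
    ("Wuhan", "Bangkok"): 61_000,
    ("Wuhan", "Seoul"): 34_000,
    ("Wuhan", "Taipei"): 28_000,
    ("Wuhan", "Singapore"): 26_000,
}

importation_risk = Counter()
for (src, dst), volume in monthly_itineraries.items():
    if src == ORIGIN:
        importation_risk[dst] += volume   # more travel volume -> higher importation risk

for city, volume in importation_risk.most_common():
    print(f"{city}: ~{volume:,} passengers/month from {ORIGIN}")
```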
BlueDot's approach differs fundamentally from GFT's. Whereas GFT used regression analysis (e.g., linear and logistic regression) to fit historical data, which often led to overfitting, BlueDot combines medical expertise with AI and big data analytics. Its deep learning models, trained on high-quality annotated data, continuously improve through feedback loops.
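As a generic demonstration of the overfitting risk mentioned above (not a reconstruction of GFT's model, and with made-up numbers), a very flexible curve can match past flu seasons almost perfectly yet extrapolate wildly, while a simpler fit stays comparatively stable:

```python
# Generic overfitting demonstration with invented numbers.
import numpy as np

weeks = np.arange(10)                                                  # past observation weeks
ili = np.array([1.2, 1.5, 2.1, 3.0, 4.2, 5.1, 4.8, 3.9, 2.7, 1.9])    # toy ILI percentages

simple = np.poly1d(np.polyfit(weeks, ili, deg=1))      # crude but stable trend line
flexible = np.poly1d(np.polyfit(weeks, ili, deg=6))    # hugs every historical point

next_week = 12
print(f"linear fit forecast for week {next_week}:   {simple(next_week):7.2f}")
print(f"degree-6 fit forecast for week {next_week}: {flexible(next_week):7.2f}")  # typically absurd
```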
Crucially, BlueDot doesn't rely solely on AI. After data screening, human experts analyze the results, blending "correlational" big data (GFT's approach) with "expert-driven" insights. This hybrid model enhances accuracy, as epidemiologists validate AI-generated alerts before public release.
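A rough sketch of what such a hybrid workflow could look like (purely an assumption for illustration, not BlueDot's actual architecture): automated screening filters low-confidence signals, and everything else is queued for expert confirmation before release.

```python
# Hedged sketch of a generic human-in-the-loop alerting flow; the classes,
# threshold, and review step are assumptions, not BlueDot's architecture.
from dataclasses import dataclass

@dataclass
class Alert:
    disease: str
    location: str
    model_confidence: float        # score from the automated screening stage

def screen(alerts: list[Alert], threshold: float = 0.7) -> list[Alert]:
    """Automated stage: drop low-confidence signals, pass the rest to experts."""
    return [a for a in alerts if a.model_confidence >= threshold]

def expert_review(alert: Alert) -> bool:
    """Human stage: an epidemiologist confirms or rejects before publication."""
    print(f"Review needed: {alert.disease} in {alert.location} "
          f"(confidence {alert.model_confidence:.2f})")
    return True                    # stand-in for the expert's judgment

candidates = [
    Alert("novel coronavirus", "Wuhan", 0.92),
    Alert("anthrax", "Ulaanbaatar", 0.41),   # likely noise, filtered out automatically
]
published = [a for a in screen(candidates) if expert_review(a)]
```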
However, challenges remain: Could AI models overstate severity to avoid underreporting? Are excluded data sources (e.g., social media) limiting effectiveness? As a professional health platform, BlueDot prioritizes accuracy, as its credibility hinges on reliable predictions.
The system also faces balancing acts—commercial viability versus public responsibility, transparency versus data privacy. Yet, its early warnings for COVID-19 highlight AI's potential in global health crises. Future applications could include AI-driven pathogen identification, seasonal outbreak forecasting, and optimized medical resource allocation during epidemics.
The journey has just begun, but AI's role in epidemic prevention promises transformative possibilities.
-