Can AI spot the next pandemic before it starts?

A zoonosis, as defined by the World Health Organization (WHO), is an infectious disease that has jumped from animals to humans, and such diseases can be deadly enough to cause pandemics. The principal drivers of transmission are increasing human encroachment on animal habitats, a changing climate, and the growing movement of people, animals, and animal products through international trade. The Global Virome Project estimates that around 1.7 million animal viruses infect birds and mammals. Scientists believe almost half of those viruses could spread to humans. Understanding these viruses has become key to preventing pandemics, or at least to being better prepared when one hits.

Research of this kind is an enormous task, and it has led to a new discipline in which machine learning (ML) and statistical models are used to predict the emergence of diseases, their likely animal hosts, geographical hotspots, and the viruses most likely to affect humans. Scientists who support this technology firmly believe that the findings will guide the development of medicines and vaccines, and help researchers and health authorities monitor and anticipate outbreaks more accurately.

Naturally, not all researchers agree with this approach. Many do not believe predictive technology can keep up with the constantly changing virome or with the sheer scale of what exists at any given point. True, data and artificial intelligence (AI) models are constantly improving, but for such tools to be truly predictive of future pandemics, the effort must involve a very wide network of researchers spread across the globe.

AI spotted the first signs of Covid-19

Canada-based BlueDot was one of the first organisations to recognise the emergence of the Covid-19 pandemic and sound the alarm. It uses an AI-based algorithm that continuously searches global data to pinpoint the next outbreak of an infectious disease. HealthMap, a system run at Boston Children’s Hospital, also caught these first signs of Covid-19, as did Mayo Clinic’s Coronavirus Map Tracking Tool.

Rapidly developing natural language processing (NLP) algorithms monitor global healthcare reports and news outlets in various languages, and flag mentions of emerging diseases such as Covid-19 as well as endemic ones such as tuberculosis or HIV. Air-travel data is also monitored to assess the risk of spread. Social media proved to be quite a reliable source of data during Covid-19. Data scientists at the University of Colorado, Boulder, used ML and a short-term forecasting model to analyse large datasets gathered from popular online platforms, and compared the results with insights obtained from the more conventional mobile device location data. As people travelled during the pandemic, or recovered and talked about their Covid experience, the technology recognised specific keywords and gathered relevant data. In 2021, when mask-wearing policies, lockdowns, and travel restrictions kept changing, this model was found to be far closer to reality than other models.
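To make the idea of keyword flagging concrete, here is a minimal Python sketch. It is not how BlueDot, HealthMap, or the Colorado team built their systems; the watchlist, the function names, and the sample headlines are all illustrative assumptions, and a real system would use trained language models rather than a fixed keyword list.

```python
import re
from collections import Counter

# Hypothetical watchlist (an assumption for illustration): each disease maps
# to a few keywords as they might appear in English, French, or Spanish text.
WATCHLIST = {
    "covid-19": ["covid-19", "coronavirus", "sars-cov-2"],
    "tuberculosis": ["tuberculosis", "tuberculose", "tb"],
    "hiv": ["hiv", "vih"],
}


def flag_mentions(text):
    """Return the set of watchlist diseases mentioned in one piece of text."""
    lowered = text.lower()
    found = set()
    for disease, keywords in WATCHLIST.items():
        for keyword in keywords:
            # Require non-word characters on both sides, so that a short
            # keyword such as "tb" does not match inside "football".
            pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
            if re.search(pattern, lowered):
                found.add(disease)
                break
    return found


def count_by_disease(reports):
    """Count how many reports mention each watchlist disease at least once."""
    counts = Counter()
    for report in reports:
        counts.update(flag_mentions(report))
    return counts
```

Calling `count_by_disease` on a day's headlines gives a per-disease tally; a sudden rise in that tally for one region is the kind of signal an analyst would then investigate.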

Early buzz, but not much afterwards

However, after that initial buzz, AI could not do much. Almost all AI models performed poorly when applied to real-world clinical settings. Deep-learning models applied to CT scans and chest X-rays were found to be unsuccessful. Perhaps a major reason for these failures was that the AI models were working on a real pandemic for the first time. Four areas were identified as the primary roadblocks: imperfect datasets, human errors, automated discrimination, and complex global conditions.

Researchers are hopeful that the disappointments of Covid-19 will pave the way for better, more robust AI models.

Imperfect datasets

For AI models to predict the emergence of a pandemic, the primary requirement is a large amount of reliable data, which is not always easy to gather. Institutes in different countries have varying policies about sharing healthcare data. Further, individuals may not want their data shared either, even anonymously. For all these aspects to come together, leaders in healthcare, government, and business must be on the same page about privacy issues. AI models are only as good as the data they work on. So, given the many barriers to collecting good datasets, the predictions during Covid-19 were understandably below par.

Human errors

Data entry errors caused by tremendous pressure, insufficient manpower, a hurry to reach conclusions, and the wrong incentives were all at play during Covid-19. Another common error was the inability of the people in charge to interpret data and AI predictions correctly.

Automated discrimination

Despite the availability of AI models, the predictions and treatment decisions made by healthcare authorities also affected recovery rates. Many disadvantaged groups received inappropriate or poor treatment because of AI biases, which can in turn be traced to human biases.

Complex global conditions

As mentioned earlier, data sharing is governed by different rules in different countries and that affects the quality and quantity of reliable data available. During the pandemic, there were many discussions and debates about sharing genome sequences across countries. Populations of different countries also reacted differently to the idea of sharing health data.

Way forward

Prediction, diagnosis, and treatment are the three areas where AI can be best used. For AI models to learn and predict efficiently, a few factors must be corrected over time.

  1. Find better healthcare datasets, preferably in standardised formats, to create a centralised repository of data. Consider using synthetic data instead of real data to sidestep privacy concerns. Develop new data processing techniques.
  2. Ensure greater diversity in the data collected. This will help prevent automated discrimination and the underrepresentation of disadvantaged groups.
  3. Promote greater cooperation across teams such as AI teams, researchers, clinicians, engineers, and even ethicists to ensure AI systems are aligned with existing value systems.
  4. Outline international rules that facilitate data sharing without breaking any privacy rules. Train AI teams to recognise differences in data gathered from different regions of the world.
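As a rough illustration of the second point in the list above, the Python sketch below checks whether any group is under-represented in a dataset relative to a reference population. The function name, the `group` field, the 5% tolerance, and the reference shares are all assumptions made for the example, not an established standard.

```python
from collections import Counter


def representation_gaps(records, reference_shares, group_key="group", tolerance=0.05):
    """Report groups whose share of the dataset falls short of their share
    of a reference population by more than `tolerance`.

    Returns {group: dataset_share - reference_share}, so values are negative.
    """
    if not records:
        raise ValueError("records must not be empty")

    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())

    gaps = {}
    for group, reference_share in reference_shares.items():
        dataset_share = counts.get(group, 0) / total
        if reference_share - dataset_share > tolerance:
            gaps[group] = round(dataset_share - reference_share, 3)
    return gaps
```

For example, if a hypothetical group makes up 20% of the population but only 5% of the patient records, the function reports a gap of -0.15 for that group, which is a prompt to collect more data before training a model on it.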

The rapid global spread of Covid and the struggle of health services to stay ahead of the disease drive home the need to use the best AI models available to track and predict pandemics. AI developers must continue working on predictive models so that the next pandemic, if any, can be predicted early and contained in time.

This blog was first published on Business Insider
