
Mathematical Modelling in the Service of Medical Intelligence: Predicting the Next Pandemic

  • Ioanna Maria Arvanitaki
  • Mar 31

Updated: Apr 19

By Ioanna Maria Arvanitaki, Year 12 Student


Introduction

For centuries, humankind has faced the threat of pandemics: the Plague of Justinian (541-542), the Black Death (1346-1353), the cholera pandemics, the HIV/AIDS pandemic (at its peak between 2005 and 2012) and, most recently, the Coronavirus pandemic (2020-) [1].


An infectious disease that has not circulated among people for a long time, leaving the population with little or no immunity, may cause a pandemic[2]. History has shown pandemics to be a great challenge to the global economy and destructive to individuals and communities. Predicting future pandemics would help us prepare and reduce the damage. The point of inquiry is whether “we will ever be able to predict the next pandemic”.


In this essay, we will explore the main methods that exist today, their drawbacks, and parallel models that are used in other fields, and suggest a way forward.



The Significance of Mathematical Models

A prediction is a statement about what might happen in the future. Mathematical simulation models play a central role in making a prediction[3]. The point in time for which a model is expected to make its prediction is known as the prediction time. For systems that change dynamically over time, knowing the prediction time is crucial. To answer the question above, we must analyse past infections.


A distinctive characteristic of the dispersal kinetics of infectious diseases is that the causative microbial agent can be transmitted from one organism to another, often using animals (e.g. insects) as vectors (carriers). This type of transmission may reveal recurrent patterns, which we can use to assess how predictable a disease is[4].


There are several obstacles to creating an accurate model. The observed data on symptomatic cases of a particular infection may be insufficient to evaluate trends. Moreover, the outcome may depend on the infection’s natural history and the population’s previous level of exposure. Although for some infections exposure confers complete immunity, others may reappear through reinfection or recurrence[5]. There are also secondary factors that can influence the overall outcome of the disease, such as seasonality, government interventions (e.g. isolation, modern methods of immunisation, etc.), false positive/negative cases, asymptomatic carriers, inconsistency of case definitions over time[6] and the different consequences a disease may have on a subgroup (e.g. young adults and the possibility of long-COVID syndrome).


Despite the development of novel models that try to account for as many of the aforementioned variables as possible and to eliminate these obstacles, many inaccuracies may remain. Some are due to the type of data collection (which is retrospective), and some are due to the nature of the infection and will persist regardless of the method used. It is evident that the quality of the data plays a primary role in developing an accurate model. Data must be true and complete in order to represent a population. Without accurate data, there is no accurate prediction[7].


Epidemiologists and statisticians have used specific methods to create models for disease prediction. While statistical modelling is based either on fitting a curve to prior data or on replicating a pattern, epidemiological modelling considers the type of microorganism, its growth and its transmission, taking into account many of the secondary factors mentioned above. The main models that have been used to predict a pandemic are outlined below.



The Distributional Fitting

The distributional fitting method attempts to represent a problem through graphs using numerous statistical distributions. The number of cases tends to grow exponentially during the early stages of the pandemic, resulting in an increase in the force of infection and the number of reported cases per unit time. As microbial pathogens spread over time, the kinetics of dissemination directly depend on the epidemic’s current stage, the mean number of infections by one infected person, etc. In other words, the total number of infections is determined by the growth rate of the infection[8].
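
The early exponential phase described above can be illustrated with a short sketch. It assumes, purely for illustration, that daily case counts follow cases(t) = c0 * exp(r * t), and it estimates the growth rate r with a straight-line fit to the logarithm of the counts. The function name and the sample numbers are hypothetical and are not taken from the cited studies.

```python
import math

def fit_exponential_growth(cases):
    """Least-squares fit of log(cases) = log(c0) + r * t; returns (c0, r)."""
    days = list(range(len(cases)))
    logs = [math.log(c) for c in cases]  # every count must be positive
    n = len(cases)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
             / sum((t - mean_t) ** 2 for t in days))
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), slope

# Hypothetical daily case counts growing by roughly 20% per day.
daily_cases = [10, 12, 14, 17, 21, 25, 30]
c0, r = fit_exponential_growth(daily_cases)
doubling_time = math.log(2) / r  # days needed for the case count to double
print(f"growth rate per day: {r:.3f}, doubling time: {doubling_time:.1f} days")
```

A fit like this is only meaningful while growth is still close to exponential; once interventions or immunity slow the spread, the straight-line assumption breaks down.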


When it comes to graphical modelling, the behaviour of the curve(s) changes over time, so different distributions fit different stages best. It is therefore generally not possible to represent a single problem through a single distribution[9]. Datta et al. have shown that the Lognormal distribution is the one that fits the data best for all stages of the disease in almost all countries. Their fitted model consists of different pieces of curves put together as a step function.
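
As a rough illustration of fitting a Lognormal shape to an epidemic curve, the sketch below treats the daily case counts as weights on the day number and estimates the two Lognormal parameters from the weighted mean and spread of log(day). This moment-based shortcut is an assumption made for brevity; it is not the fitting procedure used by Datta et al., and the sample counts are invented.

```python
import math

def fit_lognormal_curve(daily_cases):
    """Estimate (mu, sigma) from case-weighted moments of log(day).

    daily_cases[k] is the count on day k + 1 (days start at 1 so the log is defined).
    """
    days = range(1, len(daily_cases) + 1)
    total = sum(daily_cases)
    mu = sum(c * math.log(t) for t, c in zip(days, daily_cases)) / total
    var = sum(c * (math.log(t) - mu) ** 2 for t, c in zip(days, daily_cases)) / total
    return mu, math.sqrt(var)

def lognormal_pdf(t, mu, sigma):
    """Lognormal density at day t > 0."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2 * math.pi))

# Hypothetical epidemic curve: a fast rise followed by a long tail.
cases = [2, 8, 20, 35, 44, 46, 42, 35, 28, 22, 16, 12, 9, 6, 4, 3, 2, 1]
mu, sigma = fit_lognormal_curve(cases)
peak_day = math.exp(mu - sigma ** 2)  # mode of the Lognormal distribution
fitted = [sum(cases) * lognormal_pdf(t, mu, sigma) for t in range(1, len(cases) + 1)]
print(f"mu={mu:.2f}, sigma={sigma:.2f}, estimated peak near day {peak_day:.1f}")
```

The skewed shape of the Lognormal curve is what makes it a natural candidate here: it rises quickly and decays slowly, as many outbreak curves do.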



Time-Series Regression Models

With time-series regression modelling, a future response is predicted from the response history. The primary objective is to collect historical data meticulously over a long period of time in order to construct the model that best explains the natural progression. The constructed model is then used to predict the variable’s future values[10].


Time series models have been developed by a number of researchers over many years to improve the precision, accuracy and efficacy of disease prediction. AR (Auto-Regressive), MA (Moving Average), ARMA (Auto-Regressive Moving Average), ARIMA (Auto-Regressive Integrated Moving Average), and SARIMA (Seasonal Auto-Regressive Integrated Moving Average) are the primary time series models that have been proposed by scientists[11]. These models have aided in predicting the likelihood of future phenomena, risks, and the spread or trend of various infectious diseases like Dengue Fever, Ebola, Influenza, and Malaria[12].
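
The simplest member of this family, an AR model of order 1, can be sketched in a few lines. The example assumes each value depends linearly on the previous one, x[t] = c + phi * x[t-1], and estimates c and phi by ordinary least squares. The helper names and the weekly counts are hypothetical; real studies would normally rely on a dedicated statistics library and on the richer ARIMA or SARIMA variants.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    previous = series[:-1]
    current = series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_cur = sum(current) / n
    phi = (sum((p - mean_prev) * (q - mean_cur) for p, q in zip(previous, current))
           / sum((p - mean_prev) ** 2 for p in previous))
    c = mean_cur - phi * mean_prev
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recurrence to produce `steps` future values."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts

# Hypothetical weekly case counts.
weekly_cases = [120, 132, 128, 141, 150, 147, 158, 166]
c, phi = fit_ar1(weekly_cases)
next_weeks = forecast_ar1(weekly_cases[-1], c, phi, steps=3)
print([round(v, 1) for v in next_weeks])
```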


One of the drawbacks of time-series regression models is that they fail to include all the factors which influence the spread of the disease. Their forecasts are also subject to errors that are easy to introduce and yet hard to detect. The size of these errors is quantified with measures such as the MSE (Mean Squared Error), MAD (Mean Absolute Deviation), RMSE (Root Mean Squared Error), etc.[11].
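
The error measures named above are straightforward to compute once a forecast can be compared with what actually happened. The sketch below uses invented numbers, and it takes MAD in the forecasting sense of the mean absolute forecast error, which is an assumption about how the term is meant here.

```python
import math

def forecast_errors(actual, predicted):
    """Return MSE, MAD (mean absolute forecast error) and RMSE for two equal-length series."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e ** 2 for e in errors) / n
    mad = sum(abs(e) for e in errors) / n
    return {"MSE": mse, "MAD": mad, "RMSE": math.sqrt(mse)}

# Hypothetical observed and forecast case counts.
observed = [100, 110, 125, 140]
forecast = [98, 113, 121, 146]
scores = forecast_errors(observed, forecast)
print(scores)
```

Because the MSE and RMSE square each error, a single badly missed week weighs more heavily in them than in the MAD.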



Epidemiological Modelling

Besides the infectious agent’s mode of transmission, the manner in which an infection spreads depends on external and environmental factors (social, cultural, demographic, economic, geographic, etc.). Based on characteristics like gender, age, and size, epidemiology divides populations into distinct groups while ignoring an individual's uniqueness. It examines whether dividing the population into various groups reveals additional information that can be obtained from each individual. The goal is to describe, analyse, and comprehend the patterns of infectious diseases in these groups through epidemiological modelling[13].


The transmission dynamics are formulated using various epidemiological models. These include the SI (Susceptible and Infected), SIS (Susceptible, Infected and Susceptible), SIR (Susceptible, Infected and Recovered), SIRS (Susceptible, Infected, Recovered and Susceptible), SEIR (Susceptible, Exposed, Infected and Recovered), and SEIRS (Susceptible, Exposed, Infected, Recovered and Susceptible) models, as well as SIR variants that account for untested/unreported cases[11].
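
To make the compartment idea concrete, here is a minimal sketch of the SIR model. It assumes a closed, evenly mixing population, a transmission rate beta and a recovery rate gamma, and it steps the standard SIR equations forward with a simple Euler scheme. All parameter values are illustrative and do not describe any real disease.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of the SIR equations; returns one (S, I, R) tuple per day."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history

# Illustrative values: each case infects 0.3 people per day and recovers after about 10 days.
history = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=200)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print(f"infections peak on day {peak_day}")
```

The other models in the list follow the same pattern with extra compartments or flows, for example an Exposed group for the incubation period (SEIR) or a flow from Recovered back to Susceptible when immunity wanes (SIRS).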


Nonetheless, epidemiological modelling also faces a number of obstacles. For instance, the implications of novel interventions, especially when they are a consequence of the model’s own prediction, are not considered. The simplest models also leave out important biology: in the SI model, for example, cases are simulated without any concept of recovery or immunity. Herd immunity and the incubation period are only starting to be included in the most recent models.
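
Herd immunity, mentioned above, has a well-known back-of-the-envelope form in the simplest models. Assuming an evenly mixing population and a basic reproduction number R0 (the mean number of people one case infects in a fully susceptible population), spread declines once a fraction 1 - 1/R0 of people are immune. The function name below is hypothetical and the R0 values are examples only.

```python
def herd_immunity_threshold(r0):
    """Fraction of the population that must be immune for spread to decline (1 - 1/R0)."""
    if r0 <= 1:
        return 0.0  # an infection with R0 <= 1 dies out without any immunity
    return 1 - 1 / r0

for r0 in (1.5, 2, 4):
    print(f"R0 = {r0}: about {herd_immunity_threshold(r0):.0%} must be immune")
```

This simple formula ignores waning immunity, uneven contact patterns and interventions, which is part of why the more recent models treat herd immunity explicitly.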



Infectious Disease Modelling Using Artificial Intelligence (AI)

Trends can be identified before they become significant through the detection of weak signals. The field of cybersecurity makes extensive use of this approach, which has now been applied to predicting the clinical course of a disease in a specific population by identifying a pattern in a small sample or group of individuals.
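
A very simple stand-in for weak-signal detection is to compare each new count with the recent past and flag values that sit far above it. The sketch below assumes a trailing seven-day baseline and a z-score threshold of 3; both choices, the floor of one case on the spread, and the sample data are assumptions for illustration. Real AI systems use far richer data and models.

```python
import statistics

def weak_signals(counts, window=7, threshold=3.0):
    """Return indices where a count exceeds the trailing-window mean by more than `threshold` standard deviations."""
    flagged = []
    for idx in range(window, len(counts)):
        baseline = counts[idx - window:idx]
        mean = statistics.mean(baseline)
        # Floor the spread at one case so a flat baseline cannot cause a division by zero.
        spread = max(statistics.stdev(baseline), 1.0)
        if (counts[idx] - mean) / spread > threshold:
            flagged.append(idx)
    return flagged

# Hypothetical daily reports of an unusual symptom at one clinic.
reports = [3, 4, 2, 3, 5, 4, 3, 4, 12, 5]
print(weak_signals(reports))  # indices of days that stand out from the previous week
```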


Taking this a step further, AI modelling translates such patterns into healthcare policies. This high-level AI analytical approach is possible when each of the various databases has a high level of veracity[14],[15]. CovidSim, created at Imperial College London, is one example; it tries to identify the best strategy for reaching an acceptable outcome if complete suppression is impossible[16]. Besides CovidSim (used by the Centre for Global Infectious Disease Analysis, GIDA), several modelling advisory groups have been created. These include the Scientific Pandemic Influenza Group on Modelling, SPI-M (used by the UK Department of Health), and the WHO’s modelling groups, whose work ranges from characterising the dynamics of an outbreak to investigating the effectiveness of different interventions.


The model's predictive accuracy is frequently influenced by the quality and specificity of the input data rather than its quantity. The unpredictability of infectious diseases with numerous factors influencing them compounds the issue. The resulting strategies are also presented as precise absolute numbers of expected or avoided outcomes, which may potentially lead to uncritical reporting and/or interpretation[16]. Such deficiencies could limit the usability of models [14], [15], [17].



A Way Forward - Discussion

It must be noted that the idea of predicting future outcomes is not something new. It is standard government practice to utilise advisory groups to guide their decision making. On this basis, as proposed by numerous authors, a surveillance program must be developed based on the knowledge of disease epidemiology, using public-health analysis of already available data. 


Secondly, collaboration between different domains is strongly suggested and may contribute to the development of a more efficient model[18]. Similar models have indeed been developed and used in other fields. University College London (UCL) has created a computer model combining social change with environmental disturbance (e.g. deforestation, urban expansion, migration of host animals, temperature change, expected outcomes of climate change, etc.). While this model has accurately predicted the emergence of three outbreaks, it was unable to predict when they would occur. This model forms the basis of the Global Virome Project, a plan to predict the world’s unknown viral threats. Furthermore, weather forecasting uses time-series regression models to predict the weather in the near future[19].


Collaboration is truly vital. One possible suggestion is cooperation between seismology and the epidemiology of infectious diseases. Seismological theories concerning the prediction of earthquakes could apply to the prediction of a pandemic. Just as major earthquakes are usually followed by aftershocks (smaller earthquakes), they are often preceded by similar clusters of smaller earthquakes, which act as precursors. This clustering of precursory earthquakes may vary in duration, from just a few months to a period of decades prior to the event. The larger the incoming earthquake, the larger the precursors, the longer the precursory period and the larger the area affected[20]. This precursory scale increase can be applied to infectious diseases. A higher frequency of endemic outbreaks within a region may suggest that herd immunity is falling. This signals the vulnerability of the population, which may lead to the beginning of a new pandemic. This might sound simple, but it could lead us to a whole new area of research.



Conclusion

In this brief essay, we explored some basic concepts used to simulate an infectious disease that develops into a pandemic. While all the models that have been developed have succeeded in identifying and measuring a range of factors, many more are still missing from the mathematical formulations. Improving data collection through surveillance programs and borrowing ideas from other fields is the suggested way forward. Creating a framework which will enable us to predict the next pandemic reliably appears to be an uphill struggle. Nonetheless, with scientific perseverance, through research and continuous experimentation, we should be able to limit our margins of error, and hence our vulnerability.




Bibliography

  1. “Outbreak: 10 of the Worst Pandemics in History.” MPH Online. https://www.mphonline.org/worst-pandemics-in-history

  2. “Outbreaks, Epidemics and Pandemics—What You Need to Know.” APIC, n.d. https://apic.org/monthly_alerts/outbreaks-epidemics-and-pandemics-what-you-need-to-know/

  3. Herring Jonathan. “Criminal Law: Text, Cases and Materials”. June 18. Fifth Edition, Oxford University Press, 2012. https://www.booksfree.org/wp-content/uploads/2022/03/Criminal_Law_JONATHAN_HERRING-5th-edition-booksfree.org.pdf

  4. Rana A, Ahmed M, Rub A, Akhter Y. A tug-of-war between the host and the pathogen generates strategic hotspots for the development of novel therapeutic interventions against infectious diseases. Virulence. (2015) 6: 566-80. doi:10.1080/21505594.2015.1062211

  5. Yadav SK, Subhash K, Akhter Y. “Statistical Modeling for the Prediction of Infectious Disease Dissemination With Special Reference to COVID-19 Spread.” Frontiers in Public Health 9 (June 16, 2021): 645405 https://www.frontiersin.org/articles/10.3389/fpubh.2021.645405/full

  6. World Health Organization. Infectious Diseases Fact Sheets. (2018) https://www.who.int/topics/infectious_diseases/factsheets/en

  7. Heymann, David, Ross, Emma, Wallace, Jon. ‘The next Pandemic – When Could It Be?’ Chatham House, The Royal Institute of International Affairs, 23 February 2022, https://www.chathamhouse.org/2022/02/next-pandemic-when-could-it-be

  8. Liang K. Mathematical model of infection kinetics and its analysis for COVID-19, SARS and MERS Infection, genetics and evolution. J Mole Epid Evolu Gen Infect Dis. (2020) 82: 104306. doi:10.1016/j.meegid.2020.104306

  9. Datta R, Trivedi P.K, Kumawat A, Kumar R, Bhardwaj I, Kumari N, Agiwal V, Kumar S, Kumar A, Shukla A, Kumar J. Statistical Modeling of COVID-19 Pandemic Stages Worldwide. Preprints. (2020). doi:10.20944/preprints202005.0319.v1

  10. Raicharoen T, Lursinsap C, Sanguanbhoki P. Application of critical support vector machine to time series prediction. In: Circuits and Systems, ISCAS’2003, Proceedings of the 2003 International Symposium (Bangkok). (2003)

  11. Hyndman Rob J, Athanasopoulos G. Forecasting: Principles and Practice. 2nd Edition, Monash University, https://otexts.com/fpp2/. IBM Documentation. https://www.ibm.com/docs/en/cognos-analytics/11.1.0topic=forecasting-statistical-details

  12. Zhang X, Liu Y, Yang M, Zhang T, Young AA, Li X. Comparative study of four time series methods in forecasting typhoid fever incidence in China. PLoS ONE. (2013) 8:e63116. doi:10.1371/journal.pone.0063116

  13. Hethcote HW. An immunization model for a heterogeneous population. Theor Popul Biol. (1978) 14: 338-49. doi:10.1016/0040-5809(78)90011-4

  14. Agrebi S, Larbi S. “Use of Artificial Intelligence in Infectious Diseases.” Artificial Intelligence in Precision Health (2020), pp. 415–38. PubMed Central, https://doi.org/10.1016/B978-0-12-817133-2.00018-5

  15. Wong, Zoie S. Y., et al. “Artificial Intelligence for Infectious Disease Big Data Analytics.” Infection, Disease & Health, 24(1): 44-48 (Feb. 2019) www.idhjournal.com.au, https://doi.org/10.1016/j.idh.2018.10.002

  16. Kurth, Tobias, Ralph Brinks. “Predicting the Pandemic”. BMJ 371 (October 2020) https://doi.org/10.1136/bmj.m3932

  17. “Can AI Predict the Next Pandemic? The Role of AI and Genomics in Combating Infectious Diseases.” AI for Good (30 July 2021). https://aiforgood.itu.int/event/can-artificial-intelligence-predict-the-next-pandemic/

  18. Vidal, John. ‘'Why Wait for It?' How to Predict a Pandemic’. The Guardian. https://www.theguardian.com/global-development/2020/sep/16/how-to-predict-a-pandemic-worlds-most-dangerous-viruses.

  19. O’Connor, Errin. ‘Time-Series Forecasting - Advanced Statistical Analysis Of The Data’. EPC Group, 17 May 2022, https://www.epcgroup.net/time-series-forecasting-advanced-statistical-analysis-of-the-data/.

  20. Rhoades D., Christophersen A., 2019, "Earthquake forecasting: Small earthquakes show when big ones are more likely" Research OUTREACH (111) https://researchoutreach.org/articles/earthquake-forecasting-small-earthquakes-show-when-big-ones-are-more-likely

