Federated Machine Learning for COVID19 Spreading Estimation

COVID19 spreading estimation is currently based on simulations[1] that are driven by known epidemiological models[2]. The latter are adapted from classical models used for viruses of the corona family[3]. Specifically, the mathematical principles of the spreading models try to provide estimates of the so-called SIR parameters[4], i.e., the rates at which individuals move from being susceptible (S) to the virus, to being infected (I), and then to having recovered (R) or died. The R group is presumed to be immune to the virus and hence can no longer pass on the infection. SIR models rest on some baseline assumptions: everyone has the same chance of catching the virus from an infected person because the population is perfectly and evenly mixed, and people with the disease are all equally infectious until they die or recover. Direct extensions to the SIR models cluster people into more fine-grained groups based on demographic, healthcare and social parameters such as their age, sex, health status, employment and number of contacts, as well as social mixing parameters (e.g., who meets whom and when).
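
To make the S, I and R groups concrete, here is a minimal Python sketch of the basic SIR equations under the even-mixing assumption described above. The function name, the parameter names beta (transmission rate) and gamma (removal rate), the forward Euler integration and all numeric values are assumptions chosen for illustration; they are not taken from the cited models and are not fitted to any COVID19 data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the basic SIR equations with forward Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    Returns a list with one (S, I, R) tuple per day, including day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    dt = 1.0 / steps_per_day
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # Even mixing: every susceptible person meets infected people
            # in proportion to their share of the whole population.
            new_infections = beta * s * i / population * dt
            # R counts everyone who leaves the infected group (recovered or died).
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Illustrative values only (hypothetical, not estimates for any real outbreak).
trajectory = simulate_sir(population=1_000_000, initial_infected=10,
                          beta=0.3, gamma=0.1, days=160)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print(f"Infections peak around day {peak_day} "
      f"with about {trajectory[peak_day][1]:,.0f} people infected")
```

The more fine-grained extensions mentioned above would replace the single S, I and R values with one set per group (for example per age band) plus a contact matrix between groups; that is beyond this sketch.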

During the COVID19 outbreak, scientists have developed models that consider alternative data sources, including, for example, spreading information exchanged in news and social media[5]. Likewise, predictive models for the spreading of the disease take into account the impact of state measures against the pandemic in some countries[6]. The measures that are commonly considered include social distancing, isolation, lock-down policies, and travel restrictions[7]. With appropriate spreading models at hand, statistical models (including machine learning techniques) can be applied to predict and anticipate the spreading of the disease[8].

Federated Machine Learning (FML) techniques can provide novel privacy-preserving ways for estimating the spreading of the COVID19 disease within organizations or even entire regions. Specifically, FML techniques combine what is learnt from the information and features held by healthcare organizations or individuals (e.g., citizens using symptom trackers) in order to build optimized models for the spreading of the disease. Hence, FML techniques enable more accurate estimates (i.e., global models that are more accurate than the individual models of single healthcare organizations). At the same time, they preserve the privacy of the organizations and individuals, as their raw data (e.g., individual symptoms, traced contacts) stays local and is not shared centrally; in the usual setup only model parameters or updates are exchanged. This is in line with common learning models and practices of the World Health Organization in other areas[9].
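
As a rough illustration of how such aggregation can work, the Python sketch below uses federated averaging, one common FML scheme and not necessarily the one any particular deployment uses. Each site trains a small linear model on its own records, and a coordinator only ever sees the resulting weights, which it averages in proportion to each site's sample count. The function names, the linear model, the learning rate and the toy data are all hypothetical choices made for this example.

```python
def local_update(weights, features, targets, learning_rate=0.1, epochs=20):
    """Run a few epochs of gradient descent on one site's private data.

    Only the returned weights leave the site; features and targets do not.
    """
    w = list(weights)
    n = len(targets)
    for _ in range(epochs):
        gradient = [0.0] * len(w)
        for x, y in zip(features, targets):
            error = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j, xj in enumerate(x):
                gradient[j] += 2.0 * error * xj / n  # mean squared error gradient
        w = [wj - learning_rate * gj for wj, gj in zip(w, gradient)]
    return w


def federated_average(client_weights, client_sizes):
    """Average the sites' weights, weighted by how many samples each site has."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * size for w, size in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def train_federated(clients, dim, rounds=50):
    """clients is a list of (features, targets) pairs, one pair per site."""
    global_weights = [0.0] * dim
    sizes = [len(targets) for _, targets in clients]
    for _ in range(rounds):
        # Each site starts from the current global model and trains locally.
        updates = [local_update(global_weights, X, y) for X, y in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Toy data (hypothetical): two sites observe the same trend y = 2*x + 1
# over different ranges of x. The second feature is a constant bias term.
site_a = ([[0.0, 1.0], [0.25, 1.0], [0.5, 1.0]], [1.0, 1.5, 2.0])
site_b = ([[0.75, 1.0], [1.0, 1.0]], [2.5, 3.0])
print(train_federated([site_a, site_b], dim=2, rounds=200))
```

Exchanging weights instead of records does not by itself guarantee privacy, since model updates can still leak information; real deployments typically combine this with measures such as secure aggregation or differential privacy.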


[1] David Adam, “Special report: The simulations driving the world’s response to COVID-19”, https://www.nature.com/articles/d41586-020-01003-6

[2] Colizza V, Barrat A, Barthélemy M, Vespignani A. Predictability and epidemic pathways in global outbreaks of infectious diseases: The SARS case study. BMC Med. 2007;5(1):34

[3] Ferguson, N. M. et al. Preprint at Spiral https://doi.org/10.25561/77482 (2020).

[4] Read JM, Bridgen JRE, Cummings DAT, Ho A, Jewell CP. Novel coronavirus COVID-19: Early estimation of epidemiological parameters and epidemic predictions. medRxiv. 2020.

[5] Emily Waltz, “How Computer Scientists Are Trying to Predict the Coronavirus’s Next Moves”, interview with Alessandro Vespignani, IEEE Spectrum, February 2020.

[6] M.K., Arti. (2020). Modeling and Predictions for COVID 19 Spread in India. 10.13140/RG.2.2.11427.81444.

[7] Bhatnagar, Manav. (2020). COVID-19: Mathematical Modeling and Predictions. 10.13140/RG.2.2.29541.96488.

[8] Huang P. Research and Implementation of Prediction Model for Class B Infectious Diseases Based on Machine Learning [D]. University of Electronic Science and Technology, 2019.

[1] https://venturebeat.com/2020/04/15/healthcare-organizations-use-nvidias-clara-federated-learning-to-improve-mammogram-analysis-ai/
