Methods and Applications: Epidemic Surveillance of Influenza Infections: A Network-Free Strategy — Hong Kong Special Administrative Region, China, 2008–2011
The relaxation of coronavirus disease 2019 (COVID-19) non-pharmacological interventions and the increased susceptibility during the COVID-19 pandemic could be precursors to a resurgence of influenza, potentially leading to a severe outbreak in the winter of 2022 and in future seasons. The recent increase in the availability of Electronic Health Records (EHR) data in public health systems offers new opportunities to monitor individuals and mitigate outbreaks.
Methods
We introduced a new method for ranking individuals for surveillance in temporal networks, which are a more practical setting than static networks. By targeting previously infected nodes, this method uses readily available EHR data instead of the contact-network structure.
Results
We validated this method qualitatively in a real-world cohort study and evaluated it quantitatively by comparing it with other surveillance methods on three empirical temporal networks. We found that, despite not explicitly exploiting the contact-network structure, it remained the best or close to the best strategy. We related the performance of the method to the public health goals, the reproduction number of the disease, and the underlying temporal-network structure (e.g., burstiness).
Discussion
The proposed strategy of using historical records to select sentinels for surveillance offers public health policymakers a practical and robust alternative that does not require knowledge of individual contact behaviors.
Funding: Supported by Key Projects of Intergovernmental International Scientific and Technological Innovation Cooperation of National Key R&D Programs (No. 2022YFE0112300) and AIR@InnoHK administered by Innovation and Technology Commission of the Research Grants Council of the Hong Kong SAR Government
Author Affiliations
1. WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
2. Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science and Technology Park, Hong Kong SAR, China
3. Department of Genetics, University of Cambridge, Cambridge, UK
4. Department of Computer Science, Aalto University, Espoo, Finland
FIGURE 1.
A schematic of the proposed surveillance strategy to target previously infected nodes (History). (A) Infection probability of influenza virus in Hong Kong Special Administrative Region. (B) Available historical observations of individuals (a, b, c, and d) over two seasons, S1 and S2, each lasting one year. (C) In our proposed surveillance strategy, individuals are ranked for season S2 by their infection time in season S1.
Note: In panel A, we studied a cohort of 956 participants from 2008 to 2011 with annual electronic records for three years (2008 to 2010, and 2009 to 2011) (7). We estimated the probability that a participant was infected in the third year, given whether or not the participant had been infected in the previous two years (Supplementary Table S1). Vertical bars and error bars represent the estimated means and 95% CIs. In panel B, the average historical vulnerability of an individual is estimated from the historical infection time. In panel C, the black bars denote the observed infection timing of individuals in the first and second/third historical seasons.
Abbreviation: S1=the first season; S2=the second season.
FIGURE 2.
A simplified schematic illustration of the proposed surveillance strategy.
Note: The proposed strategy ranks the included individuals following four guidelines. Individuals who were uninfected in the current season but infected in the last season rank higher than others. Individuals also rank higher if they were infected earlier in the last season and if they have more infection records in previous seasons. Taking four individuals (a, b, c, and d) as an example, users (e.g., doctors in hospitals) can assess their electronic records of historical seasons at the end of the current season. Individuals a and c each have one infection record in the last season and none in the current season, so they are ranked the highest. Because a was infected earlier than c, a ranks higher than c. If a and c had the same infection time in the last season, we could compare their numbers of infection records in previous seasons. The final ranking of the four individuals is a, c, d, and b. We used golden/silver/copper crowns and red stars to mark their ranking from high to low.
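The ranking guidelines in this note can be sketched as a sort key. The following Python snippet is a minimal illustration, not the authors' implementation: the `Record` fields, the tie-breaking by name, and the toy records for individuals b and d are hypothetical choices made only so that the example reproduces the ranking a, c, d, b.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical per-individual summary of EHR history."""
    name: str
    infected_current: bool          # infected in the current season?
    last_season_day: Optional[int]  # infection day in the last season; None if uninfected
    earlier_infections: int         # infection records in seasons before the last one


def history_rank(records: List[Record]) -> List[str]:
    """Return names ordered from highest to lowest surveillance priority."""
    def key(r: Record):
        # Guideline: uninfected now but infected last season ranks first.
        preferred = (not r.infected_current) and r.last_season_day is not None
        # Guideline: earlier infection in the last season ranks higher.
        day = r.last_season_day if r.last_season_day is not None else float("inf")
        # Guideline: more records in previous seasons ranks higher.
        return (not preferred, day, -r.earlier_infections, r.name)
    return [r.name for r in sorted(records, key=key)]


# Toy records (assumed values, not data from the cohort).
cohort = [
    Record("a", infected_current=False, last_season_day=30, earlier_infections=0),
    Record("b", infected_current=False, last_season_day=None, earlier_infections=0),
    Record("c", infected_current=False, last_season_day=75, earlier_infections=0),
    Record("d", infected_current=True, last_season_day=20, earlier_infections=1),
]
ranking = history_rank(cohort)
print(ranking)
```

The tuple key applies the guidelines in order of precedence, so the number of earlier infection records only matters when two individuals share the same infection day in the last season.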
FIGURE 3.
Schematic illustration of surveillance strategies. (A) The example of a temporal network with two time phases. (B) The schematic illustration of the Recent surveillance strategy. (C) The schematic illustration of the Frequent surveillance strategy. (D) The schematic illustration of the Random surveillance strategy. (E) The schematic illustration of the history surveillance strategy for previously infected (PI). (F) Surveillance objectives.
Note: In panel A, the first time phase is used for training and the second for epidemic simulation. In panels B, C, D, and E, each horizontal line denotes an individual, and the circles and vertical lines indicate interactions. We marked the first infected node in the second time window as "Seed", the node selected randomly to trigger a surveillance strategy as "Random", and the node chosen for sentinel surveillance under each strategy as "Monitor". In the epidemic simulation, gray and black circles with red borders denote infectors and infectees, respectively. In panel F, we compared the prevalence among nodes in the surveillance subset with that in the whole population. We calculated the time lag between the two groups reaching 1% prevalence (early warning) and the time lag between their epidemic peaks (peak lag).
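The two objectives in panel F can be computed from a pair of prevalence curves. The sketch below assumes prevalence is sampled once per time step and that a positive value means the surveillance subset is ahead of the whole population; the function names and the toy curves are illustrative, not taken from the study.

```python
from typing import List, Optional, Tuple


def first_time_reaching(prevalence: List[float], threshold: float) -> Optional[int]:
    """Index of the first time step with prevalence >= threshold, or None."""
    for t, p in enumerate(prevalence):
        if p >= threshold:
            return t
    return None


def peak_time(prevalence: List[float]) -> int:
    """Index of the (first) maximum of the prevalence curve."""
    return max(range(len(prevalence)), key=prevalence.__getitem__)


def lead_times(sentinel: List[float], population: List[float],
               threshold: float = 0.01) -> Tuple[Optional[int], int]:
    """Return (early warning, peak lag) in time steps.

    Sign convention (an assumption): positive means the sentinel
    group reaches the milestone before the whole population.
    """
    t_s = first_time_reaching(sentinel, threshold)
    t_p = first_time_reaching(population, threshold)
    early_warning = None if t_s is None or t_p is None else t_p - t_s
    return early_warning, peak_time(population) - peak_time(sentinel)


# Toy prevalence curves (fractions infected per time step, assumed values).
sentinel_curve = [0.0, 0.02, 0.10, 0.30, 0.20, 0.10, 0.05]
population_curve = [0.0, 0.001, 0.005, 0.02, 0.10, 0.25, 0.15]
result = lead_times(sentinel_curve, population_curve)
print(result)
```

In this toy example the sentinel curve crosses 1% at step 1 and peaks at step 3, while the population curve crosses 1% at step 3 and peaks at step 5.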
FIGURE 4.
Early warning and Peak lag of the random, recent, frequent, and history strategies. (A) Early warning in the Prostitution network. (B) Peak lag in the Prostitution network. (C) Early warning in the Email network. (D) Peak lag in the Email network. (E) Early warning in the Dating network. (F) Peak lag in the Dating network.
Note: The history strategy here uses the EHR records obtained from historical seasons two years ago. For each strategy, the left panels show the early warning (days) and the right panels show the peak lag (days) across effective reproduction numbers (Re). Bars and error bars indicate the means and standard deviations across 100 simulations on each temporal network. The burstiness values of the Prostitution, Email, and Dating networks are 0.39, 0.62, and 0.72, respectively (Supplementary Table S2).
Influenza infections were reported to be low between September 2021 and January 2022 (1). The relaxation of coronavirus disease 2019 (COVID-19) non-pharmacological interventions and the increased susceptibility during the COVID-19 pandemic may allow more severe influenza epidemics to occur in upcoming winters in temperate locations.
Infectious disease surveillance systems provide historical information on the occurrence of infections and allow early detection of influenza outbreaks while they can still be contained. Surveillance strategies that map out individual contact behaviors fall into two general categories: those based on static contact networks and those based on temporal contact networks. Because contact networks are inherently dynamic and have temporal-network structures [such as burstiness, in which individual activity is concentrated in periods of intense activity (2)], the surveillance problem is more practically framed in the context of temporal contact networks.
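Burstiness can be quantified from the gaps between consecutive contact events. The sketch below uses the widely used coefficient B = (σ − μ)/(σ + μ), where μ and σ are the mean and standard deviation of the inter-event times. That this is the definition behind the values reported here is an assumption, since the text does not state the formula, and the event times are toy values.

```python
from statistics import mean, pstdev
from typing import Sequence


def burstiness(event_times: Sequence[float]) -> float:
    """Burstiness coefficient B = (sigma - mu) / (sigma + mu) of inter-event times.

    B is -1 for perfectly regular events, close to 0 for Poisson-like
    events, and approaches 1 for highly bursty activity.
    """
    times = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        raise ValueError("need at least two events")
    mu, sigma = mean(gaps), pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all events are simultaneous")
    return (sigma - mu) / (sigma + mu)


# Toy event times (assumed values): evenly spaced versus clustered contacts.
regular = burstiness([0, 10, 20, 30])
bursty = burstiness([0, 1, 2, 3, 100, 101, 102, 103])
print(regular, round(bursty, 2))
```

Evenly spaced contacts give the minimum value, whereas two tight clusters separated by a long pause give a clearly positive value.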
Retrospective studies have demonstrated that temporal network structures influence not only the spreading speed and the outbreak size but also surveillance strategies (3–5). Bai et al. compared two temporal-network strategies for selecting sentinels [sampling the most recent contact (the recent strategy) or the most frequent contact (the frequent strategy) of randomly chosen individuals (2)] with two static-network strategies (acquaintance and random) for sentinel surveillance of outbreaks on temporal networks (3). On networks with strong temporal structures, both temporal-network strategies yielded earlier signals for epidemic detection than the static-network strategies. However, because physical contact data are difficult to obtain, these strategies are difficult to apply in practical public health systems.
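The recent and frequent strategies can be sketched on a list of timestamped contacts. The code below is a minimal illustration under assumed conventions: contacts are `(time, u, v)` tuples from the training window, ties are broken arbitrarily, and `select_sentinels` is a hypothetical helper rather than the procedure used in the cited studies.

```python
import random
from collections import Counter
from typing import Callable, Iterable, List, Optional, Set, Tuple

Contact = Tuple[int, str, str]  # (time, node u, node v)


def recent_contact(contacts: List[Contact], node: str) -> Optional[str]:
    """Partner in the latest contact involving `node`, or None."""
    partners = [(t, v if u == node else u) for t, u, v in contacts if node in (u, v)]
    return max(partners)[1] if partners else None


def frequent_contact(contacts: List[Contact], node: str) -> Optional[str]:
    """Partner with the most contacts with `node`, or None."""
    counts = Counter(v if u == node else u for _, u, v in contacts if node in (u, v))
    return counts.most_common(1)[0][0] if counts else None


def select_sentinels(contacts: List[Contact], nodes: Iterable[str], k: int,
                     pick: Callable[[List[Contact], str], Optional[str]],
                     rng: random.Random) -> Set[str]:
    """Draw random individuals and monitor the contact chosen by `pick`."""
    candidates = list(nodes)
    rng.shuffle(candidates)
    sentinels: Set[str] = set()
    for node in candidates:
        if len(sentinels) >= k:
            break
        chosen = pick(contacts, node)
        if chosen is not None:
            sentinels.add(chosen)
    return sentinels


# Toy training-window contacts (assumed values).
contacts = [(1, "a", "b"), (2, "a", "b"), (3, "a", "c"), (4, "b", "d")]
recent_of_a = recent_contact(contacts, "a")
frequent_of_a = frequent_contact(contacts, "a")
sentinels = select_sentinels(contacts, "abcd", 2, recent_contact, random.Random(0))
print(recent_of_a, frequent_of_a, sorted(sentinels))
```

Both selectors need the contact list itself, which is the data requirement that makes these strategies hard to deploy.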
For detecting an early signal of an emerging outbreak through sentinel surveillance, digital data from Electronic Health Records (EHR) provide a unique chance to test cutting-edge sentinel surveillance strategies. EHRs of influenza infections can help detect other viruses and carry temporal information, because they record when individuals were infected. Our previous study found that treatment records can be used to monitor emerging epidemic outbreaks (e.g., influenza) and proposed a simple EHR-based strategy that identifies the most vulnerable individuals, namely those who acquired the earliest infections during historical influenza seasons and who could form a theoretically optimal surveillance subset (6). However, that study did not include validation with real-world data, and it did not account for temporal contact networks (in which the contact structure may not be persistent enough) or for cross-strain immunity (which could be gained during an influenza season and protect previously infected individuals from reinfection by a group of strains).
In the current study, we developed a practical, data-driven surveillance strategy that targets previously infected nodes with low cross-strain immunity, accelerating outbreak detection through sentinel surveillance of the individuals infected earliest in previous seasons. We validated this strategy with a real-world cohort study and further validated it by simulations using mathematical epidemic models in temporal networks. We quantified the early warning and peak lag gained by these selected individuals across transmission scenarios with different effective reproduction numbers (Re).
[2] Lee S, Rocha LEC, Liljeros F, Holme P. Exploiting temporal network structures of human interaction to effectively immunize populations. PLoS One 2012;7(5):e36439. http://dx.doi.org/10.1371/journal.pone.0036439.
[3] Bai Y, Yang B, Lin LJ, Herrera JL, Du ZW, Holme P. Optimizing sentinel surveillance in temporal network epidemiology. Sci Rep 2017;7(1):4804. http://dx.doi.org/10.1038/s41598-017-03868-6.
[4] Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature 2005;438(7066):355-9. http://dx.doi.org/10.1038/nature04153.
[5] Gao C, Zhu JY, Zhang F, Wang Z, Li XL. A novel representation learning for dynamic graphs based on graph convolutional networks. IEEE Trans Cybern 2022. http://dx.doi.org/10.1109/TCYB.2022.3159661.
[6] Du ZW, Bai Y, Wang L, Herrera-Diestra JL, Yuan ZL, Guo RZ, et al. Optimizing COVID-19 surveillance using historical electronic health records of influenza infections. PNAS Nexus 2022;1(2):pgac038. http://dx.doi.org/10.1093/pnasnexus/pgac038.
[7] Tsang TK, Perera RAPM, Fang VJ, Wong JY, Shiu EY, So HC, et al. Reconstructing antibody dynamics to estimate the risk of influenza virus infection. Nat Commun 2022;13(1):1557. http://dx.doi.org/10.1038/s41467-022-29310-8.
[8] Liang J, Li Y, Zhang ZA, Shen DX, Xu J, Zheng X, et al. Adoption of Electronic Health Records (EHRs) in China during the past 10 years: consecutive survey data analysis and comparison of Sino-American challenges and experiences. J Med Internet Res 2021;23(2):e24813. http://dx.doi.org/10.2196/24813.