-
The reproduction number ($ R $) is a fundamental metric in the study of infectious disease outbreaks, epidemics, and pandemics. Although many methods are available for estimating $ R $, both newcomers and established public health professionals often have difficulty understanding when each method applies and what its limitations are. This review therefore offers basic guidance on selecting and applying methods for estimating $ R $. We searched PubMed and Web of Science using the following search strategy: [“Basic Reproduction Number/classification”(Mesh)] AND [“Basic Reproduction Number/prevention and control”(Mesh)] OR [“Basic Reproduction Number/statistics and numerical data”(Mesh)]. The search was restricted to articles published from January 2013 to January 2023. It returned 7,094 articles, of which we selected 60 that met our inclusion criteria for further analysis.
-
The direct method estimates $ R $ from a clearly identified transmission chain. The transmission rate ($ \beta $) is the product of the transmission probability per contact ($ p $) and the contact rate ($ c $), and $ {R}_{0} $ is the product of $ \beta $ and the infectious period ($ D $) (11,22):
$$ \beta =pc $$
$$ {R}_{0}=\beta D=pcD $$
The direct method suits scenarios with few case generations within a short time frame, or small sample sizes during the early phase of an epidemic or outbreak. Researchers can then calculate $ R $ separately for each possible transmission chain, analyze the distribution of $ R $, and evaluate how much each chain contributes to the spread of the disease. However, the direct method is prone to bias from small sample sizes and does not capture variation over time. Underreporting and fragmented data also pose problems in real-time evaluations (23).
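As a minimal sketch of the direct calculation, the snippet below applies $ {R}_{0}=pcD $ to a few transmission chains and summarizes the resulting distribution. The chain values and the function name `direct_r0` are hypothetical and chosen only for illustration; they do not come from any dataset in this review.

```python
from statistics import mean

def direct_r0(p: float, c: float, d: float) -> float:
    """Direct method: R0 = beta * D, with beta = p * c."""
    beta = p * c          # transmission rate per unit time
    return beta * d       # multiply by the infectious period

# Hypothetical transmission chains (illustrative values only):
# p = transmission probability per contact, c = contacts per day,
# d = infectious period in days.
chains = [
    {"p": 0.05, "c": 10.0, "d": 5.0},
    {"p": 0.02, "c": 20.0, "d": 5.0},
    {"p": 0.10, "c": 8.0, "d": 4.0},
]

# One R value per chain, then a simple summary of their distribution.
r_values = [direct_r0(ch["p"], ch["c"], ch["d"]) for ch in chains]
mean_r = mean(r_values)
print(r_values, mean_r)
```

With so few chains the mean is sensitive to any single chain, which is the small-sample bias noted above.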
Implementation of the Direct Method:
-
The definition-based method (DBM) is an indirect approach to estimating $ R $. It is applied to various transmission dynamics models, including the Susceptible-Infectious-Recovered (SIR) model, the Susceptible-Exposed-Infectious-Recovered (SEIR) model, the Susceptible-Infectious-Recovered-Cross immune (SIRC) model, and the Susceptible-Exposed-Infectious-Susceptible (SEIS) model (24-28). Taking the SIR model as an example:
$$ \frac{dS}{dt}={b}_{r}N-\frac{\beta SI}{N}-{d}_{r}S $$
$$ \frac{dI}{dt}=\frac{\beta SI}{N}-\gamma I-{d}_{r}I $$
$$ \frac{dR}{dt}=\gamma I-{d}_{r}R $$
The number of secondary infections generated by one infected individual per unit of time is $ \beta S/ N $, which corresponds to the inflow process. The rate at which an infected individual recovers or dies naturally per unit of time is $ \gamma +{d}_{r} $, which corresponds to the outflow process. Thus, $ {R}_{eff} $ can be calculated as follows:
$$ {R}_{eff}=\frac{Inflow\;process}{Outflow\;process}=\frac{\beta S}{N}\times \frac{1}{\gamma +{d}_{r}}=\frac{\beta S}{\left(\gamma +{d}_{r}\right)N} $$
$ {R}_{0} $ is the value of $ R $ when nearly the entire population is susceptible, that is, when $ S $ is approximately equal to $ N $:
$$ {R}_{0}=\frac{\beta }{{d}_{r}+\gamma } $$
The DBM calculates $ R $ by expressing it as a function of model parameters. This approach is valuable in the advanced stages of an epidemic because it yields results with strong explanatory power. However, it applies only to single-host and single-kinetic models, which restricts its use in multi-host or co-kinetic models. The DBM incorporates both the natural history of the disease and demographic parameters, which makes it meaningful for predicting and preventing outbreaks. It is also simple, easy to understand, and has minimal hardware or software requirements.
-
The next-generation method (NGM) is a widely used approach for estimating $ R $. It takes the maximum eigenvalue of the next-generation matrix of a dynamic model, following the method proposed by van den Driessche and Watmough (29-33). NGM is frequently applied across a range of dynamic models including, but not limited to, the SIR and SEIS models (25). It also gives quantitative accounts of secondary infections and can estimate the percentage of undetected cases across diverse outbreak scenarios (29,34). Compartments within these dynamic models are grouped by infectivity: the ‘x-group’ contains the infectious compartments, whereas the ‘y-group’ contains the non-infectious compartments. The equations for these groups are:
$$ \frac{{dx}_{i}}{dt}={F}_{i}\left(x,y\right)-{V}_{i}\left(x,y\right),\quad i=1,\dots ,n $$
$$ \frac{{dy}_{j}}{dt}={G}_{j}\left(x,y\right),\quad j=1,\dots ,m $$
Here $ {F}_{i} $ represents newly infected individuals entering compartment $ i $, and $ {V}_{i} $ represents individuals who move to other compartments. To illustrate NGM, we continue with the SIR model, where $ n=1 $ and $ m=2 $, with $ x=I $ and $ y=(S,R) $. The corresponding equations are:
$$ {F}_{1}=\frac{\beta SI}{N} $$
$$ {V}_{1}=\gamma I+{d}_{r}I $$
$$ {G}_{1}={b}_{r}N-\frac{\beta SI}{N}-{d}_{r}S $$
$$ {G}_{2}=\gamma I-{d}_{r}R $$
Differentiating $ {F}_{1} $ and $ {V}_{1} $ with respect to $ I $ gives the Jacobian matrices $ F=\beta S/ N $ and $ V=\gamma +{d}_{r} $. $ {R}_{eff} $ is the spectral radius, that is, the leading eigenvalue, of the next-generation matrix $ F{V}^{-1} $ (25):
$$ {R}_{eff}=\rho \left(F{V}^{-1}\right)=\frac{F}{V}=\frac{\frac{\beta S}{N}}{\gamma +{d}_{r}}=\frac{\beta S}{\left(\gamma +{d}_{r}\right)N} $$
$$ {R}_{0}=\frac{\beta }{\gamma +{d}_{r}} $$
Nevertheless, applying NGM to multi-group or multi-host compartmental models has limitations. The method only determines the stability threshold of the disease-free equilibrium and lacks explicit explanatory power. Small data sets from the initial phase of an epidemic may omit pivotal information, and the quality and reliability of NGM results improve as more data accumulate over time. Researchers must therefore adapt their methodology to the specific scenario. For instance, when studying diseases such as hand, foot, and mouth disease, it may be reasonable to leave out factors such as the short disease duration, the mobility of patients, and spatial structure.
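The definition-based and next-generation calculations for the SIR model can be checked against each other numerically. The sketch below computes $ {R}_{eff} $ as inflow divided by outflow and as the spectral radius of $ F{V}^{-1} $, using NumPy. The parameter values and function names are illustrative assumptions. The two-compartment SEIR example, with a progression rate `sigma` from exposed to infectious, is also an assumption added here to show a case where $ F $ and $ V $ are genuine matrices; it is not worked through in the text.

```python
import numpy as np

def r_eff_definition(beta, gamma, d_r, s, n):
    """Definition-based method: inflow (beta*S/N) over outflow (gamma + d_r)."""
    return (beta * s / n) / (gamma + d_r)

def r_next_generation(F, V):
    """Next-generation method: spectral radius of F V^{-1}."""
    F = np.atleast_2d(np.asarray(F, dtype=float))
    V = np.atleast_2d(np.asarray(V, dtype=float))
    ngm = F @ np.linalg.inv(V)
    return float(np.max(np.abs(np.linalg.eigvals(ngm))))

# Illustrative parameter values (assumptions, not estimates from data).
beta, gamma, d_r = 0.5, 0.2, 0.05
n, s = 1000.0, 600.0

# SIR: F and V are 1x1, so both methods give the same scalar.
r_def = r_eff_definition(beta, gamma, d_r, s, n)
r_ngm = r_next_generation([[beta * s / n]], [[gamma + d_r]])

# Assumed SEIR example at the disease-free equilibrium (S = N),
# infected compartments ordered as (E, I).
sigma = 0.25
F_seir = [[0.0, beta], [0.0, 0.0]]                    # new infections enter E
V_seir = [[sigma + d_r, 0.0], [-sigma, gamma + d_r]]  # transitions out of E and I
r0_seir = r_next_generation(F_seir, V_seir)
r0_seir_closed = beta * sigma / ((sigma + d_r) * (gamma + d_r))

print(r_def, r_ngm, r0_seir, r0_seir_closed)
```

For the single infected compartment of the SIR model the matrix machinery adds nothing over the definition-based formula; it becomes useful once there are several infected compartments, as in the assumed SEIR case.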
-
The final-size equation (FSE) relates the outcome of an epidemic to $ {R}_{0} $, taking into account the proportions of susceptible and recovered individuals. In the SIR model, the formula is:
$$ {R}_{0}=\frac{\ln \dfrac{{S}_{0}}{{S}_{\infty }}}{1-{S}_{\infty }} $$
where $ {S}_{0} $ and $ {S}_{\infty } $ are the initial and final proportions of susceptible individuals. FSE is often used with the SIR model to determine the ultimate scale of an epidemic (35). Because it needs only these two proportions and has a simple form, it is well suited to initial estimates after an epidemic has ended. However, it is model-specific and must be derived afresh for other models, which can be challenging for complex dynamic models.
It has been shown that the FSE has a unique solution in three mean-field models, namely homogeneous, pairwise, and heterogeneous. Moreover, linearizing the FSE transforms optimal vaccination problems into simpler knapsack problems, yielding practical insights for decision-makers and the general public when considering vaccination strategies (36-37). However, no R package incorporating displacement or interaction is yet available for calculating $ {R}_{t} $ with the FSE approach (38).
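The SIR final-size formula above can be evaluated directly, and inverted numerically to predict the final susceptible proportion for a given $ {R}_{0} $. The following is a minimal sketch under stated assumptions: the function names and the example proportions are hypothetical, and the inverse uses plain bisection, which assumes $ {S}_{0}<1 $ and a moderate $ {R}_{0} $ so that the root is bracketed.

```python
import math

def r0_from_final_size(s0: float, s_inf: float) -> float:
    """R0 = ln(S0 / S_inf) / (1 - S_inf), with S0 and S_inf as proportions."""
    if not (0.0 < s_inf < s0 <= 1.0):
        raise ValueError("require 0 < s_inf < s0 <= 1")
    return math.log(s0 / s_inf) / (1.0 - s_inf)

def final_susceptible_fraction(r0: float, s0: float, iterations: int = 200) -> float:
    """Solve ln(s0/s) = r0 * (1 - s) for s in (0, s0) by bisection (needs s0 < 1)."""
    def g(s: float) -> float:
        return math.log(s0 / s) - r0 * (1.0 - s)
    lo, hi = 1e-12, s0            # g(lo) > 0 for moderate r0, g(hi) < 0 when s0 < 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid              # root lies to the right of mid
        else:
            hi = mid              # root lies to the left of mid
    return 0.5 * (lo + hi)

# Hypothetical epidemic: 99.9% susceptible at the start, 20% never infected.
r0 = r0_from_final_size(0.999, 0.2)
s_inf_back = final_susceptible_fraction(r0, 0.999)
print(r0, s_inf_back)
```

The round trip from proportions to $ {R}_{0} $ and back is a quick consistency check; it does not validate the SIR assumptions themselves.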
-
The generation interval-based method is frequently used to estimate $ {R}_{t} $ in epidemiology. It relies on the generation interval, defined as the time between the infection of a primary case and the resulting infection of a secondary case. The method simplifies the natural history of the disease by concentrating on the distribution of time intervals between generations. Two key indicators are used: the generation interval (GT) and the serial interval (SI). GT is the time between the infection events in an infector-infectee pair, whereas SI is the time between the symptom onsets in such a pair (39). Accurate estimation of GT is demanding because it depends on an exhaustive investigation of contact history (40). In comparison, SI is easier to determine because symptoms can be readily detected during field epidemiological surveys (41). By quantifying the relationship between generations using SI, researchers can estimate $ {R}_{t} $, $ {R}_{eff} $, and $ {R}_{0} $ (42-44).
Several R (version 4.3.0, R Core Team, Vienna, Austria) packages, namely EpiEstim, EpiNow2, and R0, currently compute reproduction numbers based on GT or SI (15,45-46), which significantly lowers the barrier to using this method. We have developed an interactive application for users unfamiliar with the R language, particularly grassroots disease control staff. This application, called Reproduction Number Calculator, gives access to these R packages without requiring programming knowledge (available at https://toolbox.ctmodelling.cn/).
However, the method has inherent limitations. Inaccuracies may arise if the assumed distribution of intergenerational times does not reflect the dynamics of the disease (42). Uncertainty in this distribution can lead to an underestimation of the uncertainty in $ R $ (15). Neglecting herd immunity and the stages of infection can bias estimates of $ {R}_{eff} $ (42). Further, the method requires clear transmission chains, comprehensive and timely data, and an accurate assumption about the intergenerational time distribution. These requirements may limit its utility in certain scenarios.
In conclusion, the generation interval-based method provides valuable insights into disease transmission dynamics and facilitates the estimation of $ {R}_{t} $, $ {R}_{eff} $, and $ {R}_{0} $. However, researchers should interpret the results with caution and consider the assumptions and data requirements associated with the method.
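To make the idea of relating successive generations through the serial interval concrete, the sketch below estimates $ {R}_{t} $ as the ratio of new cases on day $ t $ to the past cases weighted by a discrete serial-interval distribution (a renewal-equation ratio). This is a simplified illustration only: it is not the algorithm or API of EpiEstim, EpiNow2, or R0, it has no smoothing or uncertainty quantification, and the weights `si_weights`, the function name `estimate_rt`, and the synthetic case series are all assumptions.

```python
import numpy as np

def estimate_rt(incidence, si_weights):
    """Point estimate R_t = I_t / sum_s w_s * I_{t-s} (no smoothing, no intervals)."""
    incidence = np.asarray(incidence, dtype=float)
    w = np.asarray(si_weights, dtype=float)
    w = w / w.sum()                                  # normalize to a probability mass
    rt = np.full(len(incidence), np.nan)             # day 0 has no history
    for t in range(1, len(incidence)):
        k = min(t, len(w))                           # available look-back window
        past = incidence[t - k:t][::-1]              # I_{t-1}, I_{t-2}, ..., I_{t-k}
        pressure = float(np.dot(w[:k], past))        # weighted past incidence
        if pressure > 0:
            rt[t] = incidence[t] / pressure
    return rt

# Synthetic data: generate cases from the same renewal relation with a known R.
w = np.array([0.2, 0.5, 0.3])   # assumed serial-interval weights for lags 1..3 days
true_r = 1.5
cases = [10.0]
for t in range(1, 15):
    k = min(t, len(w))
    past = np.array(cases[t - k:t][::-1])
    cases.append(true_r * float(np.dot(w[:k], past)))

rt = estimate_rt(cases, w)
print(rt)
```

On this noise-free synthetic series the estimator recovers the reproduction number used to generate it; with real surveillance data, reporting delays and a misspecified serial interval would distort the ratio, which is the limitation discussed above.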
Methodology Based on Definitions:
Methodology Based on Next-Generation:
Equation for Determining Final Size:
Methodology Based on Generation Intervals: