BACKGROUND: Differences in the genetic material of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants may result in altered virulence characteristics. Assessing the disease severity caused by newly emerging variants is essential to estimate their impact on public health. However, causally inferring the intrinsic severity of infection with variants from observational data is challenging, and guidance on how to do so is still limited. We describe potential limitations and biases that researchers face and evaluate different methodological approaches to study the severity of infection with SARS-CoV-2 variants.
METHODS: We reviewed the literature to identify limitations and potential biases in methods used to study the severity of infection with a particular variant. We illustrate the impact of different methodological choices using real-world data from Belgian hospitalized COVID-19 patients.
RESULTS: We observed different ways of defining the severity of coronavirus disease 2019 (COVID-19) (e.g., admission to the hospital or intensive care unit versus the occurrence of severe complications or death) and exposure to a variant (e.g., linkage of the sequencing or genotyping result with the patient data through a unique identifier versus categorization of patients based on time periods). We identified several potential selection biases (e.g., overcontrol bias, endogenous selection bias, sample truncation bias) and factors that fluctuate across the successive waves of COVID-19, which were dominated by different variants (e.g., medical expertise and therapeutic strategies, vaccination coverage and natural immunity, pressure on the healthcare system, affected population groups). Using data from Belgian hospitalized COVID-19 patients, we documented (i) the robustness of the analyses to different variant exposure ascertainment methods, (ii) indications of selection bias, and (iii) how important confounding variables fluctuate over time.
CONCLUSIONS: When estimating the unbiased marginal effect of SARS-CoV-2 variants on the severity of infection, different strategies can be used and different assumptions can be made, potentially leading to different conclusions. We propose four best practices to identify and reduce potential bias introduced by the study design, the data analysis approach, and the features of the underlying surveillance strategies and data infrastructure.