be.Prepared (Belgian Preparedness Architecture for Infectious Diseases) is an overarching Belgian infrastructure that facilitates the integration of health and biological data from various sources with whole genome sequencing (WGS) and clinical/epidemiological data to strengthen genomic surveillance and public health response. The infrastructure consists of several interconnected components, including (i) a cloud-based bioinformatics platform for processing microbial WGS data using harmonized bioinformatics pipelines, (ii) a local NRC-usability platform supporting different pathogens, and (iii) the healthdata.be platform for collecting clinical/epidemiological data. Key characteristics of the infrastructure include automated data flows, near real-time data processing to enable integrated genomic-epidemiological analysis, and scalability to support pandemic preparedness. Here, we present two major components of the infrastructure: the bioinformatics platform and the NRC-usability platform.
The bioinformatics platform is hosted in the Azure cloud. It processes pseudonymized microbial WGS data (supporting both Illumina and nanopore sequencing), including removal of human DNA and extensive quality control, and runs state-of-the-art pathogen-specific pipelines to obtain high-quality genomic indicators (core genome multilocus sequence typing (cgMLST) profiles, antimicrobial resistance (AMR) genes, serotypes, etc.), which are stored in a genomic database. The platform periodically reanalyzes existing datasets when the underlying reference sequence databases are updated, ensuring that up-to-date genomic indicators remain available. For bacterial pathogens, genomic clusters are detected automatically using single-linkage clustering of cgMLST profiles, alerting experts at the National Reference Centers for human microbiology (NRCs) to potentially ongoing outbreaks that warrant epidemiological investigation. The cloud-based nature of the platform makes it scalable, allowing it to process increased data volumes during outbreaks and even pandemics.
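The cluster detection step described above can be illustrated with a minimal sketch. It assumes that cgMLST profiles are lists of allele numbers (with None for uncalled loci), that distance is the number of loci at which two profiles differ, and that a fixed allele-difference threshold defines a link; the function names, the handling of missing loci, and the threshold value are hypothetical and are not taken from the be.Prepared pipelines.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci where both profiles have a called allele and the alleles differ."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)


def single_linkage_clusters(profiles, threshold):
    """Group samples transitively: two samples share a cluster if a chain of
    pairwise distances <= threshold connects them (single linkage).

    profiles: dict mapping sample ID -> list of allele numbers (None = missing).
    Returns a list of sets, keeping only clusters with at least two samples.
    """
    parent = {sample: sample for sample in profiles}

    def find(sample):
        # Follow parent links to the cluster representative, halving the path.
        while parent[sample] != sample:
            parent[sample] = parent[parent[sample]]
            sample = parent[sample]
        return sample

    # Merge the clusters of every pair of samples within the threshold.
    for s, t in combinations(profiles, 2):
        if allele_distance(profiles[s], profiles[t]) <= threshold:
            parent[find(s)] = find(t)

    clusters = {}
    for sample in profiles:
        clusters.setdefault(find(sample), set()).add(sample)
    return [members for members in clusters.values() if len(members) > 1]


if __name__ == "__main__":
    # Hypothetical five-locus profiles; real cgMLST schemes have far more loci.
    example = {
        "A": [1, 1, 1, 1, 1],
        "B": [1, 1, 1, 1, 2],
        "C": [1, 1, 1, 2, 2],
        "D": [5, 6, 7, 8, 9],
    }
    print(single_linkage_clusters(example, threshold=1))
```

In this example, A and C differ at two loci but still fall in the same cluster because each is within one allele of B. This chaining is the defining property of single linkage and is why a growing outbreak cluster can absorb gradually diverging isolates.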
The resulting genomic indicators are automatically transferred to the NRC-usability platform, where they are coupled with clinical/epidemiological data that are also automatically deposited on this platform via healthdata.be. The NRC-usability platform consists of different pathogen-specific instances, each accessible only to the responsible NRC. This local platform is based on the Bacterial Isolate Genome Sequence Database (BIGSdb) open-source software, modified to accommodate pathogen-specific genomic indicators and health data. It allows extensive data exploration and visualization. Finally, the NRC-usability platform is also connected to an internal instance of Microreact to help the NRCs follow the spatio-temporal evolution of infections at the national scale.
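The coupling of genomic indicators with clinical/epidemiological data can be pictured as a join on a shared pseudonymized identifier. The sketch below is only an illustration under that assumption: the field name sample_id, the record layout, and the function couple_records are hypothetical and do not describe the actual BIGSdb-based implementation or the healthdata.be data model.

```python
def couple_records(genomic, clinical):
    """Join genomic indicator records with clinical/epidemiological records.

    Both inputs are lists of dicts carrying a shared, pseudonymized
    "sample_id" key (a hypothetical field name). Returns the coupled
    records and the IDs of genomic records without a clinical match.
    """
    clinical_by_id = {record["sample_id"]: record for record in clinical}
    coupled, unmatched = [], []
    for indicators in genomic:
        sample_id = indicators["sample_id"]
        record = clinical_by_id.get(sample_id)
        if record is None:
            # Clinical data may arrive later; keep track of what is pending.
            unmatched.append(sample_id)
        else:
            coupled.append({**record, **indicators})
    return coupled, unmatched


if __name__ == "__main__":
    genomic = [
        {"sample_id": "P001", "serotype": "Enteritidis"},
        {"sample_id": "P002", "serotype": "Typhimurium"},
    ]
    clinical = [{"sample_id": "P001", "province": "Antwerp"}]
    print(couple_records(genomic, clinical))
```

Tracking unmatched identifiers separately reflects the fact that the two data streams arrive independently, so a genomic result may temporarily lack its clinical counterpart.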
be.Prepared currently supports genomic epidemiology for five microbial pathogens, each covered by specific NRCs: Salmonella enterica, Neisseria meningitidis, Listeria monocytogenes, Mycobacterium tuberculosis and influenza virus. Efforts are ongoing to expand to additional pathogens. This initiative demonstrates the feasibility of microbial genomic surveillance and outbreak detection at a national scale.