TY - JOUR
T1 - A Bioinformatics Whole-Genome Sequencing Workflow for Clinical Mycobacterium tuberculosis Complex Isolate Analysis, Validated Using a Reference Collection Extensively Characterized with Conventional Methods and In Silico Approaches
JF - J Clin Microbiol
Y1 - 2021
A1 - Bert Bogaerts
A1 - Thomas Delcourt
A1 - Karine Soetaert
A1 - Samira Boarbi
A1 - Pieter-Jan Ceyssens
A1 - Raf Winand
A1 - Julien Van Braekel
A1 - Sigrid C.J. De Keersmaecker
A1 - Nancy Roosens
A1 - Marchal, Kathleen
A1 - Vanessa Mathys
A1 - Kevin Vanneste
KW - Antimicrobial resistance
KW - Mycobacterium tuberculosis
KW - national reference center
KW - public health
KW - single nucleotide polymorphism
KW - VALIDATION
KW - whole genome sequencing
AB -

The use of whole genome sequencing (WGS) for routine typing of bacterial isolates has increased substantially in recent years. For Mycobacterium tuberculosis (MTB) in particular, WGS has the benefit of drastically reducing the time to generate results compared to most conventional phenotypic methods. Consequently, a multitude of solutions for analyzing WGS MTB data have been developed, but their successful integration in clinical and national reference laboratories is hindered by the requirement for their validation, for which a consensus framework is still largely absent. We developed a bioinformatics workflow for (Illumina) WGS-based routine typing of MTB complex (MTBC) member isolates, allowing complete characterization including (sub)species confirmation and identification (16S, csb/RD, hsp65), single nucleotide polymorphism (SNP)-based antimicrobial resistance (AMR) prediction, and pathogen typing (spoligotyping, SNP barcoding, and core genome multilocus sequence typing). Workflow performance was validated on a per-assay basis using a collection of 238 in-house sequenced MTBC isolates, extensively characterized with conventional molecular biology-based approaches and supplemented with public data. For SNP-based AMR prediction, results from molecular genotyping methods were supplemented with in silico modified datasets, which greatly increased the set of evaluated mutations. The workflow performed very well, with performance metrics >99% for all assays except spoligotyping, where sensitivity dropped to ∼90%. The validation framework for our WGS-based bioinformatics workflow can aid the standardization of bioinformatics tools by the MTB community and for other SNP-based applications, regardless of the targeted pathogen(s). The bioinformatics workflow is available for academic and non-profit usage through the Galaxy instance of our institute at https://galaxy.sciensano.be.

M3 - 10.1128/JCM.00202-21
ER -