Foodborne contamination is a major burden on public health worldwide. Until recently, microbial contaminants were analyzed and characterized only after isolation. Since the advent of second-generation sequencing, it has been possible to characterize the complete genome of these isolates down to the SNP level, and whole genome sequencing has been shown to offer higher resolution than any combination of the tests used so far. However, isolation is not always possible, and it is time-consuming. Some years ago, a new method became available that sequences all genetic material in a sample without prior isolation, i.e. shotgun metagenomics. It provides a snapshot of every microbiological contaminant present in the sample at once, possibly also at the SNP level.
At the time this study started, shotgun metagenomics for the study of food contaminants was in its infancy. Strain-level characterization had been achieved in only a handful of studies, and had not been shown to be possible when more than one strain of the same species was present. Moreover, genomes obtained from metagenomic sequences had only rarely been linked to human cases of an outbreak, let alone in the course of a real outbreak. Therefore, this PhD centered on the development of shotgun metagenomics methods to characterize microbiological foodborne contaminants to the strain level and to infer their relatedness (phylogeny), with a focus on applicability so that the method can be easily implemented in reference laboratories in the future.
The method was first tested on minced beef spiked with Shiga toxin-producing E. coli (STEC) at a very low level (5 CFU/25 g). After enrichment for 16 or 24 hours in buffered peptone water, the DNA was extracted with one of two classical commercial kits or with a kit that depletes host DNA. The extracted DNA was then either amplified with Phi29 polymerase or left unamplified. We showed that all sample preparation methods allowed full strain-level characterization of the spiked strain in the beef sample, which also carried a non-pathogenic strain of E. coli.
The simplest protocol was chosen for further studies (i.e. 24 hours of enrichment as specified in the current international standard procedure, classical commercial DNA extraction, and no amplification). Cheese samples were spiked with two different STEC strains; we could cluster the reads corresponding to each strain separately and infer relatedness (phylogeny). However, not all genes harbored in each isolate’s genome could be retrieved when two strains of the same species were present.
The same protocol was then used to investigate a real Salmonella foodborne outbreak. Two food samples were analyzed, and the Salmonella Enteritidis strain linked to the outbreak could be obtained, fully characterized, and related to the food and human isolates from the outbreak in a phylogenetic tree that also contained other Belgian sporadic cases and another Salmonella outbreak occurring in Europe at the same time. We could therefore trace the outbreak to its food source and show the time saved (about two weeks) by shotgun metagenomics compared with conventional methods.
The meat previously spiked with STEC was also used to compare Illumina short-read and Oxford Nanopore Technologies (ONT) long-read sequencing. We showed that the same level of information could be obtained with either technology. ONT, however, offers real-time sequencing: 12 hours were enough to distinguish the STEC strain from the endogenous E. coli strains, whereas an Illumina MiSeq run takes 48 hours. Moreover, the lower-cost Flongle flow cell gave the same results after 24 hours of sequencing when the host-depletion DNA extraction method was used.
We then investigated the detection and characterization of genetically modified microorganisms (GMMs) in microbial fermentation products, as a case study within the broader problem of the spread of antimicrobial resistance (AMR) in the environment. These organisms are genetically modified to enhance the production of a compound (e.g. enzymes, vitamins), and a selection marker is often used to identify the bacteria that have incorporated the modification into their genome. Antimicrobial resistance genes are often used as such markers. The construct may also introduce a dependency on certain growth conditions, which hinders culturing or obtaining an isolate, in particular when the GMM is unknown. For these reasons, shotgun metagenomics was considered a good alternative to detect and characterize the contaminant. No enrichment was conducted on these samples, which are considered non-complex matrices because most of the contaminating DNA should belong to the producing GMM, if present. By sequencing all DNA in the samples, we were able to detect unnatural associations including AMR genes, which confirmed the presence of a GMM and allowed its characterization.
Finally, we investigated the contamination of food by viral pathogens such as norovirus and hepatitis A virus. To detect these RNA viruses, we extracted all RNA from the food (raspberries, bivalves). These samples were not enriched, as it is particularly arduous to culture these viruses under laboratory conditions. Because the contamination level was low in a complex matrix, we tested several methods that could enhance the detection of the virus, either during sample preparation or during sequencing (i.e. adaptive sampling). Overall, we showed that shotgun metagenomics, with or without amplification, gave satisfactory results for moderate contamination levels (higher than 10^7 genome copies). Moreover, depending on the RNA extraction method, it might be used even for lower contamination levels (10^3 genome copies). Finally, targeting norovirus by hybridization capture increased the relative quantity of reads classified as norovirus, but at the expense of a less open approach that can only characterize one viral species in the sample.
Overall, this thesis advanced scientific knowledge about shotgun metagenomics for the study of food contaminants by attaining, for the first time, strain-level resolution in samples containing more than one strain of the same species, with both long-read and short-read sequencing technologies. It also offered proofs of concept of the feasibility of such a method, as requested by EFSA in a recent scientific opinion. This work set a precedent both in resolving an outbreak to its food source using metagenomics at the strain level and in detecting GMMs in microbial fermentation products. Finally, it showed the scientific community which sample preparation methods can and cannot detect viral pathogens at low contamination levels in food samples without enrichment. All proposed protocols were chosen to be as close as possible to those currently used in reference laboratories, so that they can be adapted for routine use in the future. Ultimately, a validation of the method is still necessary in order to obtain a precise limit of detection based on the analysis of a large dataset of samples. Moreover, other technologies can still be investigated to improve the results, in particular when the culture enrichment of the food is skipped, and other public health applications can be explored as well, such as the analysis of human microbiomes or the emergence of new contaminants.