Wednesday, 31 August 2016 17:45

Microbial Genomics and the Future of Food Microbiology

Written by Julie Wolf
Published in mBiosphere

There’s no question that foodborne disease is a serious problem. Illnesses from contaminated foods cause over 128,000 hospitalizations and 3,000 deaths each year in the United States. The Centers for Disease Control and Prevention estimates that one in six Americans suffers a food-related illness each year, so even those of us without a doctor’s bill may miss work due to an irritated digestive system. In addition to missed work days, recalls of contaminated ingredients are extremely costly, with millions of pounds of food taken out of production each year. Food safety affects our health and economy at the individual and societal levels.

Imagine, then, working at a government agency charged with protecting citizens from potentially contaminated foods. The Food and Drug Administration (FDA) monitors the safety of food production locations, tracks down the sources of outbreaks, and issues recalls of contaminated products, making this organization key to maintaining health and safety. As recently as 10 years ago, Eric Brown, Director of the Division of Microbiology in the Office of Regulatory Science at the FDA, and his staff relied on pulsed-field gel electrophoresis (PFGE), a technique developed in the 1980s, to distinguish bacteria isolated from contaminated food. This technique distinguishes between most bacterial strains, but finding links between contaminated food and illness can be problematic when strains can’t be differentiated.

Jalapenos were the source of a hard-to-pinpoint 2008 Salmonella outbreak.

PFGE relies on different restriction patterns between strains: the same restriction enzyme applied to two different genomes will yield different-sized DNA fragments. However, the fragments themselves are indistinguishable other than by size, and don’t reveal any single-nucleotide polymorphisms (SNPs) that may exist between two isolates. If PFGE can’t distinguish between two isolates, identifying the source of a contaminant becomes almost impossible. To address this problem, Brown and his colleagues have pioneered the use of techniques that are changing the field of food microbiology.
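To make the limitation concrete, here is a minimal sketch of an in-silico digest. It is only an illustration, not the FDA's method: the sequences are invented toy strings, the genome is treated as linear rather than circular, the cut is placed at the start of each recognition site, and the helper names (`digest`, `with_snp`, `snp_count`) are hypothetical. The recognition site TCTAGA (the XbaI site) is used purely as an example.

```python
SITE = "TCTAGA"  # example recognition site (XbaI); chosen for illustration only


def digest(genome: str, site: str) -> list[int]:
    """Return sorted fragment lengths after cutting a linear sequence
    at the start of every occurrence of `site` (simplified model)."""
    cuts = []
    pos = genome.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = genome.find(site, pos + 1)
    bounds = [0] + cuts + [len(genome)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def with_snp(seq: str, pos: int, base: str) -> str:
    """Return a copy of `seq` with the base at `pos` replaced by `base`."""
    return seq[:pos] + base + seq[pos + 1:]


def snp_count(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Invented toy "genomes": isolate_b carries a SNP between the two sites,
# isolate_c carries a SNP inside the second site.
isolate_a = "ACGTACGTAC" + SITE + "GGCCTTAAGGCC" + SITE + "TTGACCA"
isolate_b = with_snp(isolate_a, 22, "C")
isolate_c = with_snp(isolate_a, 31, "G")

print(digest(isolate_a, SITE), digest(isolate_b, SITE), snp_count(isolate_a, isolate_b))
print(digest(isolate_c, SITE), snp_count(isolate_a, isolate_c))
```

In this toy case, isolates A and B produce identical fragment sizes even though they differ by one base, so a size-based comparison cannot tell them apart. Isolate C differs from A only because its SNP happens to destroy a recognition site.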


The Summer of Salmonella Saintpaul


Similarity in strains became an issue in 2008, during a nationwide outbreak of Salmonella. The FDA was using conventional tools like PFGE, epidemiology, and traceback to investigate the source of the contaminating strain, and evidence kept pointing back to tomatoes. So strong was the evidence that many companies took tomatoes off their lines or out of their menus, leading to an estimated $100 million loss in the tomato industry, but still the outbreak continued. After three months, the cause was finally pinpointed to jalapeno peppers. The pressure to find better surveillance techniques to more quickly identify contaminated foods led Brown and his colleagues to turn to the rapidly evolving technologies of genomic microbiology.

“That 2008 Salmonella Saintpaul event took us through tomatoes and into hot peppers and took an entire summer, but we came out of it much stronger,” says Brown. That’s because this precipitating event led to their exploring novel genomic tools to better guide traceback events. The research team initially introduced next-generation sequencing (NGS) technologies as a way to identify new targets for outbreak differentiation, by generating whole-genome sequences (WGS) for the microbial isolates and identifying nucleotide differences. But it became apparent during another Salmonella outbreak, this time traced to salami, that the sequencing technology itself was the most promising tool.

WGS differentiates every single nucleotide polymorphism (SNP), which other technologies can't do.

The outbreak involved not your run-of-the-mill salami, but an artisanal selection that included nuts, cheeses, and spices as part of its makeup. “Believe it or not,” says Brown, “but all these ingredients had had previous events of the same Salmonella that was coming out of the spiced meat now! It was impossible to figure out which ingredient was the cause, because they all had the same matching Salmonella from the year or two before.” At that point, the researchers turned to WGS to differentiate the isolates. Working with the bioinformatics team at the FDA, the scientists compared patient isolates to those from the current outbreak, quickly ruling out the nuts and cheeses and identifying the black and red pepper as the culprit: “Across 4.6 million base pairs, the black and red pepper were only 1 or 2 base pairs removed from the patient isolates. That kind of forensic power was just unheard of! At that moment, Errol and I realized it was time to join forces and put our groups together and really make this thing work from soup to nuts.”
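The comparison Brown describes comes down to counting nucleotide differences between genomes and seeing which candidate source sits closest to the patient isolates. The sketch below shows that idea on invented data. It assumes the sequences are already aligned to the same length, uses a 20-base toy string in place of a 4.6-million-base genome, and makes up the difference counts for nuts and cheese (the article only gives the 1 or 2 base pairs for the peppers). The function names are hypothetical, and real pipelines involve read mapping and variant calling that are not shown here.

```python
def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mutate(seq: str, positions: list[int]) -> str:
    """Return a copy of `seq` with a substitution at each listed position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Invented toy data: one patient isolate and four candidate ingredient isolates.
patient = "ACGTTGCAAGGCTTACGATC"
candidates = {
    "black pepper": mutate(patient, [3]),
    "red pepper": mutate(patient, [3, 11]),
    "nuts": mutate(patient, [0, 4, 7, 9, 13, 18]),            # made-up count
    "cheese": mutate(patient, [1, 2, 5, 8, 10, 14, 16, 19]),  # made-up count
}

# Distance from the patient isolate to each candidate, closest first.
distances = {name: snp_distance(patient, seq) for name, seq in candidates.items()}
ranked = sorted(distances.items(), key=lambda item: item[1])

for name, dist in ranked:
    print(f"{name}: {dist} SNP(s) from the patient isolate")
```

Ranking by distance puts the pepper isolates at the top of the list, which is the kind of signal that let the team rule out the other ingredients.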

Brown is referring to Errol Strain, the Director of the Biostatistics and Bioinformatics Staff in the Office of Analytics and Outreach at the FDA Center for Food Safety and Applied Nutrition. He and Brown have worked together closely to develop microbial genomics as the primary method for microbial identification and sourcing. Their work paid off during a 2010 outbreak associated with eggs, a major tipping point that convinced management that WGS was the right technology for the FDA. Over 500 million eggs had been recalled, and the FDA had only recently acquired sequencing machines. Traditional molecular methods weren’t able to distinguish between strains, but using these machines to analyze roughly 100 isolates “provided dense resolution, where we could track down to the farm level the isolates associated with this outbreak,” says Strain.

“When we sequenced those isolates, it was literally like flipping on the Hubble telescope,” says Brown about the resolution difference WGS has made.


Collaboration Is Key


Applying WGS for microbial identification requires good techniques at the bench and a strong programming skill set. Both Brown and Strain are quick to acknowledge the importance of collaboration. The work requires collaboration within teams because no individual has all the skills needed to answer the question. Microbiologists, geneticists, bioinformaticists, and epidemiologists must all work together as a team to synthesize data into a useful format.

Collaborations between institutions are necessary to broadly survey food producers and stop potential outbreaks quickly once contamination is detected. As a government agency, the FDA has found natural allies in the Centers for Disease Control and Prevention (CDC), the United States Department of Agriculture, and state governmental agencies. But the forces of globalization in the world economy are reflected in our food supply, and many international alliances have been forged as well.

In 2011, the first Global Microbial Identifier (GMI) meeting was held, hosted by the Danish Food Institute, headed by Jørgen Schlundt. The international meeting allowed scientists working on similar problems to share validation techniques, policy changes, and outreach to emphasize the potential of WGS to world leaders. Brown says that WGS “has been a huge global movement, and what I love about it is that because we came out together, we met early at the beginning and prevented the stovepiping of all of these different tools, different languages, different curation sites, and different databases. Because we did things together globally, the effort has kept us as a world in one public health community.” GMI9 was held this past May in Rome, Italy. In 2015, ASM hosted its first Conference on Rapid Next-Generation Sequencing. Over four days, 350 world experts discussed next-generation sequencing technologies associated with food and clinical health, with so much success that the next meeting is planned for September 2017.

Scientific meetings let experts discuss best practices for analysis pipelines and data storage. There’s a strong open-access culture among the scientists using these techniques, in part because the same data set can potentially answer multiple questions. “Different institutions have different ways they like to analyze the data,” says Strain, “but the raw data is essentially in the same format.” A scientist at the CDC might value time, running a program to produce an answer in two minutes to figure out if two clinical cases of illness are related. Scientists at the FDA may need to serve an injunction to shut down operations of a firm, and may take a little longer to run the data analysis and generate multiple validations. “What’s key is the sharing of the raw data,” says Strain.

Sharing data is what Brown and Strain had in mind when they and others at the FDA initiated the GenomeTrakr network in 2012. This database was initiated as a pilot food safety network, built to hold the genetic information collected by FDA scientists. It now holds nearly 70,000 genomes as the result of collaborations between the FDA, the CDC, and the National Library of Medicine’s National Center for Biotechnology Information (NCBI). These genomic sequences are publicly available, helping the FDA improve food safety, but also helping NCBI identify additional pathogens through its pathogen detection portal. GenomeTrakr itself has sequences fed in from 15 federal labs, 20 state health and university labs, 1 U.S. hospital lab, and 6 labs located outside of the U.S.


From Genotype to Phenotype and Beyond


Once a genome sequence is deposited into a database like GenomeTrakr, the possibilities become almost endless. Of course the sequence can be used to identify species and track isolates during an outbreak, but it can also tell you about the characteristics of that particular microbe: Does it have drug resistance determinants in its sequence? What toxins does it have? What is its serotype? Brown explains, “We’re doing this for outbreak resolution and traceback, and everything else you’re getting for free. We’re generating the data, making the data public, and people can screen it for phenotypic characterizations.” In other words, if you have a question, GenomeTrakr may already hold the answer. And because of the accessible nature of the sequences, the database is quickly becoming a valuable resource for many microbiologists.

Today, all samples must be cultured and have DNA extracted from a pure isolate before the genomic sequence can be read on one of the Illumina MiSeq sequencers housed at the FDA. The future looks different to Brown and Strain, who see the landscape changing in one subfield of NGS: metagenomics. Metagenomics allows assessment of an entire community’s DNA, which in the case of food surveillance includes both microbes and food ingredients. Detecting and measuring pathogen DNA in these mixed sequences without amplification is the next big step for sequencing: to get all the valuable information from a microbial genome without having to purify the contaminant. “The great challenge now is how we get genomics to make that next leap to metagenomics, where metagenomics means culture-independent sequencing of the entire sample,” says Brown. “Once we do that, it will truly be one microbiology.”

Last modified on Wednesday, 07 September 2016 15:32
Julie Wolf

Julie Wolf is the ASM Science Communications Specialist. She contributes to the ASM social media and blog network and hosts the Meet the Microbiologist podcast. She also runs workshops at ASM conferences to help scientists improve their own communication skills. Follow Julie on Twitter for more ASM and microbiology highlights at @JulieMarieWolf.

Julie earned her Ph.D. from the University of Minnesota, focusing on medical mycology and infectious disease. Outside of her work at ASM, she maintains a strong commitment to scientific education and teaches molecular biology at the community biolab, Genspace. She lives in beautiful New York City.