Does a Minimal Genome Exist?
Designing Syn 3.0: The First Synthetic Minimal Bacterial Genome
In 2016, researchers at the J. Craig Venter Institute and Synthetic Genomics, Inc. reported the design and creation of a synthetic cell containing the supposed minimal number of genes required for life.
The scientists called this ‘new’ bacterium Syn 3.0, but its generation was the culmination of 20 years of research: the sequencing of the Mycoplasma genitalium genome in 1995, the identification of nonessential genes in M. genitalium by transposon mutagenesis in 1999, and the creation of the first synthetic cell, named Syn 1.0, in 2010.
As with many synthetic biology endeavors, the researchers undertook an iterative design-build-test cycle to arrive at what they considered the minimal genome to sustain life: a genome of 473 genes. Though M. genitalium has the smallest known mycoplasma genome (with 525 genes), the researchers used its faster-growing relative, M. mycoides, as the organism for manipulations. M. mycoides doubles every 3-4 hours, whereas M. genitalium doubles every 16-24 hours; once constructed, Syn 1.0 had a doubling time of 1 hour.
The researchers had transposon mutagenesis data from M. genitalium to guide the gene reductions, but they still had to take a trial-and-error approach; removing all genes deemed nonessential for life based on transposon mutagenesis did not result in a viable cell. This is because transposon mutagenesis inactivates only one gene at a time, making predictions of cumulative deletions difficult. Synthetic lethality (for example, genes of redundant functions) is a classic example of the additive effect of gene deletions: a deletion of 1 gene is tolerated by the organism, but the deletion of all genes with the same function is lethal.
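The gap between single-gene results and cumulative deletions can be illustrated with a toy model. The sketch below is not the researchers' method; the gene and function names are hypothetical, and viability is assumed to mean simply that every essential function keeps at least one gene able to perform it.

```python
# Hypothetical toy model: each essential function maps to the genes able to perform it.
# All names are invented for illustration only.
FUNCTION_PROVIDERS = {
    "function_A": {"geneA1", "geneA2"},  # redundant pair
    "function_B": {"geneB1"},            # single provider
}


def is_viable(deleted):
    """Viable if every essential function still has at least one provider left."""
    deleted = set(deleted)
    return all(providers - deleted for providers in FUNCTION_PROVIDERS.values())


def nonessential_by_single_knockout():
    """Genes whose individual deletion is tolerated (what a one-gene-at-a-time screen reports)."""
    all_genes = set().union(*FUNCTION_PROVIDERS.values())
    return sorted(gene for gene in all_genes if is_viable({gene}))


nonessential = nonessential_by_single_knockout()
print(nonessential)             # ['geneA1', 'geneA2']
print(is_viable(nonessential))  # False: deleting both removes function_A entirely
```

Each redundant gene looks dispensable on its own, yet removing every gene flagged as nonessential leaves the toy cell without function_A, which is the situation described above.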
To circumvent this, the team divided Syn 1.0’s genome into 8 segments, made deletions to 1 segment, reassembled the reduced segment with the 7 other pieces, and placed it into a recipient, M. capricolum (another relatively fast-growing Mycoplasma species). Fortunately, of the 8 newly constructed genomes, 1 resulted in a viable cell.
To reduce the genome further, the team performed transposon mutagenesis experiments on the hybrid strain. Genes not needed for survival after this step were deleted in further design-build-test cycles, resulting in the final 473-gene Syn 3.0 genome.
A streamlined genome doesn’t necessarily make a more efficient cell. It takes a Syn 3.0 cell 3 hours to divide, compared to the 1-hour replication time of Syn 1.0. Escherichia coli, a bacterium with a genome about 8 times the size of Syn 3.0, replicates in 20 minutes, while Saccharomyces cerevisiae divides in 2 hours despite having 6,275 genes on 16 chromosomes (most of the time). Because E. coli and S. cerevisiae are commonly used in microbial engineering projects for their ease of manipulation, an extremely streamlined organism stripped to the bare bones, such as Syn 3.0, may not be useful for microbial engineering projects. Could streamlined E. coli or S. cerevisiae genomes be useful? Maybe. Several attempts have been made, with mixed results ranging from impaired growth to increased cell biomass in batch cultures.
Minimal Genome of Mesoplasma Florum
In 2018, a team of researchers from the Université de Sherbrooke examined the Mesoplasma florum genome. M. florum also contains an extremely small genome (~800 kb) and replicates quickly, with a doubling time of about 40 minutes. Would a minimal genome derived from M. florum contain similar genes to that of M. mycoides?
Instead of going down the synthetic biology pipeline to identify the minimal genome, these researchers took to computational biology and random transposon mutagenesis to identify essential genes. The computational approach aimed to determine genes important for survival in its plant habitat (fun fact: M. florum was first isolated from the flower of a lemon tree). Using this method, they identified a core genome consisting of 585 genes common between 13 M. florum strains. They then performed transposon mutagenesis to reveal genes essential during laboratory growth. Random transposon mutagenesis on one strain deemed an additional 25 “non core” genes essential. The team has not tested out their minimal genome in cells yet, but when they compared the M. florum genome to that of Syn 3.0, they found some striking differences.
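A core genome of this kind amounts to a set intersection across strains. The following is a minimal sketch under stated assumptions: the strain names, gene-family labels, and the "essential in the lab" set are placeholders, genes are assumed to be already grouped into families comparable across strains, and combining the two gene sets by union is one plausible reading of the text rather than the team's documented procedure.

```python
# Hypothetical gene-family content of three strains (placeholder names, not real data).
strain_genes = {
    "strain_1": {"fam1", "fam2", "fam3", "fam4"},
    "strain_2": {"fam1", "fam2", "fam3", "fam5"},
    "strain_3": {"fam1", "fam2", "fam6"},
}


def core_genome(genes_by_strain):
    """Gene families present in every strain."""
    return set.intersection(*genes_by_strain.values())


core = core_genome(strain_genes)

# Hypothetical transposon result for one strain: families that could not be disrupted.
essential_in_lab = {"fam1", "fam4"}

# Essential genes that fall outside the core, analogous to the "non core" essentials.
noncore_essential = essential_in_lab - core

# Assumed way of combining both lines of evidence into a candidate gene set.
candidate_minimal = core | essential_in_lab

print(sorted(core))               # ['fam1', 'fam2']
print(sorted(noncore_essential))  # ['fam4']
print(sorted(candidate_minimal))  # ['fam1', 'fam2', 'fam4']
```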
No Definitive Minimal Genome
M. florum and M. mycoides are closely related species belonging to the same class, so you might expect that the minimal genomes derived from each species would be quite similar. Indeed, there is much homology between the M. florum and Syn 3.0 genomes: 409 M. florum genes have homologs in Syn 3.0, and 404 of these are part of the M. florum core genome. However, there are still 181 genes from the M. florum core genome that do not have homologs in the Syn 3.0 genome. Moreover, 69 gene families in Syn 3.0 are not found in M. florum, and 57 putatively essential M. florum genes have no homologs in Syn 3.0.
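The counts above can be checked with simple arithmetic. This sketch uses only figures quoted in the article (a 585-gene core genome, 409 homologs, 404 of them in the core); the variable names are illustrative.

```python
# Figures quoted in the article.
core_genome_size = 585   # M. florum core genome
homologs_in_syn3 = 409   # M. florum genes with a homolog in Syn 3.0
homologs_in_core = 404   # of those, genes that belong to the core genome

# Core genes with no Syn 3.0 homolog.
core_without_homolog = core_genome_size - homologs_in_core

# Genes with a Syn 3.0 homolog that lie outside the core genome.
homologs_outside_core = homologs_in_syn3 - homologs_in_core

# Share of the core genome that has a Syn 3.0 homolog.
shared_fraction = homologs_in_core / core_genome_size

print(core_without_homolog)      # 181
print(homologs_outside_core)     # 5
print(f"{shared_fraction:.0%}")  # 69%
```

Roughly two thirds of the M. florum core genome has a counterpart in Syn 3.0, which leaves close to a third without one.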
If these organisms are close in phylogeny, why are their hypothetical minimal genomes so different? Is there no definitive minimal genome? One challenge is the way in which sequences are compared: homology looks at sequence similarity but doesn’t consider gene function. While some functions are essential for life, such as genome replication and protein synthesis, there are multiple nucleotide (and even amino acid) sequences that can encode the same functions. This, along with many genes whose proteins have unknown functions, makes defining the minimal genome difficult.
Another challenge to defining a minimal genome is that many genes encode proteins whose function is only required in certain conditions. For example, the nutrient composition of the growth environment can make certain biosynthesis genes nonessential or vice versa. Syn 3.0 was created in rich medium that supplied nearly all of the essential small molecules, which means that some genes encoding proteins involved in biosynthesis and catabolism, among others, were nonessential, but they would have been required in nutrient-poor environments. In other words: the "perfect" minimal genome is perfect only for the environment in which it’s developed.
Minimal Genomes of Nature
A perfect example of how situational the minimal genome can be is found in the endosymbionts of sap-feeding insects. Buchnera aphidicola, a bacterium that lives inside aphids, encodes 362 proteins from a 422 kb genome. Even smaller is the Carsonella ruddii genome, which encodes 182 genes from 158 kb. As endosymbionts, they live in the cytoplasm of their host cells, piggybacking on functions of the host cell, like transcription. Thus, their own genetic machinery is redundant and imposes a fitness cost. Variants that conserve energy by using the host machinery, rather than making their own, have been selected over time, leading to a greatly reduced genome in the current-day endosymbionts. These bacteria have optimized a minimal genome that is extremely specific for their environment.
In the case of C. ruddii, its genome has been so reduced that it might not even be considered an endosymbiont, let alone a living organism. A mutualistic endosymbiotic lifestyle requires the organism to (1) contain genes to autonomously perform life’s essential functions such as transcription, translation, and DNA replication and (2) provide a benefit to the host. C. ruddii does neither. Some have proposed that C. ruddii may be well on its way to becoming an organelle, as the precursors of mitochondria and chloroplasts did, or to becoming so reduced that it will disappear over time. C. ruddii and other minimal genome endosymbionts provoke this question: at what point does a genome become so minimal that the cell carrying it can no longer be considered a living organism?
An organism’s minimal genome is situational, depending on the context of other genes or environmental cues like growth media or host organisms. This means it’s unlikely that scientists will define a single universal genome that prevails as the smallest genome possible. With the difficulty of pinpointing the exact minimal genome and the contextual parameters that change the genes required in each genome, are minimal genomes even worth pursuing in the lab?
Syn 3.0 has shown us that there is still much to learn about the genome: 149 of 473 genes of the Syn 3.0 genome remain a functional mystery, with no characterized homologs. Many of these unassigned genes have been found in a variety of organisms (both prokaryotes and eukaryotes), yet we have no idea what role they play in life. Could there be entire pathways essential for life that remain undiscovered? While minimal genomes may not be ready to meet practical applications, we still have a lot to learn from creating them.