Advancements in genomic research reveal alternative transcription initiation sites in thousands of soybean genes
Rosalind Franklin, James Watson and Francis Crick discovered the structure of DNA — that molecular blueprint for life — over 70 years ago. Today, scientists are still uncovering new ways to read it.
In 2010, Jianxin Ma, a professor of agronomy, and his collaborators built the first reference genome for soybeans on the widely studied Williams 82 variety. Thousands of scientists and plant breeders have since used that genome in their own research on the genetic makeup underlying various characteristics, such as seed protein and oil content, plant architecture and productivity, and disease resistance and abiotic stress tolerance in soybeans.
Over the last decade, Ma, who is the Indiana Soybean Alliance Inc. Endowed Chair in Soybean Improvement, has been recognized internationally for his contribution to the soybean genome as well as for his continued research and innovation in the field. His most recent work, published in The Plant Cell, used advancements in genomic research to fill in gaps in the original soybean reference genome.
“The reference genome was like a dictionary when we announced it,” Ma said. “Each gene was like a single word. However, there was a piece of critical information lacking: transcription initiation sites for individual genes.”
Transcription initiation sites are locations in the DNA where the cell’s transcription machinery, recruited by specialized transcription-factor proteins, attaches and begins building an mRNA copy of the gene downstream. That mRNA is read and translated at a cell’s ribosome to create proteins, which are important for the chemical and physical function of every organism.
Knowing where mRNA synthesis begins on the DNA strand is a significant part of understanding how genes are expressed. These initiation sites contain regulatory elements and provide information to the cell about when and where to transcribe each gene to make protein, and how frequently to do so at any point in time.
In genetics, it has generally been accepted that each gene has one transcription initiation site, located downstream of a core promoter region and typically around a TATA box — a DNA sequence rich in thymine and adenine repeats. But Ma and his colleagues no longer think this is the case.
“There is a set of predicted transcription start sites for over 50,000 genes in soy, but based on our new study, less than 3% of those predicted transcription initiation sites actually are correct,” Ma said.
In 2020, the development of the Survey of TRanscription Initiation at Promoter Elements Sequencing (STRIPE-seq) technique offered Ma’s lab a faster, more efficient and more affordable way to identify transcription initiation sites across the entire soybean genome. It also provided information about the relative abundance of each mRNA, which gives clues as to how strongly a gene is expressed in different tissues and at different times.
With funding from the United States Department of Agriculture’s National Institute of Food and Agriculture (USDA-NIFA) and the National Science Foundation, Ma and his lab performed STRIPE-seq analyses on eight different tissues in soybean: leaves, stems, stem tips, roots, nodules, flowers, pods and developing seeds. Even though the plant’s DNA is consistent across these tissues, the expression of genes differs.
In their recent paper, the Ma lab identified transcription initiation sites for about 40,000 genes in soy. They discovered widespread alternative transcription initiation sites outside of the TATA box region and other sequences thought to be promoters. Some newly identified sites actually occur in the coding sequence of the gene that becomes an mRNA. Thus, transcription can begin at several different sections of the gene, and each resulting mRNA copy differs from ones started at other sites. Each alternative transcription initiation site could potentially create a different protein from the same gene.
One specialized subset of transcription initiation sites the group found was in root nodules, structures on legumes’ roots that host the interaction between the plant and Rhizobia bacteria. These soil-dwelling microbes fix nitrogen for specialized plants like legumes in return for sugars and protection. This symbiosis improves a plant’s survival in nitrogen-deficient soils without the use of nitrogen fertilizers.
“We found these particular transcription initiating sites in nodules, but not in the roots or any other tissues, suggesting they are for tissue-specific transcription and associated with nodule-specific function,” said Ma.
In order for DNA to fit within a cell’s nucleus, it is wound up around histone proteins to form a structure called “chromatin.” Depending on chemical markers placed on these histones, the chromatin can be wound tightly — preventing transcription factors from binding — or loosely, making it accessible for generating mRNA copies. Ma believes that these “epigenetic” changes are working hand-in-hand with the alternative transcription initiation sites in gene expression. Different transcription initiation sites can become available as a gene is tightened or loosened, and different proteins may be created.
“We have found nearly 7,000 genes that have the alternative transcription initiation within the coding sequences. These alternative transcription initiation sites tend to be tissue-specific and associated with histone modifications,” Ma said.
Evolutionarily, these alternative sites may have been beneficial to soybeans and other plants because they allowed for increased complexity and adaptability within a limited genome. Soybeans have experienced two whole-genome duplication events throughout their history, both several million years ago. Although some of the duplicated genes have since been lost, Ma thinks the duplication events may have given rise to altered or alternative transcription sites.
“After duplication, the majority of genes are still in pairs; however, they show different expression patterns, and many have functionally diverged to regulate different traits,” Ma said. “They start to transcribe from different sites, potentially contributing to their functional divergence.”
Currently, Ma is coordinating with USDA Agricultural Research Service scientists Rex Nelson and Jacqueline Campbell on making this research data accessible for others, just as he did with the original reference genome. The group is adding the data to SoyBase, a collaborative online database for soybean research.
Nelson, curator of SoyBase, explained, “Having even a potential transcription start site will aid in the analysis of soybean gene promoter regions. This may shed light on the proteins that interact with promoters and induce transcription.”
Campbell, co-curator of the database, added that “the identification of transcription factors that bind promoter regions will allow researchers to identify gene regulatory interaction networks involved in the complex regulation of genes in agronomically important phenotypes.”
Ma is honored to give back to the research community again. “The database serves as an important resource for both basic and applied research,” he said. “By making our data available there, we catalyze further research in understanding gene functions, regulatory mechanisms, gene networks and genetic variations associated with specific traits of interest. As we better understand how these alternative transcription sites affect particular traits, the hope is to see this lead to better soybean varieties.”
About Purdue Agriculture
Purdue University’s College of Agriculture is one of the world’s leading colleges of agricultural, food, life and natural resource sciences. The college is committed to preparing students to make a difference in whatever careers they pursue; stretching the frontiers of science to discover solutions to some of our most pressing global, regional and local challenges; and, through Purdue Extension and other engagement programs, educating the people of Indiana, the nation and the world to improve their lives and livelihoods.
About Purdue University
Purdue University is a public research institution demonstrating excellence at scale. Ranked among top 10 public universities and with two colleges in the top four in the United States, Purdue discovers and disseminates knowledge with a quality and at a scale second to none. More than 105,000 students study at Purdue across modalities and locations, including nearly 50,000 in person on the West Lafayette campus. Committed to affordability and accessibility, Purdue’s main campus has frozen tuition 13 years in a row. See how Purdue never stops in the persistent pursuit of the next giant leap — including its first comprehensive urban campus in Indianapolis, the Mitch Daniels School of Business, Purdue Computes and the One Health initiative — at https://www.purdue.edu/president/strategic-initiatives
Writer: Lindsey Berebitsky
Journal: The Plant Cell
Subject of Research: Cells
Article Title: Noncanonical transcription initiation is primarily tissue specific and epigenetically tuned in paleopolyploid plants
Article Publication Date: 14-Nov-2024
COI Statement: The authors declare that they have no competing interests.