Carnegie Mellon University
November 13, 2019

Research Sheds Light on the Regulation of Gene Expression

By Ben Panko

New research led by Carnegie Mellon University Associate Professor of Biological Sciences Joel McManus reveals how a group of poorly understood nucleotide sequences in our DNA have a major impact on gene expression.

In our cells, DNA acts as a set of instructions for the production of the protein molecules necessary for life. On its own, however, an instruction manual isn't very helpful. That's where RNA comes in, especially messenger RNA (mRNA), which copies and conveys the information encoded in DNA to the cell's ribosomes. In a process known as translation, these ribosomes scan the copied sequences and build corresponding proteins out of 20 different amino acids.

For most of those 20 amino acids, there are multiple combinations of nucleotides (the familiar letters of A, T, C and G) that can direct ribosomes to add a particular amino acid to a growing protein. "There's redundancy in the genetic code," McManus said. "It turns out this affects the speed of translation."
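The redundancy McManus describes can be seen by grouping the 64 possible three-letter codons by the amino acid they encode. The short Python sketch below is an illustration only and is not part of the study; it uses the standard genetic code written in the DNA alphabet used above, and the names `CODON_TABLE` and `synonymous_codons` are made up for this example.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, written with DNA letters (T instead of U).
# Codons are enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(table):
    """Group codons by the amino acid they encode, skipping stop codons."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        if aa != "*":
            groups[aa].append(codon)
    return dict(groups)


if __name__ == "__main__":
    groups = synonymous_codons(CODON_TABLE)
    # List amino acids from most to least redundant.
    for aa, codons in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(aa, len(codons), " ".join(codons))
```

In the standard code, 61 codons specify the 20 amino acids and three signal a stop. Leucine, serine and arginine each have six synonymous codons, while methionine and tryptophan have only one apiece.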

This is because when the ribosome is building a protein, it has to wait for the right transfer RNA (tRNA) molecule to bring the correct amino acid over to link into the chain. Some of these three-nucleotide sequences, known as codons, can take longer to translate than others. This has led biologists to identify "fast" and "slow" codons, and the presence of these can impact how much of a certain protein a cell makes: a cell where the ribosomes work more slowly will end up making less of that protein.

The McManus lab focuses on upstream open reading frames (uORFs), short sequences of nucleotides located right before the point where ribosomes start translating the mRNA, and on how those uORFs can repress or amplify the effects of fast and slow codons farther down the mRNA chain, thus affecting the amount of protein produced.

"Up to 50 percent of mammalian genes have at least one of these uORFs," McManus said. "Despite being so common, however, scientists don't fully understand how they work."

In a study published in Nucleic Acids Research, McManus and his team set out to answer that question by designing 4,096 different uORFs in yeast and comparing the amount of protein each cell variant made over time.

"Until our study, most people didn't think these coding sequences in uORFs even mattered," McManus said when it came to gene expression. "The focus had been only on the amount of uORF translation."

On the contrary, McManus' work found that the presence of slow or fast codons within the translated uORFs can have repressive or enhancing effects, respectively, on the translation process downstream. The position of the uORFs within the untranslated region of the mRNA can also affect protein production.

"It's not understood what makes some of these uORFs suppress expression and some enhance expression," McManus noted. "But our work clearly shows that slow and fast codons are important."

Those effects could contribute to major differences among the cells in our bodies. Since the amounts of tRNA found in different kinds of human cells vary greatly, so could the effects of the uORFs on translation speed and protein production. "The same sequence in a uORF could be an enhancer in the brain and a repressor in the heart," McManus said.

Similar differences in enhancement versus repression could also be found when cells are stressed, McManus said, and during the development of cancers. In the long term, McManus plans to start looking at uORFs in human cells to figure out how they work under different conditions, and hopefully build a database of "rules" about how they work.

"We're excited to look into this more in the future," McManus said.

Study authors include Gemma E. May, Hunter Kready and Lauren Nazzaro of the Carnegie Mellon Department of Biological Sciences; Yehuda Creeger of the Carnegie Mellon Molecular Biosensor and Imaging Center; Yizhu Lin of the University of California, San Francisco; Peter Spealman of the Center for Genomics and Systems Biology; and Mao Mao of Roche Sequencing Solutions.

The study was funded by the National Institutes of Health (R01 GM121895).