The ‘Dark Genome’: Key to Our Survival for 75 Million Years Since the Mouse & Human Lineage Diverged

The importance of what scientists are calling the “dark genome” became apparent in 2001, when the human genome was first published. Scientists expected to find as many as 100,000 genes packed into the 3 billion bases of human DNA; they were startled to learn that there were fewer than 35,000. (The current count is 21,000.) Protein-coding regions accounted for just 1.5% of the genome.

Could the rest of our DNA really just be junk? In fact, gene regulation has turned out to be a surprisingly complex process governed by various types of regulatory DNA, which may lie deep in the wilderness of supposed “junk.” Far from being humble messengers, RNAs of all shapes and sizes are actually powerful players in how genomes operate. There has also been increasing recognition of the widespread role of chemical alterations called epigenetic factors that can influence the genome across generations without changing the DNA sequence itself.

The deciphering of the mouse genome in 2002 showed that there was an untold story. Mice and people turned out to share not only many genes but also vast stretches of noncoding DNA. To have been “conserved” throughout the 75 million years since the mouse and human lineages diverged, those regions were likely to be crucial to the organisms' survival.

Edward Rubin and Len Pennacchio of the Joint Genome Institute in Walnut Creek, California, and colleagues figured out that some of this conserved DNA helps regulate genes, sometimes from afar, by testing it for function in transgenic mouse embryos. Studies by the group and others suggested that noncoding regions were littered with much more regulatory DNA than expected.

Further evidence that noncoding DNA is vital has come from studies of genetic risk factors for disease. In large-scale searches for single-base differences between diseased and healthy individuals, about 40% of the disease-related differences show up outside of genes.

Genetic dark matter also surfaced when scientists surveyed exactly which DNA was being transcribed, or decoded, into RNA. Scientists thought that most RNA in a cell was messenger RNA generated by protein-coding genes, RNA in ribosomes, or a sprinkling of other RNA elsewhere. But surveys by Thomas Gingeras, now at Cold Spring Harbor Laboratory in New York, and Michael Snyder, now at Stanford University in Palo Alto, California, found a lot more RNA than expected, as did an analysis of mouse RNA by Yoshihide Hayashizaki of the RIKEN Omics Science Center in Japan and colleagues.

Validation soon came from Ewan Birney of the European Bioinformatics Institute and the Encyclopedia of DNA Elements (ENCODE) project, which aims to determine the function of every base in the genome. The 2007 pilot results were eye-opening: Chromosomes harbored many previously unsuspected sites where various proteins bound, possible hotbeds of gene regulation or epigenetic effects. Strikingly, about 80% of the cell's DNA showed signs of being transcribed into RNA. But what the RNA was doing was unclear.

Other studies revealed that RNA plays a major role in gene regulation and other cellular functions. The story started to unfold in the late 1990s, when plant researchers and nematode biologists learned to use small RNA molecules to shut down genes. Called RNA interference (RNAi), the technique has become a standard way to control gene activity in a variety of species, earning a Nobel Prize in 2006.

To understand RNAi and RNA in general, researchers began isolating and studying RNA molecules just 21 to 30 bases long. It turned out that such “small RNAs” can interfere with messenger RNA, destabilizing it. Four papers in 2002 showed that small RNAs also affect chromatin, the complex of proteins and DNA that makes up chromosomes, in ways that might further control gene activity. In one study, yeast missing certain small RNAs failed to divide properly. Other studies have linked these tiny pieces of RNA to cancer and to development.

In 2007, a group led by Howard Chang of Stanford and John Rinn, now at Beth Israel Deaconess Medical Center in Boston, pinned down a gene-regulating function by so-called large intervening noncoding RNAs. Rinn and colleagues later determined that the genome contained about 1600 of these lincRNAs. They and other researchers think this type of RNA will prove as important as protein-coding genes in cell function.

Uncharted regions of the genome's dark matter are still under study. But the vital importance of "junk" RNA is now crystal clear.

To initiate many important functions, bacteria sometimes depend entirely upon ancient forms of RNA, once viewed simply as the chemical intermediary between DNA's instruction manual and the creation of proteins, said Ronald Breaker, the Henry Ford II Professor of Molecular, Cellular and Developmental Biology at Yale.

Proteins carry out almost all of life's cellular functions today, but many scientists like Breaker believe this was not always the case and have found many examples in which RNA plays a surprisingly large role in regulating cellular activity. In bacteria, at least, proteins are not always necessary to spur a host of fundamental cellular changes, a process Breaker believes was common on Earth some 4 billion years ago, well before DNA existed.

"How could RNA trigger changes in ancient cells without all the proteins present in modern cells? Well, in this case, no proteins, no problem," said Breaker, who is also a Howard Hughes Medical Institute investigator.

Breaker's lab solved a decades-old mystery by describing how tiny circular RNA molecules called cyclic di-GMP are able to turn genes on and off. This process determines whether the bacterium swims or stays stationary, and whether it remains solitary or joins with other bacteria to form organic masses called biofilms. For example, in Vibrio cholerae, the bacterium that causes cholera, cyclic di-GMP turns off production of a protein the bacterium needs to attach to human intestines.

The tiny RNA molecule, composed of only two nucleotides, activates a larger RNA structure called a riboswitch. Breaker's lab discovered riboswitches in bacteria six years ago and has since shown that they can regulate a surprising amount of biological activity. Riboswitches, located within single strands of messenger RNA that transmit a copy of DNA's genetic instructions, can independently "decide" which genes in the cell to activate, an ability once thought to rest exclusively with proteins.

Breaker had chemically created riboswitches in his own lab and, given their efficiency at regulating gene expression, predicted such RNA structures would be found in nature. Since 2002, almost 20 classes of riboswitches have been discovered, mostly hidden in non-gene-coding regions of DNA.

"We predicted that there would be an ancient 'RNA city' out there in the jungle, and we went out and found it," Breaker said.

Bacterial use of RNA to trigger major changes without the involvement of proteins resolves one of the questions about the origin of life: If proteins are needed to carry out life's functions and DNA is needed to make proteins, how did DNA arise?

The answer is what Breaker and other researchers call the RNA World. They believe that billions of years ago, the single strands of nucleotides that make up RNA were the first forms of life and carried out some of the complicated cellular functions now done by proteins. The riboswitches are highly conserved in bacteria, illustrating their importance and ancient ancestry, Breaker said.

Understanding how these RNA mechanisms work could lead to medical treatments as well, Breaker noted. For instance, a molecule that mimics cyclic di-GMP could be used to disable or disarm bacterial infections such as cholera, he said.

Casey Kazan via and Yale University