Scientific Understanding of Consciousness
Consciousness as an Emergent Property of Thalamocortical Activity

Database Links Regulatory DNA to its Target Genes


Science 8 May 2015: Vol. 348 no. 6235 pp. 618-619

New database links regulatory DNA to its target genes

Elizabeth Pennisi


Several major research consortia have delivered what amount to user's manuals for the genome, mapping the locations of thousands of regulatory genomic switches, the specific genes they control, and where in the body they are turned on or off.

The latest and arguably boldest of the “big biology” efforts has yielded preliminary results. By analyzing genetic material gleaned from more than 100 people who had died just hours before, the Genotype-Tissue Expression (GTEx) project portrays gene regulation in action, identifying the genes switched on or off by subtle changes in DNA within 2 million bases of any gene. By evaluating multiple tissues from each body, it also charts the reach of those regulatory sequences across cell types—some affect a gene in all tissues; others are influential in a few tissues or just one.

Earlier efforts took other approaches to mapping the genome's many switches. Two, called BLUEPRINT and the NIH Roadmap Epigenomics Project, chased down the locations of DNA and its associated proteins that are the targets of chemical epigenetic marks, which determine whether a gene can be activated. A third, the latest iteration of a 20-year effort called FANTOM (Functional ANnoTation Of the Mammalian genome), provides an extensive catalog of the beginnings of genes and of their control sequences.

Not everyone is persuaded that these massive data-gathering efforts offer much practical help to biologists. “I am not a fan of big science,” says Dan Graur, an evolutionary geneticist at the University of Houston in Texas. Simon Xi, a computational biologist in Cambridge, Massachusetts, who is using the GTEx data in his work on drug development, believes the databases are vital but says they need more user-friendly ways to integrate all those data.

The new work aims to address an ongoing source of frustration among disease researchers. A decade ago, geneticists set out to link specific DNA sequences to common diseases. In so-called genome-wide association studies (GWAS), massive consortia pooled data from tens of thousands of patients and came up with thousands of subtle genetic changes, or single nucleotide polymorphisms (SNPs), which appeared to increase the risk of inflammatory bowel disease, schizophrenia, autism, and a whole host of other common disorders. Many of these changes occurred outside genes, suggesting that researchers needed a better understanding of regulatory variation.

FANTOM5, a $100 million effort led by the RIKEN institute in Japan, has provided part of the answer by mapping two kinds of regulatory sequences in the genome: “promoters,” which help kick off transcription and are located at the start of a gene, and “enhancers,” regulatory DNA that can be far from the genes it acts on. FANTOM5 surveyed RNA in every major human organ, hundreds of cancer cell lines, more than 200 purified primary cell types, and cells at various stages of differentiation.

The $300 million NIH Roadmap Epigenomics project took a different approach to identifying enhancers. It mapped the epigenetic changes that are associated with enhancers. For each cell type studied, assays of methylation marks and other changes in the DNA-protein chromatin matrix helped pinpoint enhancers. Based on the enhancers' sequences, investigators were also able to identify the proteins that help enhancers turn on genes for various embryonic and adult tissues and cell types, including immune, brain, heart, muscle, gut, fat, and skin cells.

The European Union's €30 million BLUEPRINT project took an even deeper look into epigenomes, focusing on white and red blood cells. It determined the epigenomes of the primary blood stem cells and of those cells at various stages in their differentiation into mature white or red cells. Among other goals, BLUEPRINT is looking for differences between these cellular epigenomes in healthy individuals and people with leukemia, whose blood cells proliferate uncontrollably.

Once a GWAS identifies a SNP, data from Roadmap, BLUEPRINT, or FANTOM can provide further evidence that it might influence health by showing whether the variation falls in a regulatory region. GTEx pins down how genetic variation, particularly in noncoding DNA, affects a gene's activity across different parts of the body.
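The first step described above, checking whether a variant falls in a regulatory region, amounts to an interval lookup. The following is only a minimal sketch of that idea, not the method used by any of the consortia: the `Region` type, the `find_regulatory_region` function, and all coordinates are hypothetical and invented for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    """A hypothetical annotated regulatory region (illustrative only)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # e.g. "promoter" or "enhancer"


def find_regulatory_region(
    chrom: str, pos: int, regions: Sequence[Region]
) -> Optional[Region]:
    """Return the first region that contains the position, or None."""
    for region in regions:
        if region.chrom == chrom and region.start <= pos < region.end:
            return region
    return None


# Made-up annotations; real catalogs contain many thousands of entries.
regions = [
    Region("chr1", 1000, 1500, "promoter"),
    Region("chr1", 8000, 8600, "enhancer"),
]

hit = find_regulatory_region("chr1", 8250, regions)   # inside the enhancer
miss = find_regulatory_region("chr1", 5000, regions)  # outside both regions
```

A linear scan is enough for a toy example; with genome-scale annotation sets, an indexed structure such as an interval tree would be the more realistic choice.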

Because the researchers needed multiple tissue samples from internal organs—too many to collect from living people—they turned to recently deceased people whose kin donated their bodies for research. The ultimate goal of the $100 million NIH-funded project is to collect and analyze about 25,000 tissues from 900 individuals; the data published so far include RNA from up to 43 tissue sites from 175 people.

Researchers who are tracking down drug targets for depression, schizophrenia, and Alzheimer's and Parkinson's diseases are already turning to GTEx data to follow up on SNPs previously implicated in those brain disorders. The data enable researchers to check whether DNA sequences implicated by GWAS are active only in the brain. That could make the sequences especially promising drug targets, reducing the risk of broad side effects.
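The brain-only check described above can be pictured as a simple set comparison over per-tissue activity. This is a rough sketch under stated assumptions, not the GTEx analysis itself: the function names, the activity threshold, the tissue labels, and the numbers are all hypothetical.

```python
from typing import Dict, Iterable, Set


def active_tissues(expression: Dict[str, float], threshold: float = 1.0) -> Set[str]:
    """Return the tissues whose activity level meets the (assumed) threshold."""
    return {tissue for tissue, level in expression.items() if level >= threshold}


def is_brain_specific(
    expression: Dict[str, float],
    brain_tissues: Iterable[str],
    threshold: float = 1.0,
) -> bool:
    """True if the sequence is active somewhere, and only in brain tissues."""
    active = active_tissues(expression, threshold)
    return bool(active) and active <= set(brain_tissues)


# Hypothetical tissue labels and activity values, for illustration only.
brain = {"cortex", "hippocampus"}
brain_only = {"cortex": 5.2, "hippocampus": 3.1, "liver": 0.1, "heart": 0.0}
widespread = {"cortex": 4.0, "hippocampus": 2.5, "liver": 6.3, "heart": 1.8}

brain_only_result = is_brain_specific(brain_only, brain)
widespread_result = is_brain_specific(widespread, brain)
```

Here `brain_only_result` is true and `widespread_result` is false, because the second profile is also active in liver and heart. A real analysis would replace the fixed threshold with a statistical test.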

[end of paraphrase]

