Scientific Understanding of Consciousness
Human Transcriptome across Tissues and Individuals
Science 8 May 2015: Vol. 348 no. 6235 pp. 660-665
The human transcriptome across tissues and individuals
Marta Melé et al.
Center for Genomic Regulation (CRG), Barcelona, Catalonia, Spain.
Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA.
Department of Genetic Medicine and Development, University of Geneva, Geneva, Switzerland.
Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, Switzerland.
Swiss Institute of Bioinformatics, Geneva, Switzerland.
Facultat de Biologia, Universitat de Barcelona (UB), Barcelona, Catalonia, Spain.
Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain.
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
McGill University, Montreal, Canada.
National Institute for Scientific Computing (LNCC), Petropolis, Rio de Janeiro, Brazil.
Radboud University, Nijmegen, Netherlands.
Faculty of Bioengineering and Bioinformatics, Moscow State University, Leninskie Gory 1-73, 119992 Moscow, Russia.
North Carolina State University, Raleigh, NC, USA.
New York Genome Center, New York, NY, USA.
Department of Systems Biology, Columbia University, New York, NY, USA.
Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA.
Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Catalonia, Spain.
Joint CRG-Barcelona Super Computing Center (BSC)–Institut de Recerca Biomedica (IRB) Program in Computational Biology, Barcelona, Catalonia, Spain.
Transcriptional regulation and posttranscriptional processing underlie many cellular and organismal phenotypes. We used RNA sequence data generated by the Genotype-Tissue Expression (GTEx) project to investigate the patterns of transcriptome variation across individuals and tissues. Tissues exhibit characteristic transcriptional signatures that show stability in postmortem samples. These signatures are dominated by a relatively small number of genes (most clearly seen in blood), though few genes are exclusive to a particular tissue, and expression varies more across tissues than across individuals. Genes exhibiting high interindividual expression variation include disease candidates associated with sex, ethnicity, and age. Primary transcription is the major driver of cellular specificity, with splicing playing a mostly complementary role, except in the brain, which exhibits a more divergent splicing program. In contrast, variation in splicing, despite its stochasticity, may play a comparatively greater role in defining individual phenotypes.
The Genotype-Tissue Expression (GTEx) project is building a resource for studying gene expression across tissues and individuals, collecting multiple “nondiseased” tissues sampled from recently deceased human donors. We analyzed the GTEx pilot data freeze, which comprised RNA sequencing (RNA-seq) from 1641 samples from 175 individuals representing 43 sites: 29 solid organ tissues, 11 brain subregions, whole blood, and two cell lines, Epstein-Barr virus–transformed lymphocytes (LCL) and cultured fibroblasts from skin.
Brain subregions are not well differentiated, with the exception of cerebellum. Postmortem ischemia appears to have little impact on the characteristic tissue transcriptional signatures, as previously noted. In a comparison of 798 GTEx samples with 609 “nondiseased” samples obtained from living (surgical) donors, we found that GTEx samples clustered with surgical samples of the same tissue type.
Tissue transcription is generally dominated by the expression of a relatively small number of genes. Indeed, we found that for most tissues, about 50% of the transcription is accounted for by a few hundred genes. In many tissues, the bulk of transcription is of mitochondrial origin. In kidney, for instance, a highly aerobic tissue with many mitochondria, a median of 51% (>65% in some samples) of the transcriptional output is from the mitochondria. Other tissues show nuclear-dominated expression; in blood, for example, three hemoglobin genes contribute more than 60% to total transcription. Genes related to lipid metabolism in pancreas, actin in muscle, and thyroglobulin in thyroid are other examples of nuclear genes contributing disproportionately to tissue-specific transcription. Because RNA samples are generally sequenced to the same depth, in tissues where a few genes dominate expression, comparatively fewer RNA-seq reads are available to estimate the expression of the remaining genes, decreasing the power to estimate expression variation. These tissues (blood, muscle, and heart) are, consequently, those with less power to detect expression quantitative trait loci (eQTLs). Because most eQTL analyses are performed on easily accessible samples, such as blood, this highlights the relevance of the GTEx multitissue approach.
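The dominance argument above can be made concrete with a small sketch. It counts how many top-expressed genes are needed to reach a given share of total expression, and how many reads remain for all other genes at a fixed sequencing depth. The function names, gene labels, expression values, and the depth of 50 million reads are illustrative assumptions, not values or methods taken from the paper.

```python
def genes_for_fraction(expression, fraction=0.5):
    """Smallest number of top-expressed genes whose summed expression
    reaches `fraction` of the sample total."""
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("total expression must be positive")
    running = 0.0
    ranked = sorted(expression.values(), reverse=True)
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= fraction * total:
            return count
    return len(expression)


def reads_left_for_others(depth, dominant_share):
    """Reads available to all non-dominant genes at a fixed depth."""
    return round(depth * (1.0 - dominant_share))


# Made-up expression values in arbitrary units (hypothetical, not GTEx data).
toy_blood = {
    "hemoglobin_1": 400,
    "hemoglobin_2": 150,
    "hemoglobin_3": 100,
    "gene_a": 200,
    "gene_b": 100,
    "gene_c": 50,
}

n_genes = genes_for_fraction(toy_blood, 0.5)
# Assumed depth of 50 million reads with 60% taken by dominant genes.
remaining = reads_left_for_others(50_000_000, 0.60)
print(n_genes, remaining)
```

In this toy sample two genes already cover half of the total, and a 60% dominant share leaves 20 million of the assumed 50 million reads for every other gene, which is the sense in which power is reduced.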
Although thousands of genes are differentially expressed between tissues or show tissue-preferential expression, fewer than 200 genes are expressed exclusively in a given tissue. The vast majority (~95%) of these are exclusive to testis, and many are long noncoding RNAs (lncRNAs). The scarcity of tissue-exclusive genes may reflect low-level basal transcription common to all cell types or result from general tissue heterogeneity, with few primary cell types being specific to a given tissue.
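One way to picture "expressed exclusively in a given tissue" is a simple threshold rule: a gene counts as exclusive if it passes an expression cutoff in exactly one tissue. The sketch below assumes such a rule; the cutoff of 1.0, the function name, and the toy values are hypothetical, and the excerpt does not state the criterion the authors actually used.

```python
def tissue_exclusive_genes(expression, threshold=1.0):
    """Map each gene expressed (>= threshold) in exactly one tissue
    to that tissue. `expression` is {gene: {tissue: value}}."""
    exclusive = {}
    for gene, by_tissue in expression.items():
        expressed_in = [t for t, v in by_tissue.items() if v >= threshold]
        if len(expressed_in) == 1:
            exclusive[gene] = expressed_in[0]
    return exclusive


# Made-up expression values (hypothetical, not GTEx data).
toy = {
    "gene_1": {"testis": 12.0, "liver": 0.0, "blood": 0.1},  # testis only
    "gene_2": {"testis": 8.0, "liver": 5.0, "blood": 0.0},   # two tissues
    "gene_3": {"testis": 0.2, "liver": 0.0, "blood": 0.3},   # below cutoff
}

exclusive = tissue_exclusive_genes(toy)
print(exclusive)
```

Note how sensitive the result is to the cutoff: low-level basal transcription just above the threshold in a second tissue is enough to remove a gene from the exclusive set.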
Expression of repetitive elements also recapitulates tissue type. We identified 3046 protein-coding genes (PCGs) whose expression, in at least one tissue, was correlated with the expression of the closest repeat element (on average 2827 base pairs away). In about half of these cases, the repeat was also significantly coexpressed with other repeats of the same family. LncRNA expression can be regulated by specific repeat families, and we found evidence that testis-specific expression could be regulated by endogenous retroviruses.
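As an illustration of what correlating a gene with its closest repeat element involves, the sketch below computes a Pearson correlation across samples of one tissue. The choice of Pearson correlation is an assumption made for the example (the excerpt does not say which measure was used), and the expression vectors are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample expression of a gene and its nearest repeat.
gene_expr = [1.0, 2.0, 3.0, 4.0]
repeat_expr = [2.0, 4.0, 6.0, 8.0]

r = pearson(gene_expr, repeat_expr)
print(r)
```

In practice such a test would be repeated per gene and per tissue, so the resulting P values would need multiple-testing correction before any gene is called correlated.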
Finally, we detected 1993 genes that globally change expression with age (FDR < 0.05). Genes that decrease expression are enriched in functions and pathways related to neurodegenerative diseases such as Parkinson’s and Alzheimer’s diseases, among which eight harbor single-nucleotide polymorphisms (SNPs) for these diseases identified from genome-wide association studies (P < 0.05). Among the genes that increase expression with age is EDA2R, whose ligand, EDA, has been associated with age-related phenotypes.
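The threshold "FDR < 0.05" refers to controlling the false discovery rate across many per-gene tests. A common way to do this is the Benjamini-Hochberg procedure, sketched below; whether the authors used this particular procedure is an assumption, and the P values are made up.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg procedure at level `alpha`."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene P values for an age association test.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
flags = benjamini_hochberg(toy_p)
print(flags)
```

Here only the two smallest P values pass, even though four of the five are below 0.05 individually, which shows why a per-test cutoff and an FDR cutoff are not interchangeable.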