Scientific Understanding of Consciousness
Genetic Variants and Functional Effects in Human Traits
Nature 501, 506–511 (26 September 2013)
Transcriptome and genome sequencing uncovers functional variation in humans
Tuuli Lappalainen et al.
Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
Institute for Genetics and Genomics in Geneva (iG3), University of Geneva, 1211 Geneva, Switzerland
Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
Centro Nacional de Análisis Genómico, 08028 Barcelona, Catalonia, Spain
Centre for Genomic Regulation (CRG), 08003 Barcelona, Catalonia, Spain
Pompeu Fabra University (UPF), 08003 Barcelona, Catalonia, Spain
CRG Hospital del Mar Research Institute, 08003 Barcelona, Catalonia, Spain
CRG CIBERESP, 08003 Barcelona, Catalonia, Spain
Department of Human Genetics, Leiden University Medical Center, 2300 RC Leiden, the Netherlands
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, D-24105 Kiel, Germany
Institute of Human Genetics, Helmholtz Zentrum München, 85764 Neuherberg, Germany
Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
Leiden Genome Technology Center, 2300 RC Leiden, the Netherlands
Oxford Centre for Diabetes Endocrinology and Metabolism, University of Oxford, Oxford OX3 7BN, UK
Institute of Human Genetics, Technische Universität München, 81675 Munich, Germany
Dahlem Centre for Genome Research and Medical Systems Biology, 14195 Berlin, Germany
Fundacion Publica Galega de Medicina Xenomica (SERGAS), Genomic Medicine Group, CIBERER, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Deutsches Forschungszentrum für Herz-Kreislauferkrankungen (DZHK), Partner Site Munich Heart Alliance, 81675 Munich, Germany
Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project—the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
Interpreting functional consequences of millions of discovered genetic variants is one of the biggest challenges in human genomics. Although genome-wide association studies (GWAS) have linked genetic loci to various human phenotypes and the functional annotation of the genome is improving, we still have a limited understanding of the underlying causal variants and biological mechanisms. One approach to addressing this challenge has been to analyse variants affecting cellular phenotypes, such as gene expression, known to affect many human diseases and traits.
In this study, we characterize functional variation in human genomes by RNA-sequencing hundreds of samples from the 1000 Genomes Project (ref. 1), the most important reference data set of human genetic variation, thus creating the biggest RNA sequencing data set of multiple human populations so far. We not only catalogue novel loci with regulatory variation, but also, for the first time, discover and characterize molecular properties of causal functional variants.
We performed mRNA and small RNA sequencing on lymphoblastoid cell line samples from five populations: the CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI) and Yoruba (YRI). After quality control, we had 462 and 452 individuals (89–95 per population) with mRNA and miRNA data, respectively. Of these, 421 are in the 1000 Genomes Phase 1 data set (ref. 1), and the remainder were imputed from single nucleotide polymorphism (SNP) array data. High-throughput RNA sequencing (RNA-seq) was performed in seven laboratories, and the smaller amount of variation between laboratories than between individuals demonstrated that RNA sequencing is a mature technology ready for distributed data production (Mann-Whitney P < 2.2 × 10^−6 for mRNA, P = 1.34 × 10^−10 for miRNA). To discover genetic regulatory variants, we mapped cis-quantitative trait loci (QTLs) to transcriptome traits of protein-coding and miRNA genes separately in the European (EUR) and Yoruba (YRI) populations. The RNA-seq read, quantification, genotype and QTL data are available open-access.
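As a rough illustration of what mapping a cis-QTL involves, the sketch below regresses one gene's expression on the allele dosage (0, 1 or 2 copies of the alternative allele) of one nearby variant. It is a minimal example under stated assumptions, not the pipeline used in the study: the function name `cis_qtl_slope` and the toy data are hypothetical, and the normalization, covariate correction and multiple-testing control that a real analysis needs are omitted.

```python
from math import sqrt


def cis_qtl_slope(dosages, expression):
    """Least-squares slope and Pearson r of expression on allele dosage.

    dosages:    per-individual allele counts (0, 1 or 2) for one variant
    expression: per-individual expression values for one nearby gene
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least 3 samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic variant or constant expression")
    slope = sxy / sxx            # expression change per extra allele copy
    r = sxy / sqrt(sxx * syy)    # strength of the linear association
    return slope, r


# Hypothetical toy data: six individuals, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, r = cis_qtl_slope(dosages, expression)
print(f"slope per allele = {slope:.3f}, r = {r:.3f}")
```

Running the same test separately within the EUR and YRI samples, as the text describes, would simply mean calling the function on each population's individuals in turn.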
Transcriptome Variation in Populations
This first uniformly processed RNA-seq data set from multiple human populations allowed high-resolution analysis of transcriptome variation. Individual and population differences in transcripts can manifest in (1) overall expression levels, and (2) relative abundance of transcripts from the same gene (transcript ratios). Deconvolution of the relative contribution of these indicates that this ratio is characteristic for each gene, with transcript ratio being on average more dominant. Population differences explain a small but significant proportion (3%) of the total variation (Mann-Whitney P < 2.2 × 10^−16). In addition to this genome-wide perspective on population variation, we identified 263–4,379 genes with differential expression and/or transcript ratios between population pairs. Notably, continental differences between YRI–EUR population pairs have a much higher contribution of genes with different transcript usage than European population pairs (75–85% versus 6–40%). This has not been observed before in humans, but it is consistent with splicing patterns capturing phylogenetic differences between species better than expression levels.
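The distinction between overall expression level and transcript ratio can be made concrete with a small sketch. It assumes per-transcript abundance estimates are already available for a gene. The function name `gene_expression_and_ratios`, the transcript labels and the numbers are hypothetical and are not taken from the study.

```python
def gene_expression_and_ratios(transcript_abundance):
    """Split per-transcript abundances into a gene-level total and ratios.

    transcript_abundance: dict mapping transcript ID -> abundance estimate
    Returns (total expression, dict of each transcript's share of the total).
    """
    total = sum(transcript_abundance.values())
    if total == 0:
        # Gene not expressed: ratios are undefined, reported here as zeros.
        return 0.0, {tx: 0.0 for tx in transcript_abundance}
    ratios = {tx: value / total for tx, value in transcript_abundance.items()}
    return total, ratios


# Two hypothetical individuals with the same gene-level expression
# but opposite usage of the gene's two transcripts.
sample_a = {"tx1": 30.0, "tx2": 10.0}
sample_b = {"tx1": 10.0, "tx2": 30.0}

total_a, ratios_a = gene_expression_and_ratios(sample_a)
total_b, ratios_b = gene_expression_and_ratios(sample_b)
print(total_a, ratios_a)
print(total_b, ratios_b)
```

In this toy case the two individuals would look identical in an expression-level comparison and different in a transcript-ratio comparison, which is why the two kinds of variation are analysed separately.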
We quantify a total of 644 autosomal miRNAs in >50% of individuals, of which 60 have significant cis-eQTLs for miRNA expression levels (cis-mirQTLs), showing that genetic effects on miRNA expression are much more widespread than the previously identified loci. To complement previous studies of miRNA function in cell perturbation experiments, we analysed miRNA–mRNA interactions in our steady-state population sample. Of 100 miRNA families, 32 correlated with the expression of predicted target exons in a highly connected network (P < 0.001), including miRNA families with important immunological or lymphocyte functions, such as miR-150, miR-155, miR-181 and miR-146. Interestingly, 45% of the associations were positive, consistent with previous results, even though perturbation experiments indicate that miRNAs mostly downregulate genes. Analysing the direction of causality, cis-mirQTLs had small trans-eQTL effects on predicted targets only when effects were negative (π1 = 1 − Storey’s π0 = 0.11 versus π1 = 0), suggesting that miRNAs indeed downregulate their targets. Positive correlations may be driven by other effects, which is supported by overrepresentation of transcription factors in the network (29%, Fisher P = 2.1 × 10^−7 for negative targets and 26%, P = 4.0 × 10^−4 for positive targets). This suggests feedback loops of both mRNA and miRNA genes affecting the expression of each other, and supports the idea that under steady-state conditions, miRNAs confer robustness to expression programs. Altogether, these results highlight the added insight into the role of miRNAs in regulatory networks from analysis of population variation.
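The statistic built on Storey's estimate is conventionally written π1 = 1 − π0, where π0 is the estimated fraction of tests with no true effect. The sketch below shows the simplest fixed-threshold form of that estimate. The function name `storey_pi1`, the threshold of 0.5 and the p-values are assumptions for illustration only. Published implementations usually smooth over a range of thresholds, and the text does not state which variant the study used.

```python
def storey_pi1(p_values, lam=0.5):
    """Estimate pi1 = 1 - pi0 from a list of p-values (fixed-lambda Storey).

    Null p-values are uniform on [0, 1], so the density of p-values above
    `lam` estimates the null fraction: pi0 = #{p > lam} / (m * (1 - lam)).
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(p_values)
    pi0 = sum(1 for p in p_values if p > lam) / (m * (1.0 - lam))
    pi0 = min(1.0, pi0)  # the raw estimate can exceed 1 by chance
    return 1.0 - pi0


# Hypothetical p-values: a few small ones on top of a roughly flat background.
enriched = [0.001, 0.004, 0.01, 0.2, 0.35, 0.45, 0.55, 0.65, 0.8, 0.95]
# Hypothetical p-values with no excess of small values.
flat = [0.6, 0.7, 0.8, 0.9]

print(round(storey_pi1(enriched), 3))
print(round(storey_pi1(flat), 3))
```

A π1 near zero, as in the second call, corresponds to the "π1 = 0" case in the text: no detectable excess of true associations among the tests.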
[end of paraphrase]