Scientific Understanding of Consciousness
Consciousness as an Emergent Property of Thalamocortical Activity

Protein-Truncating Variants tabulated by GTEx Project


Science 8 May 2015:  Vol. 348  no. 6235  pp. 666-669

Effect of predicted protein-truncating genetic variants on the human transcriptome

Manuel A. Rivas,

Wellcome Trust Centre for Human Genetics, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK.

Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.

Washington University in St. Louis, St. Louis, MO, USA.

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.

Department of Genetics, Stanford University, Stanford, CA, USA.

Department of Pathology, Stanford University, Stanford, CA, USA.

Biomedical Informatics Program, Stanford University, Stanford, CA, USA.

Department of Psychiatry, Mt. Sinai Hospital, NY, USA.

Department of Genetic Medicine and Development, University of Geneva, Geneva, Switzerland.

Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, Switzerland.

Swiss Institute of Bioinformatics, Geneva, Switzerland.

Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY, USA.

Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, UK.

Center for Genomic Regulation (CRG), Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain.

Department of Statistics, University of Oxford, Oxford, UK.

National Institute for Scientific Computing (LNCC), Petropolis, Rio de Janeiro, Brazil.

Oxford Center for Diabetes Endocrinology and Metabolism, University of Oxford, Oxford, UK.

New York Genome Center, New York, NY, USA.

Department of Systems Biology, Columbia University, New York, NY, USA.

Department of Medicine, Harvard Medical School, Boston, MA, USA.


Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.

Genetic variants predicted to shorten the coding sequence of genes, termed protein-truncating variants (PTVs), are typically expected to have large effects on gene function. These variants are enriched for disease-causing mutations, but some may be protective against disease. However, PTVs are abundant in the genomes of healthy individuals, indicating that they often do not have major phenotypic consequences. In addition, although PTVs are often described as loss-of-function (LOF) variants, in most cases their precise molecular effect has not been characterized, and in other cases they show gain-of-function effects. Clinical interpretation of PTVs will thus require direct characterization of their biochemical effects.

We cataloged predicted PTVs and their transcriptomic effect in 462 healthy individuals with DNA and mRNA sequencing (RNA-seq) from lymphoblastoid cell lines (LCLs) in the Geuvadis study and 173 individuals with exome sequencing and RNA-seq from a total of 1634 samples from multiple tissues in the Genotype-Tissue Expression (GTEx) study. Each GTEx individual has RNA-seq data from 1 to 30 tissues, with 9 tissues having >80 samples. We defined PTVs as single-nucleotide variants (SNVs) predicted to introduce a premature stop codon or to disrupt a splice site, small insertions or deletions (indels) predicted to disrupt a transcript’s reading frame, and larger deletions that remove the full protein coding sequence (CDS). We identified 13,182 candidate PTVs using phase 1 data of the 1000 Genomes Project for the 421 individuals included in the Geuvadis RNA-seq project, as well as 4584 candidate PTVs in the GTEx data, for a combined total of 16,286 candidate variants.
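Two of the PTV categories defined above (stop-gained SNVs and frameshift indels) can be illustrated with a minimal Python sketch. The function names are hypothetical and the logic is deliberately simplified: it looks at a single codon or a single ref/alt allele pair and ignores transcript structure, strand, and splice-site annotation, all of which a real annotation pipeline would have to handle. It is not the annotation procedure used in the study.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduces_premature_stop(ref_codon: str, alt_codon: str) -> bool:
    """Return True if a single-codon change turns a sense codon into a stop codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    return ref_codon not in STOP_CODONS and alt_codon in STOP_CODONS


def is_frameshift_indel(ref_allele: str, alt_allele: str) -> bool:
    """Return True if an indel changes the coding length by a non-multiple of 3.

    Assumes both alleles lie entirely within coding sequence.
    """
    length_change = len(alt_allele) - len(ref_allele)
    return length_change != 0 and length_change % 3 != 0


# Illustrative inputs only (not taken from the study).
print(introduces_premature_stop("CAG", "TAG"))  # glutamine codon changed to a stop codon
print(is_frameshift_indel("A", "AT"))           # 1-bp insertion shifts the reading frame
print(is_frameshift_indel("ACGT", "A"))         # 3-bp deletion keeps the reading frame
```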

We measured total gene expression levels in reads per kilobase of exon per million mapped reads, allele-specific expression (ASE), which detects different expression levels of the two haplotypes of an individual, and split mappings across annotated exon junctions to quantify splicing. Transcripts containing common PTVs are more weakly expressed and more tissue-specific than transcripts that do not contain common PTVs, consistent with previous work.
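The two read-count measures named above can be written out as a short Python sketch. The RPKM formula is the standard definition of the unit; the reference-allele ratio is one common way to summarize allelic counts and is an assumption here, not necessarily the exact statistic used in the study. Function names and the example numbers are hypothetical.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def ref_allele_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


# Illustrative numbers only: 500 reads on 2 kb of exon, 20 million mapped reads.
print(rpkm(read_count=500, exon_length_bp=2000, total_mapped_reads=20_000_000))
# 30 reference reads and 10 alternate reads at one heterozygous site.
print(ref_allele_ratio(ref_reads=30, alt_reads=10))
```

A ratio near 0.5 indicates balanced expression of the two haplotypes; a ratio well above 0.5 at a PTV site is the pattern expected if transcripts carrying the variant allele are being degraded.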

PTVs that generate premature stop codons may trigger nonsense-mediated decay (NMD). Such variants are often recessive and may protect against detrimental phenotypic effects but also may cause disease via haploinsufficiency. Variants that escape NMD may create a truncated protein with dominant-negative or gain-of-function effects.

Allelic count data were analyzed with a Bayesian statistical method to address whether a variant exhibits ASE in a given tissue and whether this signal is shared across multiple tissues of the same individual. We observe a higher proportion of strong or moderate allelic imbalance in rare and singleton nonsense SNVs compared with common nonsense variants (54.3%, 55.4%, and 35.7%, respectively), suggesting that rare PTVs are more likely to trigger NMD.
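The Bayesian multi-tissue model itself is not described in this summary. As a simpler stand-in, the sketch below runs an exact two-sided binomial test on the allelic counts from a single site in a single tissue, asking whether the reference and alternate read counts are compatible with equal expression of both alleles. This is a swapped-in frequentist technique for illustration, not the study's method, and the function name is hypothetical.

```python
from math import comb


def allelic_imbalance_p_value(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial test against a 50:50 allelic ratio."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    k = min(ref_reads, alt_reads)
    # Probability of seeing k or fewer reads of the rarer allele under Binomial(n, 0.5).
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * lower_tail)


# Illustrative counts only (not from the study).
print(allelic_imbalance_p_value(10, 10))  # balanced counts, no evidence of imbalance
print(allelic_imbalance_p_value(20, 0))   # all reads from one allele, strong imbalance
```

Unlike the approach described above, this test treats each tissue independently and cannot say whether a signal is shared across tissues of the same individual.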

We examined whether heterozygous carriers of PTVs exhibit compensatory up-regulation of the functional allele, which could contribute to tolerance of PTVs and partially explain the widespread haplosufficiency of human genes.

Disruption of splicing can result in changes in protein structure either via in-frame changes in exon structure or by introducing a premature stop codon. Splicing variant annotation tools typically focus only on the two bases at either end of a spliced intron, “essential splice sites,” despite the fact that more distant sites are also known to affect splicing.

In the Geuvadis data set, up to 79% of variants in the four essential splice-site loci cause splice disruptions (P < 0.01).
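The four essential splice-site positions of an intron (the two bases at each end) can be located with a small Python sketch. It assumes 1-based, inclusive intron coordinates on a single chromosome; the function names and coordinates are hypothetical and do not come from any particular annotation tool.

```python
def essential_splice_positions(intron_start: int, intron_end: int) -> set:
    """Return the four positions at the two ends of an intron (1-based, inclusive)."""
    if intron_end - intron_start + 1 < 4:
        raise ValueError("intron is too short to have four distinct splice-site bases")
    return {intron_start, intron_start + 1, intron_end - 1, intron_end}


def hits_essential_splice_site(variant_pos: int, intron_start: int, intron_end: int) -> bool:
    """Return True if a variant position falls on an essential splice-site base."""
    return variant_pos in essential_splice_positions(intron_start, intron_end)


# Illustrative intron spanning positions 1001 to 2000.
print(sorted(essential_splice_positions(1001, 2000)))
print(hits_essential_splice_site(1002, 1001, 2000))  # second base of the intron
print(hits_essential_splice_site(1005, 1001, 2000))  # fifth base, outside the essential sites
```

A check like this captures only the narrow definition that annotation tools typically use; variants at more distant positions, such as the fifth base in the example, would be missed even though they may also affect splicing.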

By drawing on data from a wide range of adult tissues across 635 individuals, we provide a systematic assessment of the effect of predicted PTVs on the human transcriptome. Furthermore, this study indicates that nonsense-mediated decay has heterogeneous effects across tissues and also shows how to better detect splice-disrupting variants outside the “essential” sites at the splice junction.

We find no evidence for widespread dosage compensation maintaining normal expression levels of genes affected by heterozygous PTVs. This, together with the fact that most human genes are haplosufficient, suggests that homeostatic mechanisms at the cellular level, possibly as proposed in the theory of dominance, maintain biological function in the face of heterozygous, or even homozygous, inactivation of human genes.

The resource made available with this study provides a starting point for cataloging variants affecting protein function, but larger data sets will be required to increase our power to predict molecular consequences of variants from sequence data alone. These results highlight the benefits of direct RNA sequencing of either patient tissue or genetically engineered cell lines for interpretation of genetic variation and suggest that personal transcriptomics will become an important complement to genome analysis.


