Translation of neutrally evolving peptides provides a basis for de novo gene evolution

Abstract

Accumulating evidence indicates that some protein-coding genes have originated de novo from previously non-coding genomic sequences. However, the processes underlying de novo gene birth are still enigmatic. In particular, the appearance of a new functional protein seems highly improbable unless there is already a pool of neutrally evolving peptides that are translated at significant levels and that can at some point acquire new functions. Here, we use deep ribosome-profiling sequencing data, together with proteomics and single nucleotide polymorphism information, to search for these peptides. We find hundreds of open reading frames that are translated and that show no evolutionary conservation or selective constraints. These data suggest that the translation of these neutrally evolving peptides may be facilitated by the chance occurrence of open reading frames with a favourable codon composition. We conclude that the pervasive translation of the transcriptome provides plenty of material for the evolution of new functional proteins.
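As a rough illustration of the kind of analysis the abstract describes, the following minimal Python sketch enumerates open reading frames in a transcript and measures how many ribosome-profiling read positions fall in the ORF's reading frame (the three-nucleotide periodicity signal). It is not the authors' pipeline, which this preview does not describe. The function names, the `min_codons` threshold, and the assumption that reads have already been reduced to P-site coordinates on the transcript are all hypothetical choices made for the example.

```python
from collections import Counter

# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=10):
    """Return sorted (start, end) pairs for ATG..stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` is an
    arbitrary illustrative threshold (counted including the stop codon),
    not a value taken from the article.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def in_frame_fraction(orf_start, orf_end, p_site_positions):
    """Fraction of P-site positions inside the ORF that lie in frame 0.

    Assumes each read has already been mapped to a single P-site
    coordinate on the transcript (a hypothetical preprocessing step).
    """
    inside = [p for p in p_site_positions if orf_start <= p < orf_end]
    if not inside:
        return 0.0
    frames = Counter((p - orf_start) % 3 for p in inside)
    return frames[0] / len(inside)


if __name__ == "__main__":
    transcript = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
    for start, end in find_orfs(transcript, min_codons=3):
        frac = in_frame_fraction(start, end, [2, 5, 8, 9, 30])
        print(start, end, frac)
```

An in-frame fraction well above one third is the sort of signal that suggests active translation of the ORF. Any real analysis would also need a statistical test and careful read-offset calibration, both of which this sketch leaves out.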

Fig. 1: Detection of translated ORFs.
Fig. 2: Identification of selection signatures.
Fig. 3: Three-nucleotide periodicity of translated ORFs.
Fig. 4: Factors influencing the translation of neutrally evolving ORFs.



Acknowledgements

We are grateful for valuable discussions with many colleagues during this study. This work was funded by grants BFU2012-36820, BFU2015-65235-P and TIN2015-69175-C4-3-R from Ministerio de Economía e Innovación (Spanish Government) and co-funded by FEDER (EC). We also received funding from Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya (AGAUR), grant no. 2014SGR1121.

Author information

Contributions

J.R.-O. and M.M.A. conceived the study, interpreted the data and wrote the paper. J.R.-O. performed most of the analyses, including the transcript assemblies, identification of translated ORFs, BLAST searches, SNP mapping and generation of controls. J.R.-O., P.V.-G. and J.L.V.-C. wrote the code and performed analyses on the coding score. X.M. wrote the code to calculate the expected SNP frequencies. M.M.A. coordinated the study.

Corresponding authors

Correspondence to Jorge Ruiz-Orera or M. Mar Albà.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Tables 1–6, Supplementary Figures 1–10

Life Sciences Reporting Summary

About this article

Cite this article

Ruiz-Orera, J., Verdaguer-Grau, P., Villanueva-Cañas, J.L. et al. Translation of neutrally evolving peptides provides a basis for de novo gene evolution. Nat Ecol Evol 2, 890–896 (2018). https://doi.org/10.1038/s41559-018-0506-6

  • DOI: https://doi.org/10.1038/s41559-018-0506-6
