In 2004, oncologist Gideon Rechavi at Tel Aviv University in Israel and his colleagues compared all the human genomic DNA sequences then available with their corresponding messenger RNAs — the molecules that carry the information needed to make a protein from a gene. They were looking for signs that one of the nucleotide building blocks in the RNA sequence, called adenosine (A), had changed to another building block called inosine (I). This 'A-to-I editing' can alter a protein's coding sequence, and, in humans, is crucial for keeping the innate immune response in check. “It sounds simple, but in real life it was really complicated,” Rechavi recalls. “Several groups had tried it before and failed” because sequencing mistakes and single-nucleotide mutations had made the data noisy. But using a new bioinformatics approach, his team uncovered thousands of sites in the transcriptome — the complete set of mRNAs found in an organism or cell population — and later studies upped the number into the millions1.
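
The comparison described above can be sketched in a few lines. This is a minimal illustration, not the bioinformatics approach Rechavi's team used: it assumes that inosine is read as guanosine (G) during sequencing, that the reads are already aligned to the genomic sequence with no gaps, and that requiring support from several reads is a reasonable (simplified) guard against sequencing mistakes. The function name and the `min_reads` threshold are hypothetical.

```python
from typing import List


def candidate_a_to_i_sites(genomic: str, reads: List[str], min_reads: int = 2) -> List[int]:
    """Return 0-based positions where the genome has A but >= min_reads reads show G.

    Assumes every read is pre-aligned to `genomic` and has the same length.
    """
    if any(len(read) != len(genomic) for read in reads):
        raise ValueError("all reads must be aligned to, and as long as, the genomic sequence")

    sites = []
    for pos, base in enumerate(genomic.upper()):
        if base != "A":
            continue  # only genomic adenosines can be A-to-I edited
        support = sum(1 for read in reads if read[pos].upper() == "G")
        if support >= min_reads:
            sites.append(pos)
    return sites


if __name__ == "__main__":
    genome = "ACGTAAGCTA"
    cdna_reads = ["ACGTAGGCTA", "ACGTAGGCTG", "ACGTAAGCTA"]
    # Position 5 is supported by two reads; position 9 by only one, so it is dropped.
    print(candidate_a_to_i_sites(genome, cdna_reads))
```

A real analysis would also have to exclude known single-nucleotide variants, which this sketch cannot distinguish from editing.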

A molecular model of a bacterial ribosome bound to messenger RNA, a complex that is formed during protein synthesis. Credit: Laguna Design/SPL

Inosine is something of a special case: researchers can readily detect this chink in the armour by comparing DNA and RNA sequences. But at least one-quarter of our mRNAs harbour chemical tags — decorations to the A, C, G and U nucleotides — that are invisible to today's sequencing technologies. (Similar chemical tags, called epigenetic markers, are also found on DNA.) Researchers aren't sure what these chemical changes in RNA do, but they're trying to find out.

A wave of studies over the past five years — many of which focus on a specific RNA mark called N6-methyladenosine (m6A) — have mapped these alterations across transcriptomes and demonstrated their importance to health and disease. But the problem is vast: these marks coat not only mRNA but other RNA transcripts as well, and they cut across all the domains of life and beyond, marking even viruses with their presence.

A methylated RNA duplex (adenine N6 methyl groups in major groove highlighted in light blue). Credit: ref. 18/ACS

The modifications themselves are not new. What has given them meaning and driven epitranscriptomics into the spotlight is the discovery of enzymes that can add, remove and interpret them. In 2010, chemical biologist Chuan He at the University of Chicago, Illinois, proposed that these chemical tags could be reversible and important regulators of gene expression. Not long afterwards, his group demonstrated2 the first eraser of these marks on mRNA, an enzyme called FTO. That discovery meant that m6A wasn't just a passive mark — cells actively controlled it. And this realization came at about the same time that global approaches, harnessing the power of next-generation sequencing, made it possible to map m6A and other modifications across the transcriptome.

Today, epitranscriptomics is blossoming. Yet its toolbox remains a work in progress. Current methods lack the sensitivity required for use with rare and precious samples. It's also not possible to quantify the amount of a given modification in the transcriptome, nor to map more than one modification in a single experiment. “There's an urgent and high demand for additional technology developments for all kinds of RNA modifications,” says molecular biologist Tao Pan at the University of Chicago, who collaborated on He's FTO studies.

That said, epitranscriptomics researchers are excited about the direction their field is taking. “Just as you wouldn't think of DNA without thinking about how DNA is packaged, or epigenetically modified,” says geneticist Chris Mason at Weill Cornell Medical College in New York City, who has led m6A-mapping efforts, “I think now and in the future, no one will think of RNA without thinking 'How is it modified?'”

Mapping with antibodies

In the early 1970s, scientists used radioisotope labelling to show for the first time that mRNA is chemically modified with m6A. But because those studies enriched the mRNA transcripts by selecting their 3′ ends, which contain strings of adenosines, researchers worried that the preparations might also contain trace amounts of other classes of RNA molecule. “People stopped working on this because it was so difficult to get clear insights into whether the m6A in mRNA was a contaminant,” says Samie Jaffrey, a chemical biologist at Weill Cornell Medical College.

Also difficult was working out where in the transcriptome m6A was located, which could provide clues to its function. Conventional sequencing approaches involve reverse transcription — converting RNA into complementary DNA (cDNA), which is then amplified and sequenced. The problem is that the reverse transcriptase enzyme used to make cDNA erases the modifications. “There was no way to see m6A,” Jaffrey says. “When you reverse-transcribe it, it behaves exactly like an A.”

Despite the technical challenges, the discovery of unexpected bacterial RNA modifications3 piqued Jaffrey's interest, and he decided to look for them in mammalian RNA. Working with Mason, his team sheared RNA into tiny pieces, pulled out those that contained m6A using antibodies, and sequenced the RNAs4. “We were clearly seeing labelling of mRNAs and that was remarkable. It was not a contaminant,” Jaffrey says. A similar study5 by Rechavi's group unearthed a hilly landscape of m6A peaks, roughly 12,000 sites in 7,000 human genes. The modifications, Rechavi's team discovered, tended to be concentrated on the protein-coding sequences called exons and on stop codons, the three-letter codes in mRNA that signify the end of the protein-coding sequence.
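
The idea of an m6A 'peak' can be illustrated with a small sketch. It is not the published m6A-seq or MeRIP-seq pipeline. It assumes that reads from the antibody pulldown and from a non-enriched input control (an assumption not spelled out above) have already been counted in matching windows along the transcriptome, and it flags windows in which the pulldown is enriched. The function name, the `min_fold` cutoff and the pseudocount are hypothetical choices.

```python
from typing import List, Sequence, Tuple


def enriched_windows(
    ip_counts: Sequence[int],
    input_counts: Sequence[int],
    min_fold: float = 2.0,
    pseudocount: float = 1.0,
) -> List[Tuple[int, float]]:
    """Return (window index, fold enrichment) for windows enriched in the pulldown."""
    if len(ip_counts) != len(input_counts):
        raise ValueError("pulldown and input must cover the same windows")
    ip_total, input_total = sum(ip_counts), sum(input_counts)
    if ip_total == 0 or input_total == 0:
        raise ValueError("both libraries need at least one read")

    hits = []
    for idx, (ip, ctrl) in enumerate(zip(ip_counts, input_counts)):
        # Scale each library by its total read count so that depth differences cancel;
        # the pseudocount avoids division by zero in windows with no input reads.
        ip_norm = (ip + pseudocount) / ip_total
        ctrl_norm = (ctrl + pseudocount) / input_total
        fold = ip_norm / ctrl_norm
        if fold >= min_fold:
            hits.append((idx, fold))
    return hits


if __name__ == "__main__":
    pulldown = [10, 12, 80, 9]
    control = [10, 11, 12, 10]
    for window, fold in enriched_windows(pulldown, control):
        print(f"window {window}: {fold:.2f}-fold enriched")
```

Published peak callers add statistical tests and replicate handling that this sketch omits.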

The methods these researchers used, m6A-seq and MeRIP-seq, have since been broadly used to map m6A in different disease contexts and organisms. Antibodies and reagents targeting m6A are available from Active Motif (go.nature.com/2kqgzu8), MilliporeSigma (go.nature.com/2kw39m3) and New England BioLabs (go.nature.com/2kqjjaz), among others. Researchers think that the modification could control the way cells develop into different types, a process that goes awry in cancer. Indeed, the first links between the epitranscriptome and cancer have already emerged. He's group, for example, showed that in some forms of acute myeloid leukaemia, FTO is present in higher-than-normal levels and seems to remove m6A from certain transcripts6, which could spur cells to differentiate.

Cancer biologist Howard Chang. Credit: HHMI

A parallel line of research has turned that finding on its head. Using an antibody-mapping method called miCLIP, which is higher in resolution than its predecessors, Jaffrey's team showed that its m6A antibodies also bind to N6,2′-O-dimethyladenosine (m6Am), a modification of the chemical structures that cap the 5′ end of mRNAs. At the time, Jaffrey didn't know if m6Am carried any biological meaning. But his team has since shown that m6Am (and not m6A) is in fact the major target of the FTO eraser, and that it affects the stability and subcellular location of mRNAs7. To Jaffrey, that suggests that He's findings linking FTO to acute myeloid leukaemia implicate m6Am, not m6A, in the origin and development of cancer.

Such discrepancies are par for the course in an emerging field, says cancer biologist Howard Chang at Stanford University, California. “This particular issue is not that different from the early days of the histone-modification field,” he says, referring to the study of chemical alterations to the histone 'spools' around which DNA is tightly packaged in cells.

Many modifications

Other RNA modifications have also attracted researchers' attention. In 2016, teams led by chemist Chengqi Yi at Peking University in Beijing and by Rechavi and He used antibody-based methods to map N1-methyladenosine (m1A) in mouse and human cell lines and tissues8,9. Using different approaches to prevent m1A from interfering with reverse transcription, the two teams showed that m1A, which was discovered in total RNA in the early 1960s, is present on mRNA at the position at which the translation machinery initiates protein production. Stress conditions alter the maps, suggesting that they are dynamic.

The researchers don't yet know what m1A does, but they have a tantalizing clue: most transcripts have only one m1A site, and these seem to be translated more often than those that lack the modification. “This is very exciting — and of course challenging — because we are dealing with a new regulatory mechanism for translation of messenger RNA,” Rechavi says. An antibody that targets m1A is available from MBL International (go.nature.com/2kvqpfs).

Other global mapping strategies rely on the fact that some RNA modifications can serve as handles for attaching chemical tags. When Yi started his lab in late 2011, modified RNA building blocks called pseudouridines or 'pseudoUs' were well known in other classes of RNA but had not been seen on mRNA. In 2015, his group described a chemical labelling and pulldown method for enriching the modification on transcripts10. To Yi's surprise, pseudoUs are much more abundant in human and mouse mRNA than was previously thought. Now his focus is on finding the mark's function. “I think pseudoU can have multiple functions in mRNA, depending on when, where and how it is installed and regulated on RNA transcripts,” Yi says. Some pseudoU 'writers' are already known, he adds, but whether there also are readers or erasers is an open question.

Moving beyond mapping

Whether using antibodies or chemical approaches, mapping RNA modifications is a tricky business. Antibodies can cross-react with other modifications, so researchers should use at least two different antibodies and cross-correlate the hits, Mason says. Chemical methods can cut, bind or enrich some areas more frequently than others, producing biased fragmentation patterns. Sequencing depth and choice of bioinformatics algorithms can affect detection of modification sites, Yi says. Even the length of time for which cells are kept alive in culture could influence modification levels, so it is crucial (and not necessarily trivial) to capture baseline maps for comparison, Mason says.
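
Mason's suggestion to cross-correlate hits from two antibodies amounts, in its simplest form, to keeping only the sites that both antibodies report. The sketch below assumes each antibody's hits are available as half-open intervals of the form (transcript, start, end); that representation, the function name and the example coordinates are all hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (transcript name, start, end), end exclusive


def shared_peaks(peaks_a: List[Peak], peaks_b: List[Peak]) -> List[Peak]:
    """Return the overlapping portion of every pair of peaks found by both antibodies."""
    shared = []
    for name_a, start_a, end_a in peaks_a:
        for name_b, start_b, end_b in peaks_b:
            # Two half-open intervals overlap when each starts before the other ends.
            if name_a == name_b and start_a < end_b and start_b < end_a:
                shared.append((name_a, max(start_a, start_b), min(end_a, end_b)))
    return shared


if __name__ == "__main__":
    antibody_1 = [("GENE1", 100, 200), ("GENE1", 500, 600), ("GENE2", 50, 150)]
    antibody_2 = [("GENE1", 150, 250), ("GENE2", 300, 400)]
    # Only the GENE1 peak around positions 150-200 is seen by both antibodies.
    print(shared_peaks(antibody_1, antibody_2))
```

The nested loop is fine for a handful of peaks; transcriptome-scale data would call for sorted intervals or an interval tree.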

But in any event, it's not enough to show that a sample contains a particular modification. Instead, it will be crucial to quantify all RNA modifications, because cells probably rely on just the right amount of a given modification for their proper function, Pan says. Quantification is particularly important for those researchers who want to tune levels of modifications by boosting the enzymes that write and erase them. The mere presence of such enzymes suggests that precisely tuned levels of modification are important, Pan says.

In 2015, Pan's team described a potential way to quantify modifications, at least in another class of RNA called transfer RNAs11. The approach uses a reverse transcriptase that can read through the modifications efficiently, thereby capturing sequences downstream of the first modification it encounters. Now Pan says his group is working to apply the same strategy to mRNA modifications such as m1A.

But perhaps the fastest way to define function is to identify the readers, writers and erasers of these tags. In their 2012 m6A-mapping study5, Rechavi's group created short stretches of RNA with and without m6A modifications, using these fragments as bait to pull out any proteins that might be bound to the RNA. In 2014, He's group discovered several m6A readers using a comparable strategy12. Other studies have implicated other cellular functions as well. Now Rechavi plans to try a similar baiting approach to pull out readers of m1A, although that may prove more difficult because the site at which m1A concentrates (where protein translation begins) is more highly structured than are the sequences that contain m6A.

Once the proteins that read, write or erase a particular chemical tag are identified, gene-editing technologies should make it easy to tune their expression, allowing researchers to glean some insight from global changes to a modification. For instance, Chang's work deleting a writer of m6A showed the importance of this modification in determining cell fate13.

But as with all things epigenetic, interpreting such findings may also prove complicated, as the same enzyme may well work across broad swathes of RNA species, and a given modification can have different functions in each type of RNA, says Chang. “It wouldn't surprise me if some of the effects happen through other species of RNAs that aren't on people's radars but they're still prevalent in the cell and very important,” he says.

Enhancing the toolbox

In the past few years, He's group has discovered evidence14 suggesting that RNA modifications provide a way to regulate transcripts involved in broad cellular roles, such as switching on cell-differentiation programs. Researchers need better technologies to explore these links. In October 2016, the US National Institutes of Health awarded He and Pan a 5-year, US$10.6-million grant to establish a centre to develop methods for identifying and mapping RNA modifications. One big focus is to come up with a way to generate mutations at the sites of a modification, and to amplify those, He says.

Electron micrograph showing characteristic attachment of RNA polymerase molecules to DNA strands. Credit: Omikron/SPL

With new imaging techniques, it might eventually be possible to resolve single marks on a given transcript for visual inspection. “I'm dying to say that someone developed a technology to image the m6A modification in mRNA,” He says, but at the moment, that is not the case. Ye Fu, a former graduate student in his lab, is exploring this approach in biophysicist Xiaowei Zhuang's lab at Harvard University in Cambridge, Massachusetts. Fu is combining super-resolution microscopy with another method Zhuang has pioneered for visualizing RNA in single cells, called MERFISH (multiplexed error-robust fluorescence in situ hybridization). Fu says that he has made progress in the past two years, but the data are noisy and need to be optimized for these modifications to be detected efficiently.

Others are working to circumvent the problems associated with conventional sequencing by sequencing RNA directly. Scientists at Oxford Nanopore Technologies in the United Kingdom reported one such method15, which extends to RNA the company's capability to thread DNA and other polymers through a nanopore embedded in a membrane. Researchers at Pacific Biosciences in Menlo Park, California, have also demonstrated direct RNA sequencing using the company's single-molecule real-time (SMRT) sequencing chemistry16. “The idea is certainly as old as SMRT sequencing itself,” says Jonas Korlach, the company's chief scientific officer. SMRT sequencing uses an enzyme called DNA polymerase to replicate strands of DNA, capturing the addition of fluorescent nucleotides in real time. To adapt the technology to RNA, Korlach, Mason and their collaborators replaced the polymerase with the reverse transcriptase from HIV. As with the DNA platform, this enzyme incorporates fluorescent molecules across modified bases more slowly than it does across unmodified ones, giving each modification its own 'kinetic signature'.
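
The 'kinetic signature' idea can be sketched as a comparison of incorporation times. The example below is an illustration under stated assumptions, not Pacific Biosciences' analysis software: it assumes that, for each position on a transcript, a list of incorporation delays has been measured both for the native RNA and for an unmodified control copy, and it flags positions where the enzyme is markedly slower on the native molecule. The function name, the data layout and the `min_ratio` cutoff are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def slow_positions(
    native_delays: Dict[int, List[float]],
    control_delays: Dict[int, List[float]],
    min_ratio: float = 2.0,
) -> Dict[int, float]:
    """Return {position: delay ratio} where native RNA is read >= min_ratio times slower."""
    flagged = {}
    for pos, native in native_delays.items():
        control = control_delays.get(pos)
        if not native or not control:
            continue  # no measurements to compare at this position
        control_mean = mean(control)
        if control_mean <= 0:
            continue  # delays must be positive for a ratio to make sense
        ratio = mean(native) / control_mean
        if ratio >= min_ratio:
            flagged[pos] = ratio
    return flagged


if __name__ == "__main__":
    native = {10: [0.2, 0.3, 0.25], 11: [0.9, 1.1, 1.0]}
    control = {10: [0.25, 0.25, 0.25], 11: [0.3, 0.2, 0.25]}
    # Position 11 is read about four times more slowly on the native RNA.
    print(slow_positions(native, control))
```

This only detects a slowdown; telling one modification from another would require comparing the shape of the delay pattern across neighbouring positions.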

Technical hurdles remain, however, as RNA poses challenges not seen in DNA. One is that RNA readily folds in on itself to form loops and knots, so it is highly structured. In its study15, Oxford Nanopore attached RNA to cDNA, which helps 'iron out' the secondary structures and move the RNA through the pore. A second challenge is that RNA degrades more easily than DNA, a problem that may stymie long-read sequencing approaches.

Yet another challenge, on the data side, is the sheer number of RNA modifications. Recognition of multiple different modifications on the same RNA would require massive training sets to teach the detection software to distinguish one modification from another. Winston Timp, a biomedical engineer at Johns Hopkins University in Baltimore, Maryland, has been using Oxford Nanopore technology to develop new methods for detecting specific DNA modifications17. He now plans to move into RNA modifications, developing a training set that will help recognize m6A modifications. “The problem is, we don't know how diverse on a given molecule the modifications are,” he says. “But this is something we can probe. It's an exciting area of research.”