Direct RNA sequencing might tell us about more than just the bases
Post-transcriptional modifications: Why sequencing RNA directly is the future of transcriptomics
In Francis Crick’s first description of the central dogma in 1957, he stated that genetic information moves in one direction: DNA codes for RNA, and RNA codes for proteins.
We are now aware of multiple exceptions to this theory, but we’ve also started to learn more about how this flow of information is controlled at the cellular level.
Much of that control is the work of regulatory proteins that bind to nucleic acids and promote or repress the transcription of DNA, or the translation of RNA into proteins.
While the sequences that these proteins recognize are important, we have also learned that modifications to both DNA and RNA play a significant role in these processes.
DNA can be modified through the methylation of cytosine, which affects which mRNAs are made (or not made!).
This is one component of epigenetics!
But RNA is quite different: we know of over 150 modifications that can be added to it.
So, what are all of these modifications, and what are they doing?
The most common mRNA modification is the addition of a 5′ cap made of N7-methylguanosine (m7G).
This cap helps protect the mRNA from degradation and also aids in the binding of proteins involved in its translation to protein.
But residues outside of this cap structure are also frequently modified. The more common modifications include N6,2′-O-dimethyladenosine (m6Am), N6-methyladenosine (m6A), and pseudouridine (Ψ), while rarer ones include N1-methyladenosine (m1A), 5-methylcytidine (m5C), 5-hydroxymethylcytidine (hm5C), N4-acetylcytidine (ac4C), and inosine (I).
These all have context-dependent effects which are still being unraveled, but we know they are important for nuclear export, regulating translation, and marking specific RNAs for destruction.
We also know that these modifications play a critical role in regulating how cells respond to changing conditions within their environment through sequestering RNAs, ramping up translation, or quickly destroying RNAs to make room for others.
Despite everything we have learned about these modifications, this field of ‘epi-transcriptomics’ is still in its early days.
And that’s because we haven't had a good method to look at all of these modifications in a high-throughput way!
Another contributing factor is that in traditional transcriptomics, RNA has to be converted to complementary DNA (cDNA) to be sequenced, and that conversion process doesn’t preserve these modifications.
So, the only way to really survey all of this additional information is to sequence the RNA directly!
Luckily, researchers are working on this problem, and some early successes have been reported.
Oxford Nanopore has achieved 95% (~Q13) RNA sequencing accuracy and is making promising progress toward calling m6A base modifications!
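For anyone wondering how 95% and ~Q13 relate: Q scores use the Phred scale, where Q = -10 × log10(error rate). Here is a minimal Python sketch of that conversion. The function names are just illustrative (they are not from any Oxford Nanopore tool), and it assumes "accuracy" simply means 1 minus the per-base error rate.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (e.g. 0.95) to a Phred quality score.

    Assumes accuracy = 1 - error rate. Hypothetical helper for illustration.
    """
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # 95% accuracy is a 5% error rate, which works out to roughly Q13.
    print(f"95% accuracy ~ Q{accuracy_to_phred(0.95):.1f}")
    # For comparison, Q20 and Q30 correspond to 99% and 99.9% accuracy.
    for q in (13, 20, 30):
        print(f"Q{q} ~ {phred_to_accuracy(q) * 100:.2f}% accuracy")
```

Because the scale is logarithmic, every extra 10 Q points means ten times fewer errors, so the jump from Q13 to Q20 is bigger than it looks.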
While there’s still work to do, it's clear that accurately detecting RNA base modifications will be crucial for understanding their role in human health and disease.