Counting the things short-reads are good for

What's the point of short-reads if long-reads do basically everything in genomics better? There are five, FIVE, *thunderclap*, applications where short reads excel, ah, ah, ah!

This post originally appeared in the Omic.ly Premium 35 newsletter.

One. Exon Target Capture:

Target capture sequencing is a technique that uses RNA or DNA probes to bind to and pull out specific sequences of interest. This is usually done for targeted gene sequencing applications, anywhere from small gene panels all the way up to full exome sequencing. Short-reads are ideal for this application because most coding regions of genes, or exons, are ~200bp or shorter. The standard 150x150 paired-end technology is perfect for picking up these sequences without wasting sequencing dollars on introns or other non-coding regions of the genome.

Two. ctDNA:

Circulating tumor DNA sequencing methods are sometimes referred to as liquid biopsies. This method sequences the DNA that is shed by tumors into the bloodstream. These fragments of DNA are usually very short, in the 150-200bp size range and so are a perfect fit for short-read sequencing methods.
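To put a number on that "perfect fit", here is a minimal sketch of the arithmetic. It assumes an idealized paired-end run where the two reads start at opposite ends of the fragment; the function name `paired_end_coverage` and the example lengths are purely illustrative and don't come from any particular pipeline.

```python
def paired_end_coverage(fragment_len: int, read_len: int) -> dict:
    """Summarize how one paired-end read pair covers a single DNA fragment.

    Idealized model (an assumption): read 1 starts at one end of the
    fragment, read 2 at the other, and adapter read-through is ignored.
    """
    if fragment_len <= 0 or read_len <= 0:
        raise ValueError("lengths must be positive")

    # A read can't contain more fragment bases than the fragment has.
    usable = min(read_len, fragment_len)

    # Bases of the fragment seen by at least one read.
    covered = min(fragment_len, 2 * usable)

    # Bases seen by both reads (the pair overlaps in the middle).
    overlap = max(0, 2 * usable - fragment_len)

    return {
        "covered_bp": covered,
        "overlap_bp": overlap,
        "fraction_covered": covered / fragment_len,
    }


if __name__ == "__main__":
    # A ~170bp plasma fragment versus a 500bp library fragment, both at 150x150.
    for frag in (170, 500):
        print(frag, paired_end_coverage(frag, 150))
```

Under this model a 150x150 run reads a ~170bp fragment end to end, with most bases seen twice, while a longer fragment would leave its middle unsequenced.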

Three. cfDNA:

Circulating cell-free DNA is similar to ctDNA (strictly speaking, ctDNA is the tumor-derived slice of the cfDNA pool), but the distinction here is that the DNA of interest doesn't originate from tumors, it originates from normal cells. The fragments are just as short, so the same short-read logic applies. This application is best known for its use in non-invasive prenatal testing, where cell-free DNA released by the placenta can be assayed early in a pregnancy to determine if a fetus has chromosomal abnormalities.
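One common way to frame that prenatal test is as a counting problem: what fraction of reads lands on a chromosome, and how far is that from what euploid reference samples show? The sketch below is a toy z-score version of that idea, not any vendor's actual algorithm. The function names, the read fractions, and the cutoff of 3 are all assumptions for illustration.

```python
from statistics import mean, stdev


def chromosome_fraction(reads_on_chromosome: int, total_reads: int) -> float:
    """Fraction of all mapped reads that landed on one chromosome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_on_chromosome / total_reads


def chromosome_z_score(sample_fraction: float, reference_fractions: list) -> float:
    """How many reference standard deviations the sample sits from the reference mean."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        raise ValueError("reference fractions have no spread")
    return (sample_fraction - mu) / sigma


if __name__ == "__main__":
    # Made-up chr21 read fractions from hypothetical euploid reference samples.
    reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
    sample = chromosome_fraction(135_000, 10_000_000)
    z = chromosome_z_score(sample, reference)
    # The cutoff of 3 is an illustrative assumption, not a clinical threshold.
    print(f"z = {z:.2f}", "-> flag for follow-up" if z > 3 else "-> looks typical")
```

Real assays layer GC correction, fetal fraction estimates, and a lot of validation on top of this, so treat it as the shape of the calculation only.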

Four. FFPE:

Formalin-Fixed Paraffin-Embedded tissue is easily the worst sample type ever (gut microbiome/wastewater samples are a close second). The process of preserving biopsies back in olden times required placing a tissue sample into formaldehyde. To look at these samples under a microscope, they were then embedded in paraffin wax to make them easier to slice up and get onto a slide. While this is a reasonable process for microscopy, it makes getting usable DNA out of them a total pain in the ass. Formaldehyde crosslinks and chemically damages DNA, so it ends up very fragmented, which works in favor of short-read sequencing methods! Maybe someday we'll be able to get physicians to update their methods to something more sequencing friendly, like flash freezing.

Five. Counting:

Counting applications are those where we're trying to figure out how much of something is in a sample. Short-read sequencing is actually a really good high-throughput multiplex counting method! That's especially true when using 25x25 to 75x75 chemistries, which are the most cost efficient. Examples here are: counting sequence tags (like for proteomics), counting RNAs (expression profiling), bacteria/microbiome community profiling (genus level), differential methylation (liquid biopsy), chromosome counting (like in non-invasive prenatal testing), and epigenetic profiling.
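At its core, a counting assay really is just tallying. Here is a minimal sketch, assuming each read begins with a fixed-length tag that identifies what is being counted; `count_tags`, `counts_per_million`, and the toy reads are hypothetical and only there to show why a short read is all you need.

```python
from collections import Counter


def count_tags(reads, tag_len):
    """Tally reads by their tag, taken here to be the first tag_len bases.

    Reads shorter than tag_len are skipped. Where the tag sits in the
    read is an assumption; real assays define their own layout.
    """
    return Counter(read[:tag_len] for read in reads if len(read) >= tag_len)


def counts_per_million(counts):
    """Scale raw counts so samples sequenced to different depths are comparable."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


if __name__ == "__main__":
    # Toy reads: only the first 4 bases matter for counting.
    reads = ["ACGTAA", "ACGTCC", "TTTTGG", "AC"]
    raw = count_tags(reads, tag_len=4)
    print(raw)
    print(counts_per_million(raw))
```

Since only the tag matters, every base sequenced past it adds cost without adding information, which is why the shortest chemistries tend to win here.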

So, short-reads will continue to have their place, but the go-to moving forward for clinical genomes and transcriptomes will be long-reads.

That's something you can count on! Ah, ah, ah!

