RNA sequencing (RNA-seq) is an indispensable tool in the drug development pipeline, enabling researchers to explore gene expression profiles, uncover mechanisms of action, and identify biomarkers of drug sensitivity or resistance. However, not all RNA-seq approaches are created equal, and the success of your study hinges on thoughtful experimental design and choosing the right technology.
Read on to learn more about the key factors researchers should consider when planning a new RNA-seq-based experiment, from initial design to downstream data analysis.
1. Start with a Clear Hypothesis and Objective
Every RNA-seq experiment should begin with a well-defined research question and goal.
For instance, are you profiling transcriptomic changes to identify potential drug targets? Or do you aim to validate the impact of a candidate compound on cellular responses? Alternatively, you might want to screen a large number of conditions, doses, or drug combinations to generate comprehensive transcriptional data crucial for understanding mechanisms of action or toxic effects.
Your scientific goal will directly influence decisions like the best model system, sample size, sequencing depth, and the type of RNA-seq method to use.
For any goal, you must consider which type of RNA-seq technology and sequencing data would most likely allow you to achieve your objective. If you want to discover differentially expressed genes between many conditions, unbiased 3’ mRNA-seq might be the best option. For the expression of defined genes, targeted profiling could work, despite certain drawbacks. However, if you’re interested in isoform differences or changes in alternative splicing, full-length RNA-seq technologies would be more suitable.
2. Choose the Correct RNA-seq Technology for the Job
Choosing the right RNA-seq library preparation strategy is key to unlocking meaningful transcriptomic insights in drug discovery projects without breaking the bank. Whether the aim is target identification, biomarker discovery, or transcriptome profiling, your experiment’s goals should directly inform the design of your wet lab workflow.
Several factors influence the choice of technology:
- The type of sample (cells or RNA)
Different technologies are optimized for different starting materials. Starting samples can include 2D cell cultures common to drug discovery and screening pipelines, 3D organoid models that offer more physiological cellular environments, blood, or RNA isolated and purified from other model organisms (Table 1).
| Sample Type | Suitable Library Preparation Technology |
| --- | --- |
| 2D cell culture | MERCURIUS™ DRUG-seq |
| Organoids | MERCURIUS™ DRUG-seq (service) |
| Isolated total RNA | MERCURIUS™ BRB-seq |
| Blood | MERCURIUS™ Blood BRB-seq |

Table 1. Technologies suitable for particular sample types
Compared to traditional RNA-seq approaches, RNA-extraction-free 3′ mRNA-seq methods like MERCURIUS™ DRUG-seq allow researchers to process hundreds of cell or organoid samples simultaneously—straight from cell lysates—without compromising data quality for large-scale compound screens.
The technology removes the tedious, time-consuming, and costly RNA isolation and cleanup steps that traditional RNA-seq methods require. Its massively multiplexed, pre-amplification-free protocol increases sensitivity and improves gene detection at a sample throughput and cost that traditional RNA-seq methods cannot match.
Alternatively, if you’ve already extracted RNA from a sample, other 3’ mRNA-seq technologies, such as MERCURIUS™ BRB-seq, offer the same massive sample multiplexing capacity and cost-efficiency for purified RNA samples.
For more complex samples like whole blood, abundant globin transcripts must be depleted. MERCURIUS™ Blood BRB-seq integrates reagents that reduce the amount of globin mRNA in whole blood samples before sequencing, which reduces noise from these highly expressed transcripts. The technology allows researchers to sequence hundreds of whole blood transcriptomes simultaneously at higher sensitivity and at much lower cost than traditional methods.
- Sample number and quality
Large-scale screens often investigate thousands to millions of different compounds or conditions. While traditional RNA-seq methods might be fine for small experiments with tens of samples, they make it practically impossible to reach the throughput that efficient drug discovery screens require at an acceptable cost. In contrast, 3’ mRNA-seq technologies were developed as scalable, cost-effective transcriptomic screening tools (Table 2).
| Multiplexing Capacity per Tube | Efficiency for Poor Quality Samples (RIN<8) | Suitable Library Preparation Technology |
| --- | --- | --- |
| 0 | Poor | Traditional RNA-seq |
| 96 to 384 | High | MERCURIUS™ DRUG-seq |
| 96 to 384 | High | MERCURIUS™ BRB-seq |
| 96 to 384 | High | MERCURIUS™ Blood BRB-seq |

Table 2. Per tube multiplexing capacity and efficiency for poor quality RNA samples for each technology
RNA quality is often a concern for patient-derived samples or RNA derived from FFPE tissues. While traditional RNA-seq methods often recommend RNA integrity number (RIN) values greater than 8, 3’ mRNA-seq technologies provide robust, reproducible, and reliable data for RINs as low as 2 (Table 2) (Alpern et al., 2019).
- The data resolution required
3’ mRNA-seq approaches capture the poly-A tail of mRNA molecules and therefore only sequence the 3’ portion of transcripts. This yields robust expression estimates for most genes in the genome at lower sequencing depths than traditional methods, but provides no information on other aspects of gene expression, like splicing.
So, if your objective is to study isoforms, fusions, non-coding RNAs, or transcript variants, you’ll need a full-length RNA-seq approach with mRNA enrichment or rRNA depletion.
3. Sequencing Considerations
Once you’ve selected your technology and prepared the libraries, you’ll need to choose the right sequencing depth, read length, sequencing mode, and sequencing platform that best suit your experimental objectives:
- Sequencing depth should align with the technology and goal:
  - Standard bulk RNA-Seq: ~20–30M reads/sample
  - 3′ mRNA-Seq: ~3–5M reads/sample
  - High-throughput screening (pooled): ~200K–1M reads/sample
- Read length depends on both the experimental goals and the design of your sequencing library:
  - Single-read (SR, also called single-end) sequencing of 75 to 100 bases is sufficient for gene-level expression quantification and is widely used. This mode offers cost-effective coverage and is compatible with high-throughput studies.
  - However, if your library includes inline barcodes or Unique Molecular Identifiers (UMIs), it’s important to ensure the sequencing mode captures these features. In such cases, paired-end (PE) sequencing may be necessary to properly decode sample and molecule-specific information.
  - Paired-end sequencing with longer reads (PE75, PE100, or PE150) is suitable when you aim to investigate alternative splicing, rare transcripts, or gene fusions. This option provides the resolution needed to span exon junctions and complex transcript structures.
- Platform and instrument choice depends on the throughput, cost, and read length required. Illumina sequencers are commonly used, but platforms from Element Biosciences, MGI, or Singular Genomics can also be used with compatible or convertible libraries.
When in doubt, consult a bioinformatician or perform a pilot run to ensure your setup will deliver the resolution and sensitivity your project demands.
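To make the depth guidance above concrete, here is a minimal Python sketch that turns the per-sample ranges into a total read budget for a run. The ranges are the ones listed above; the function names, the dictionary keys, and the run output figure in the usage note are illustrative assumptions, not vendor specifications.

```python
# Per-sample read targets (low, high), taken from the depth guidance above.
READS_PER_SAMPLE = {
    "standard_bulk": (20_000_000, 30_000_000),
    "three_prime_mrna": (3_000_000, 5_000_000),
    "pooled_screen": (200_000, 1_000_000),
}


def total_reads_needed(method: str, n_samples: int) -> tuple[int, int]:
    """Return the (low, high) total number of reads for n_samples."""
    low, high = READS_PER_SAMPLE[method]
    return low * n_samples, high * n_samples


def max_samples_per_run(method: str, run_output_reads: int) -> int:
    """Samples that fit in one run if each gets the high-end depth.

    run_output_reads is a hypothetical figure supplied by the user;
    check the actual yield of your flow cell or sequencing provider.
    """
    _, high = READS_PER_SAMPLE[method]
    return run_output_reads // high


if __name__ == "__main__":
    low, high = total_reads_needed("three_prime_mrna", 384)
    print(f"384 samples, 3' mRNA-seq: {low:,} to {high:,} reads in total")
```

For example, a pooled screen on a run that yields 400 million usable reads (an assumed number) would accommodate `max_samples_per_run("pooled_screen", 400_000_000)` samples at the high end of the range. Real budgets should also leave headroom for uneven pooling and reads lost to quality filtering.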
4. Account for Experimental Variables
To assess variability and statistical significance, both of which are crucial for robust and reproducible conclusions, you’ll need to plan for:
- Biological replicates
Biological replicates refer to distinct samples originating from the same experimental condition or group. They help capture the inherent biological variability between different individuals, tissues, or cell cultures. A minimum of three biological replicates per condition is generally advised to ensure robust results. When working with readily available sample types, like cell lines or organoids, or when high variability might obscure true biological signals, increasing the number of replicates (ideally 4 to 8) can significantly enhance data reliability.
- Technical replicates
In contrast, technical replicates involve repeated measurements or preparations from the same biological sample and are useful for evaluating consistency in the experimental workflow. Nonetheless, biological replicates remain the most essential for capturing meaningful biological insights.
- Controls
Depending on your overall hypothesis and goal, your experiment must use the appropriate controls. These could be untreated or reference conditions that allow you to differentiate between true drug effects and background noise. Sufficient numbers of biological replicates of controls and experimental conditions are essential to drawing statistically robust conclusions with sufficient power.
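The replicate and control guidance in this section translates directly into a sample count, which is worth working out before ordering plates or reagents. The short Python sketch below does that arithmetic; the function names and the example numbers (30 compounds, 2 control conditions, 4 replicates) are hypothetical and only serve as an illustration.

```python
import math


def samples_required(n_treatments: int, n_controls: int, replicates: int) -> int:
    """Total samples when every treatment and control condition
    gets the same number of biological replicates."""
    return (n_treatments + n_controls) * replicates


def plates_required(n_samples: int, wells_per_plate: int = 96) -> int:
    """Number of plates needed to hold n_samples (rounded up)."""
    return math.ceil(n_samples / wells_per_plate)


if __name__ == "__main__":
    # Hypothetical screen: 30 compounds, untreated + vehicle controls, 4 replicates.
    n = samples_required(n_treatments=30, n_controls=2, replicates=4)
    print(n, "samples on", plates_required(n), "96-well plate(s)")
```

Running the same numbers with 3 versus 8 replicates shows quickly how the replicate choice drives plate count and, with it, sequencing cost.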
5. Experimental Setup and Conditions
Careful planning of your experimental setup is essential for meaningful results. Critical factors include the type and number of conditions tested, sample timing, compound concentrations, and plate configurations. Get these right and you’ll have a powerful dataset for discoveries; get them wrong and you could waste precious time and money on underpowered studies with an inappropriate technology that fails to answer your research questions.
- Choice of Cell Model or Organism
Selecting a biologically relevant model is fundamental to understanding drug responses. Commonly used models include immortalized cell lines, appreciated for their accessibility and consistency. For complex systems, patient-derived organoids or animal models may be appropriate. Different sample types, like blood or FFPE tissues, may require protocol adaptations due to their unique challenges. In retrospective studies, sample availability (e.g., archived patient material) can be a major limitation to consider.
- Treatment Conditions and Controls
In most RNA-seq studies supporting drug discovery, samples treated with a compound are compared to controls. Variables like seeding density, culture conditions, compound exposure time, and viability can introduce unwanted variability, so pre-experiment testing is advised. Be sure to define proper untreated or vehicle controls early on.
- Timing and Drug Response
Drug-induced changes in gene expression can be transient or delayed, so selecting the right time points for sample harvesting and sequencing is critical for both a broad initial screen and advanced profiling of selected candidates.
- Managing Batch Effects with Smart Plate Design
Batch effects are systematic variations arising from how the samples are collected and processed, rather than biological variation. These are common in large-scale studies due to logistical constraints like treatment timing or sample throughput. Samples are often grouped into batches for processing, which can introduce confounding variation.
High-throughput screening often relies on 384-well plates, whereas 96-well formats are typical for RNA extraction and other transcriptomic steps. When moving samples between plates, ensure the experimental design captures biological variation and allows for batch effect correction where necessary. Planning your experimental layout to minimize batch-related biases and facilitate computational correction is vital.
- Implementing Experimental Controls
To maintain data quality and reproducibility, synthetic spike-in RNAs such as SIRVs or ERCC RNA are widely used. These controls help assess the technical performance of RNA-seq experiments, allowing for consistent quantification across samples. They also serve as internal standards for normalization, sensitivity checks, and overall process validation.
- Pilot Testing
Running a small-scale pilot with a representative sample set helps refine both experimental and analytical workflows. This step enables protocol optimization and validation of conditions before scaling up. Pilot studies also allow you to compare alternative methods and avoid costly mistakes in larger experiments.
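The plate-design advice on batch effects earlier in this section can be sketched in a few lines of Python. The idea shown here is one common approach, not a prescribed protocol: spread each condition’s replicates across plates so that plate is not confounded with treatment, then randomize well positions within each plate. The function name, the output format, and the condition labels are all assumptions made for illustration.

```python
import random


def randomized_layout(conditions, replicates, n_plates, wells_per_plate=96, seed=0):
    """Assign every replicate to a plate and a well index.

    Replicates of a condition are distributed round-robin over plates,
    and well positions are drawn at random within each plate.
    """
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    per_plate = [[] for _ in range(n_plates)]
    for i, condition in enumerate(conditions):
        for rep in range(replicates):
            # Offset by i so the first replicate does not always land on plate 1.
            per_plate[(i + rep) % n_plates].append((condition, rep + 1))

    layout = []
    for plate_idx, samples in enumerate(per_plate):
        if len(samples) > wells_per_plate:
            raise ValueError(f"plate {plate_idx + 1} is over capacity")
        # Unique random well indices (0-based) for this plate.
        wells = rng.sample(range(wells_per_plate), len(samples))
        for well, (condition, rep) in zip(wells, samples):
            layout.append({"plate": plate_idx + 1, "well": well,
                           "condition": condition, "replicate": rep})
    return layout


if __name__ == "__main__":
    # Hypothetical design: vehicle control plus 23 compounds, 4 replicates, 2 plates.
    conditions = ["vehicle"] + [f"compound_{i}" for i in range(1, 24)]
    for row in randomized_layout(conditions, replicates=4, n_plates=2)[:5]:
        print(row)
```

Recording the plate identifier for every sample, as the returned dictionaries do, is what later allows plate to be included as a covariate if computational batch correction turns out to be necessary.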
6. Plan Ahead for Data Analysis
Data analysis is often the most underestimated part of RNA-seq experiments. To ensure data integrity, early and thorough quality control (e.g., using tools like FastQC and spike-in controls) is essential to catch issues before deeper analysis.
Once samples pass quality checks, data undergoes alignment, quantification, and normalization to prepare for biological interpretation. Expertise in both statistics and biology is critical for uncovering expression patterns, pathways, and drug response markers.
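As a small illustration of the normalization step, the Python sketch below computes counts per million (CPM), a simple library-size normalization, and its log2 transform. This article does not prescribe a specific normalization method, so treat CPM as one example only; dedicated differential expression packages apply more robust schemes. The gene names and counts are made up.

```python
import math


def cpm(counts):
    """Convert raw counts for one sample (gene -> count) to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: count / total * 1_000_000 for gene, count in counts.items()}


def log2_cpm(counts, pseudocount=1.0):
    """log2(CPM + pseudocount); the pseudocount avoids log2(0) for undetected genes."""
    return {gene: math.log2(value + pseudocount) for gene, value in cpm(counts).items()}


if __name__ == "__main__":
    # Hypothetical raw counts for a single sample.
    sample = {"GENE_A": 150, "GENE_B": 0, "GENE_C": 850}
    print(cpm(sample))
    print(log2_cpm(sample))
```

Because CPM only corrects for total library size, it makes samples sequenced to different depths comparable but does not address composition effects or batch differences.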
For deeper insights, integrating RNA-Seq with other omics data (like proteomics or metabolomics) using tools such as Omics Playground can reveal mechanisms of action and highlight new opportunities for drug development.
Conclusion: Better Design, Better Data
Strategic planning of high-throughput RNA-seq screens in drug discovery is essential for uncovering the nuances of the transcriptional effects of compounds on cells. Translating discoveries found with transcriptomics into meaningful biological conclusions can fuel confident, data-driven decisions that help focus the direction of your drug discovery and development pipelines, whatever the hypothesis.
With the right tools and a clear roadmap, RNA-seq can become a powerful driver of innovation in your screening process.
At Alithea Genomics, our MERCURIUS™ range of 3’ mRNA-seq products offers scalable, cost-effective, and validated solutions designed for real-world research challenges. Let us help you design your next RNA-seq experiment for success. For more information, please contact us.