High-Throughput Transcriptomics for Biomarker and Mechanism of Action Discovery: From Gene Expression Signatures to Clinical Outcomes

Biomarker discovery using gene expression signatures has transformed pharmaceutical discovery and development, from early hit triage and target validation to toxicology, dose optimization, and patient stratification.

Yet, traditional transcriptomic methods, like RNA-seq, struggle to scale to the thousands or tens of thousands of samples required for robust biomarker discovery and disease mechanism identification, especially in the era of AI-aided predictive modeling. High costs and sample-by-sample processing steps remain key bottlenecks that slow discovery pipelines and limit predictive inference, despite the proven utility of gene expression signatures as biomarkers.

This article explores how early-phase drug discovery teams use gene expression biomarkers to understand disease mechanisms, de-risk portfolios and enable machine-learning-driven discovery, and how ultra-scalable MERCURIUS™ DRUG-seq overcomes the bottlenecks of traditional RNA-seq methods for screening-scale insights.

Key Takeaways

  • Transcriptomic biomarkers provide deeper, earlier, and more predictive insights into toxicity, efficacy, and mechanisms of action than classical single-endpoint biomarkers.
  • Gene expression signatures can de-risk toxicology by revealing drug-induced liver injury (DILI) and other safety liabilities well before clinical markers or histopathology.
  • Transcriptomic biomarkers uncover disease mechanisms that inform target selection, compound triage, and clinical trial stratification.
  • High-throughput, whole-transcriptome RNA-seq is essential for robust biomarker discovery, yet traditional RNA-seq is too expensive and low-throughput to support screening-scale studies.
  • MERCURIUS™ DRUG-seq overcomes the scaling and cost bottlenecks of conventional RNA-seq, enabling thousands of samples to be processed in a single experiment with whole-transcriptome coverage.
  • MERCURIUS™ DRUG-seq allows researchers to generate high-dimensional, high-quality training datasets for predictive toxicology and pathway-level MoA inference, enabling robust biomarker discovery, dose–response transcriptomic fingerprints, more effective experimental model selection and early detection of mechanistic liabilities.

How High-Throughput Transcriptomics Drives Biomarker Discovery for Pharma

 

The Food and Drug Administration (FDA) describes a biomarker as ‘a defined characteristic that is measured as an indicator of normal biological processes, pathogenic processes, or responses to an exposure or intervention’ (1).

Biomarker discovery is nothing new. However, classical single-endpoint biomarkers, like serological or biochemical indicators, often provide only limited specificity and predictive capacity. In contrast, transcriptome-wide, systems-level gene expression biomarkers offer researchers greater predictive power for earlier detection of safety concerns or responses to drugs (2). Multi-dimensional transcriptomic biomarkers also inform numerous aspects of biology, including disease pathways, progression, and mechanisms of action.

By integrating transcriptomic biomarker and disease mechanism detection into early-stage drug discovery, researchers can now screen thousands of potential drug candidates against specific expression signatures. This approach reduces the likelihood of late-stage failures due to unforeseen efficacy or safety issues, especially when paired with next-generation machine learning models (2). Gene expression biomarkers can impact almost every stage of the developmental pipeline, and ultra-high-throughput transcriptomic technologies, like MERCURIUS™ DRUG-seq, serve extensive roles in biomarker discovery at different pipeline phases (Table 1).
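As a rough illustration of screening candidates against an expression signature, the sketch below ranks compounds by how closely their log2 fold-change profiles match a reference signature, using cosine similarity. This is only one of several possible similarity measures and is not described in the source. The gene names, compound names, and values are hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile carries no directional information
    return dot / (norm_a * norm_b)


def rank_compounds(reference, profiles):
    """Rank compounds by similarity to a reference signature.

    reference: dict mapping gene -> log2 fold change
    profiles:  dict mapping compound -> dict of gene -> log2 fold change
    Genes missing from a compound profile are treated as unchanged (0.0).
    """
    genes = sorted(reference)
    ref_vec = [reference[g] for g in genes]
    scores = {}
    for compound, profile in profiles.items():
        vec = [profile.get(g, 0.0) for g in genes]
        scores[compound] = cosine_similarity(ref_vec, vec)
    # Highest similarity first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: a three-gene reference signature and three compounds
reference_signature = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5}
compound_profiles = {
    "cmpd_1": {"GENE_A": 1.8, "GENE_B": -0.9, "GENE_C": 0.4},
    "cmpd_2": {"GENE_A": -2.0, "GENE_B": 1.0, "GENE_C": -0.5},
    "cmpd_3": {"GENE_A": 0.1, "GENE_B": 0.0, "GENE_C": 0.0},
}

ranking = rank_compounds(reference_signature, compound_profiles)
for compound, score in ranking:
    print(f"{compound}: {score:.3f}")
```

In this toy example, a compound whose profile mirrors the reference ranks first, while one with the opposite profile ranks last. Depending on the program, either end of the ranking may be of interest, for example compounds that reverse a disease signature.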

Discover how MERCURIUS™ DRUG-seq uses massive sample multiplexing to generate gene expression profiles for hundreds to thousands of samples in a single tube.


Table 1: Overview of the core roles of gene expression biomarkers in pharma pipelines and how MERCURIUS™ DRUG-seq helps their discovery (Table modified from (9))

 

Gene Expression Biomarkers De-Risk Toxicology

 

One early example of a powerful gene expression-based biomarker comes from toxicogenomics work published in 2005. Using microarrays, researchers identified a 35-gene expression signature in the kidneys of rats treated with 64 nephrotoxic or non-nephrotoxic compounds that predicted the future onset of drug-induced renal tubular toxicity. The signature indicated toxicity well before clinical markers or histopathology changed (3).

 

Twenty years on, at the end of 2025, toxicology teams have access to large-scale RNA-seq-based resources for drug-induced liver injury (DILI), such as DILImap, which provides transcriptome-wide gene expression profiles for 300 compounds at multiple concentrations in primary human hepatocytes (PHHs) (2).

Gain deeper insights into how high-throughput transcriptomics benefits toxicology and safety assessment in our comprehensive article.

 

When paired with AI-powered predictive models, such as ToxPredictor, these gene expression signatures flagged DILI risk in costly, high-profile phase III clinical failures, including BMS-986142, TAK-875, and evobrutinib, that preclinical animal safety studies had overlooked (2). The ToxPredictor machine learning model outperformed over 20 preclinical safety assessment approaches, including mechanistic assays, cytotoxicity markers, physicochemical properties, and in silico models (2).

 

These advances highlight how transcriptome-wide gene expression signatures can provide multi-dimensional insights into hepatotoxic pathways and DILI mechanisms while supporting early de-risking and actionable safety decisions at a resolution that single-endpoint biomarkers cannot achieve. This represents a shift from relatively simple biomarker readouts to more predictive, systems-level, dynamic inference of the biological effects of perturbations from high-throughput transcriptomic data.

 

Learn how transcriptomics in short-term in vitro toxicology studies now delivers the mechanistic depth and reproducibility needed to make them viable alternatives to long-term animal models.

 

Transcriptomic Biomarker Discovery Highlights Disease Mechanisms That Guide Clinical Trial Design

 

Disease-mechanism biomarkers allow discovery teams to link compound effects to pathway dysregulation, enabling more confident hit prioritization and reduced late-stage failures. For instance, clinical trial recruitment and design rely on segmenting participants by their molecular phenotypes to identify those most likely to respond to targeted therapies. DNA-based markers are often used, such as BRAF V600E mutations in melanoma and colorectal cancer trials or KRAS G12C mutations in lung, pancreatic, and colorectal cancer trials (4).

 

However, gene expression profiles now also serve as powerful biomarkers to identify specific disease mechanisms and target pathways, such as inflammation, thereby helping to stratify trial participants after the biomarkers' rigorous clinical validation.

 

For instance, an 18-gene T-cell inflamed expression profile was instrumental in the success of the immunotherapy pembrolizumab (Keytruda) by predicting which patients would benefit from programmed cell death protein–1 (PD-1) blockade across dozens of clinical indications (5). This gene expression profile was relevant across more than 20 tumor types in the KEYNOTE trials and was independent of tumor mutational burden, thereby providing a clearer picture of the tumor microenvironment than possible with a single, more static biomarker (6).
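A multi-gene panel is often collapsed into a single per-sample score, for example as a weighted sum of normalized expression values, which can then be compared against a cutoff to stratify samples. The sketch below illustrates that general idea only. The gene names, weights, and cutoff are hypothetical placeholders and are not the published 18-gene profile or its coefficients.

```python
def signature_score(expression, weights):
    """Weighted sum of normalized expression values over the signature genes.

    expression: dict mapping gene -> normalized expression (e.g. log scale)
    weights:    dict mapping signature gene -> weight
    """
    missing = [g for g in weights if g not in expression]
    if missing:
        raise KeyError(f"missing signature genes: {missing}")
    return sum(weights[g] * expression[g] for g in weights)


def stratify(samples, weights, cutoff):
    """Label each sample 'high' or 'low' relative to a score cutoff."""
    return {
        sample_id: "high" if signature_score(expr, weights) >= cutoff else "low"
        for sample_id, expr in samples.items()
    }


# Hypothetical three-gene signature and two samples
weights = {"GENE_A": 0.5, "GENE_B": 0.25, "GENE_C": -0.25}
samples = {
    "sample_1": {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "sample_2": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 2.0},
}

labels = stratify(samples, weights, cutoff=1.0)
print(labels)
```

In practice, the weights and cutoff would come from a trained and clinically validated model rather than being set by hand.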

 

This inflammation-related 18-gene expression biomarker panel was identified through an iterative process that began with targeted, probe-based gene expression profiling of 680 tumor- and immune-related genes. Because the starting panel covered only 680 genes, the large majority of the roughly 20,000 human protein-coding genes were never assessed.

 

While the rational selection of specific genes was successful in this instance, for less well-defined disease mechanisms or compound mechanisms of action, unbiased, transcriptome-wide, high-throughput RNA-seq-based technologies, such as MERCURIUS™ DRUG-seq, are better positioned to uncover disease mechanism-predictive and preclinical biomarkers than targeted gene panel approaches (2).

 

A Note on Regulatory and Qualification Challenges

 

Both the FDA and the European Medicines Agency (EMA) apply strict frameworks for biomarker validation. These require rigorous evidence of analytical validity, clinical validity, and clinical utility, and the level of validation and supporting data required depends on the biomarker's context of use. MERCURIUS™ DRUG-seq is for research use only.

 

Read our article to explore how The OASIS Consortium is using high-throughput transcriptomics to build more predictive, human-relevant toxicology models and data guidelines to overcome regulatory hurdles.

 

 

How High-Throughput Transcriptomics Identifies Disease Mechanisms

 

Similarly, by providing a more thorough understanding of underlying disease mechanisms, high-throughput transcriptomics has also accelerated rational drug design and development (7).

By understanding the mechanisms of disease, including pathways and regulatory nodes that drive pathology, drug discovery teams can use the mechanistic knowledge gained from gene expression signatures to select more appropriate disease models, which better mimic physiological disease features and pathways, such as primary cell types, precisely engineered cell lines, or 3D spheroids (8).

Gene expression profiling also provides the mechanistic context required to select the right targets, design appropriate assays, and interpret compound responses with confidence. Together, this understanding ultimately leads to improved and more efficient studies that can accelerate clinical translation, reduce costs, and directly benefit all stakeholders (8).

Explore how researchers used MERCURIUS™ DRUG-seq to discover brain-penetrant "RNA-active" compounds with favorable mechanistic and safety-relevant profiles to de-risk hit identification and prioritize candidate repurposing.

 

MERCURIUS™ DRUG-seq for High-Throughput Biomarker Discovery and Disease Mechanisms Identification

Overall, RNA-based biomarkers that identify novel druggable targets or flag toxicity or safety concerns early in the pipeline can ultimately improve the success rate and efficiency of drug discovery and development by guiding more confident go/no-go decisions (9).

Previously, this was challenging to achieve with outdated sample-by-sample RNA-seq or target-panel gene profiling, but scalable methods such as MERCURIUS™ DRUG-seq now make large-scale biomarker detection from thousands of samples routine.

MERCURIUS™ DRUG-seq allows discovery teams to perform screening-scale transcriptomics with the same experimental designs they apply to high-throughput phenotypic or biochemical screens, at a fraction of the effort and cost required to build large-scale databases, such as DILImap.

This makes it possible to:

• Generate high-resolution dose–response transcriptomic fingerprints

• Detect early mechanistic or toxicogenomic liabilities before secondary assays

• Identify disease-relevant pathways that guide target selection

• Build custom AI-ready biomarker libraries specific to a program
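The first item in the list above, a dose–response transcriptomic fingerprint, can be sketched as a per-gene log2 fold change against a vehicle control at each dose. The example below is a minimal illustration assuming simple counts-per-million normalization with a pseudocount. The gene names, doses, and counts are hypothetical, and a real analysis would use replicates and a dedicated differential expression method.

```python
import math


def cpm(counts):
    """Convert raw gene counts to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has zero total counts")
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-gene log2 fold change of treated vs. control on the CPM scale."""
    t, c = cpm(treated), cpm(control)
    return {
        gene: math.log2((t.get(gene, 0.0) + pseudocount) / (c[gene] + pseudocount))
        for gene in c
    }


def dose_response_fingerprint(control, treated_by_dose):
    """Map each dose to its per-gene log2 fold-change profile."""
    return {
        dose: log2_fold_change(counts, control)
        for dose, counts in sorted(treated_by_dose.items())
    }


# Hypothetical raw counts for a vehicle control and two doses (arbitrary units)
control_counts = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 800}
treated_counts = {
    0.1: {"GENE_A": 100, "GENE_B": 100, "GENE_C": 800},
    10.0: {"GENE_A": 400, "GENE_B": 100, "GENE_C": 500},
}

fingerprint = dose_response_fingerprint(control_counts, treated_counts)
for dose, profile in fingerprint.items():
    print(dose, {gene: round(lfc, 2) for gene, lfc in profile.items()})
```

Stacking these per-dose profiles across compounds gives a compound-by-gene-by-dose matrix of the kind that downstream clustering or predictive models can use.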

Download our real MERCURIUS™ DRUG-seq datasets to explore how high-throughput transcriptomics can help biomarker and disease mechanism discovery.

References

  1. FDA-NIH Biomarker Working Group. BEST (Biomarkers, EndpointS, and other Tools) Resource [Internet]. Silver Spring (MD): Food and Drug Administration (US); 2016-. Glossary. 2016 Jan 28 [Updated 2025 Jan 16].
  2. Bergen V, Kodella K, Srikrishnan S, Barrandon O, Anderson S, Rogers-Grazado M, Fowler C, Beyene H, Robichaud N, Fulton T, Lapchyk N. A large-scale human toxicogenomics resource for drug-induced liver injury prediction. Nature Communications. 2025 Nov 13;16(1):9860.
  3. Fielden MR, Eynon BP, Natsoulis G, Jarnagin K, Banas D, Kolaja KL. A gene expression signature that predicts the future onset of drug-induced renal tubular toxicity. Toxicologic pathology. 2005 Oct;33(6):675-83.
  4. Waarts MR, Stonestrom AJ, Park YC, Levine RL. Targeting mutations in cancer. The Journal of clinical investigation. 2022 Apr 15;132(8).
  5. Ayers M, Lunceford J, Nebozhyn M, Murphy E, Loboda A, Kaufman DR, Albright A, Cheng JD, Kang SP, Shankaran V, Piha-Paul SA. IFN-γ–related mRNA profile predicts clinical response to PD-1 blockade. The Journal of Clinical Investigation. 2017 Aug 1;127(8):2930-40.
  6. Cristescu R, Mogg R, Ayers M, Albright A, Murphy E, Yearley J, Sher X, Liu XQ, Lu H, Nebozhyn M, Zhang C. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade–based immunotherapy. Science. 2018 Oct 12;362(6411):eaar3593.
  7. McCrea LT, Batorsky RE, Bowen JJ, Yeh H, Thanos JM, Fu T, Perlis RH, Sheridan SD. Identifying brain-penetrant small-molecule modulators of human microglia using a cellular model of synaptic pruning. Neuropsychopharmacology. 2025 May 9:1-9.
  8. Loewa A, Feng JJ, Hedtrich S. Human disease models in drug development. Nature reviews bioengineering. 2023 Aug;1(8):545-59.
  9. Kraus VB. Biomarkers as drug development tools: discovery, validation, qualification and use. Nature Reviews Rheumatology. 2018 Jun;14(6):354-62.
 
