What is the Connectivity Map, and how can it accelerate drug discovery?

The Connectivity Map is a large-scale comprehensive catalog of cellular transcriptomic signatures that represent systematic genetic or pharmacologic perturbations of many human disease cell types (Lamb et al., 2006; Subramanian et al., 2017).

The aim of the Connectivity Map project from the Broad Institute and the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) consortium is to use these signatures to connect diseases with the genes that cause them and drugs that could treat them.

It is one of the most extensive functional databases of its kind.

But what actually is the Connectivity Map, and how can it accelerate the discovery of novel therapeutics?

Read on to find out.


The Connectivity Map is a functional reference database of connections between genes, drugs, and diseases

The Connectivity Map project was developed to increase our understanding of the function of disease-associated genes and the effects of small molecules on diverse cellular processes.

This is important because while we know the identity of many disease-associated genes, their function in cellular pathways and networks often remains unclear.

Similarly, there are now more large-scale small molecule libraries available than ever, but these require systematic methods to infer the mechanism of action, molecular targets, or off-target effects of a given compound (Mendez et al., 2019).

So, the Connectivity Map aims to address these issues by providing a functional reference resource of over one million transcriptional signatures from over 25,000 different perturbations.

If two perturbations produce similar transcriptional profiles, they are ‘connected’ and may share common targets or mechanisms of action. Conversely, if two perturbations produce opposite gene expression signatures, they might have antagonistic effects.
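To make the idea of 'connected' and opposing signatures concrete, here is a minimal sketch that compares two signatures with a plain Pearson correlation over their shared genes. This is a simplified stand-in rather than the scoring method used by the Connectivity Map itself, and the gene names, values, and the 0.5 threshold are all made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signature")
    return cov / sqrt(var_x * var_y)


def classify_connection(sig_a, sig_b, threshold=0.5):
    """Compare two signatures given as {gene: differential expression score}.

    Returns (correlation, label). The threshold is an arbitrary illustrative
    cut-off, not a value taken from the Connectivity Map.
    """
    shared = sorted(set(sig_a) & set(sig_b))
    r = pearson([sig_a[g] for g in shared], [sig_b[g] for g in shared])
    if r >= threshold:
        return r, "connected"
    if r <= -threshold:
        return r, "opposing"
    return r, "unrelated"


# Hypothetical signatures: positive = up-regulated, negative = down-regulated.
compound_a = {"G1": 2.0, "G2": -1.0, "G3": 0.5, "G4": -2.0}
compound_b = {"G1": 1.8, "G2": -0.7, "G3": 0.4, "G4": -2.2, "G5": 0.1}
knockdown_c = {"G1": -1.9, "G2": 1.1, "G3": -0.6, "G4": 2.1}

print(classify_connection(compound_a, compound_b))
print(classify_connection(compound_a, knockdown_c))
```

In this toy example, the two compounds would be flagged as connected, while the knockdown would be flagged as opposing compound A.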

The published collection now contains transcriptional profiles of different disease cell types treated with around 20,000 compounds and perturbations of over 5,000 genes with cDNA overexpression or short hairpin RNA-mediated knockdown.


The Connectivity Map informs drug discovery and development

The biomedical community has already used the insights gained from the Connectivity Map to develop novel hypotheses for treatments of cancer, neurological diseases, and infection, to functionally annotate disease-associated genetic variants, and to inform clinical trials (Subramanian et al., 2017; Musa et al., 2018; Rivera et al., 2022; Hernandez et al., 2023; Zhao et al., 2023).

This vast collection of functional connections has helped accelerate these therapeutic discoveries by shedding light on:

  • The biological pathways and functions of genes
  • The molecular targets and mechanisms of action of drugs
  • The potential side effects and interactions of drugs
  • The potential repurposing of drugs for new diseases
  • The potential of combinations of therapies for diseases

But how did the LINCS consortium generate such a treasure trove of perturbational and transcriptional information?


The ‘reduced representation’ L1000 transcriptomic profiling method

The most recently published Connectivity Map from 2017 includes over one million expression signatures, a 1,000-fold scale-up compared to the 2006 pilot study (Lamb et al., 2006; Subramanian et al., 2017).

And the work continues. A collaboration of six LINCS Data and Signature Generation Centers is actively producing even more perturbation data (Keenan et al., 2018).

But to produce gene expression profiles at this scale, the Connectivity Map requires a cost-effective, high-throughput technology that generates robust transcriptional profiles.

Bulk RNA-seq on this many samples was simply too expensive when the project started.

So, to address these cost and scalability issues, the LINCS consortium developed a reduced representation expression profiling method called L1000.

This technology measures the abundance of 978 “landmark” genes using bead-based fluorescent labeling of mRNA transcripts followed by flow-cytometry-based detection (Subramanian et al., 2017).

Inference algorithms then use these 978 landmark gene measurements to infer the expression of an additional 11,350 genes. The resulting profiles show a strong degree of similarity to RNA-seq profiles of 3,000 samples from GTEx, another large gene expression consortium, indicating the method's utility in generating transcriptional profiles at scale.
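The inference step can be pictured with a small sketch. It assumes, purely for illustration, that each non-measured gene is predicted as a weighted sum of landmark gene measurements plus an intercept. The gene names, weights, and intercepts below are invented, and the function is a hypothetical helper, not part of any L1000 software.

```python
def infer_expression(landmarks, model):
    """Predict non-measured genes from measured landmark genes.

    landmarks: {landmark_gene: measured expression}
    model:     {target_gene: (intercept, {landmark_gene: weight})}
               (hypothetical coefficients; in practice such weights would be
               learned from samples where all genes were measured)
    """
    inferred = {}
    for target, (intercept, weights) in model.items():
        inferred[target] = intercept + sum(
            weight * landmarks[gene] for gene, weight in weights.items()
        )
    return inferred


# Toy example with two landmark genes and two inferred genes.
measured = {"LM1": 2.0, "LM2": 4.0}
toy_model = {
    "TARGET_A": (1.0, {"LM1": 0.5, "LM2": 0.25}),
    "TARGET_B": (0.0, {"LM1": -1.0}),
}

print(infer_expression(measured, toy_model))
```

The point of the sketch is that inferred values are only as good as the model behind them, which is why inferred genes carry more uncertainty than directly measured ones.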


How is the Connectivity Map used to infer gene, compound, and disease connections?

The majority of compounds and genetic perturbations within the one million expression signatures had unknown molecular targets, biological pathways, and mechanisms of action.

Therefore, the researchers reasoned that a smaller subset of signatures called “Touchstone”, covering around 2,500 well-characterized, FDA-approved small molecule compounds, would act as a powerful reference to functionally annotate poorly characterized genes or small molecules with unknown functions.

By correlating implicated genes and pathways, this “Touchstone” dataset can link the cellular signatures of the additional 17,500 small molecules to known mechanisms of action.

Drugs are then paired with diseases using sophisticated pattern-matching methods with high resolution and specificity.

A similar process applies to users wishing to query lists of disease- or phenotype-associated differentially expressed genes.
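A query of this kind can be sketched as follows. The example scores each reference signature by how strongly it moves a user's up-regulated genes up and down-regulated genes down, then ranks the perturbations. This is a deliberately simple illustration and not the pattern-matching algorithm used by the LINCS tools. All names and values are hypothetical.

```python
def query_score(signature, up_genes, down_genes):
    """Mean score of the query's up genes minus mean score of its down genes.

    signature: {gene: differential expression score} for one perturbation.
    A positive result suggests the perturbation mimics the query state,
    a negative result suggests it reverses the query state.
    """
    up = [signature[g] for g in up_genes if g in signature]
    down = [signature[g] for g in down_genes if g in signature]
    if not up or not down:
        raise ValueError("query genes are missing from the signature")
    return sum(up) / len(up) - sum(down) / len(down)


def rank_perturbations(reference, up_genes, down_genes):
    """Return (name, score) pairs sorted from most reversing to most mimicking."""
    scores = {
        name: query_score(signature, up_genes, down_genes)
        for name, signature in reference.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical disease query: genes A and B are up, C and D are down.
disease_up = ["A", "B"]
disease_down = ["C", "D"]

# Hypothetical reference signatures.
reference = {
    "compound_x": {"A": -2.0, "B": -1.0, "C": 1.5, "D": 2.5},
    "compound_y": {"A": 2.0, "B": 1.0, "C": -1.5, "D": -2.5},
}

for name, score in rank_perturbations(reference, disease_up, disease_down):
    print(name, score)
```

In a drug discovery setting, the perturbations at the reversing end of the ranking would be the first candidates to follow up, since they push the disease-associated genes in the opposite direction.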

The LINCS project website provides all data alongside many analysis tools to find connections relevant to diverse research questions.


Ultra-high-throughput bulk RNA-seq for a Connectivity Map of the future?

While the L1000 technology has proved valuable in diverse biomedical settings, it directly measures only 978 genes. An additional 11,350 genes are inferred rather than measured, which limits knowledge discovery for half of the human protein-coding genes.

In contrast, the development of transcriptome-wide bulk RNA-seq methods based on early sample barcoding and multiplexing, such as MERCURIUS™ Bulk RNA-Barcoding and Sequencing (BRB-seq) and RNA-extraction free MERCURIUS™ DRUG-seq, now provides reliable expression data for over 20,000 genes in thousands of samples in the same sequencing run (Alpern et al., 2019).
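The principle behind early sample barcoding can be illustrated with a short sketch: each sample receives its own barcode before pooling, so reads from one sequencing run can later be assigned back to their sample of origin. The sketch assumes reads have already been reduced to (barcode, gene) pairs. The barcodes, sample names, and function are hypothetical and do not represent the actual BRB-seq or DRUG-seq analysis pipeline.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcode_to_sample):
    """Assign pooled reads to samples and count reads per gene.

    reads:             iterable of (barcode, gene) pairs from one pooled run
    barcode_to_sample: {barcode: sample name} from the plate layout
    Returns ({sample: {gene: count}}, number of reads with unknown barcodes).
    """
    counts = defaultdict(Counter)
    unassigned = 0
    for barcode, gene in reads:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned += 1  # barcode not in the layout, e.g. a sequencing error
            continue
        counts[sample][gene] += 1
    return {sample: dict(genes) for sample, genes in counts.items()}, unassigned


# Hypothetical plate layout and pooled reads.
layout = {"AAAC": "well_A1_DMSO", "GGTT": "well_A2_compound"}
pooled_reads = [
    ("AAAC", "GENE1"),
    ("GGTT", "GENE1"),
    ("GGTT", "GENE2"),
    ("AAAC", "GENE1"),
    ("TTTT", "GENE3"),
]

per_sample, unknown = demultiplex(pooled_reads, layout)
print(per_sample)
print("unassigned reads:", unknown)
```

Because samples are pooled right after barcoding, the remaining library preparation steps are performed once per pool rather than once per sample.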

This has driven down the cost of whole-transcriptome profiling while dramatically increasing throughput, allowing researchers to include more samples, cell types, compounds, and experimental conditions in their screening projects.

These approaches are suitable for bespoke large-scale ultra-high-throughput compound screenings similar or complementary to the Connectivity Map and may further accelerate the discovery and development of novel therapeutics.

Don't hesitate to get in touch with us to learn more about the Connectivity Map or how MERCURIUS™ BRB-seq and MERCURIUS™ DRUG-seq can help your next compound screen.



  • Alpern, D. et al. (2019) ‘BRB-seq: ultra-affordable high-throughput transcriptomics enabled by bulk RNA barcoding and sequencing’. Genome Biology, 20(1), pp.1-15.

  • Hernandez, S.J. et al. (2023) ‘An altered extracellular matrix–integrin interface contributes to Huntington’s disease-associated CNS dysfunction in glial and vascular cells’. Human Molecular Genetics, 32(9), pp.1483-1496.

  • Keenan, A.B. et al. (2018) ‘The library of integrated network-based cellular signatures NIH program: system-level cataloging of human cells response to perturbations’. Cell Systems, 6(1), pp.13-24.

  • Lamb, J. et al. (2006) ‘The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease’. Science, 313(5795), pp.1929-1935.

  • Mendez, D. et al. (2019) ‘ChEMBL: towards direct deposition of bioassay data’. Nucleic Acids Research, 47(D1), pp.D930-D940.

  • Musa, A. et al. (2018) ‘A review of connectivity map and computational approaches in pharmacogenomics’. Briefings in Bioinformatics, 19(3), pp.506-523.

  • Rivera, A.D. et al. (2022) ‘Drug connectivity mapping and functional analysis reveal therapeutic small molecules that differentially modulate myelination’. Biomedicine & Pharmacotherapy, 145, p.112436.

  • Subramanian, A. et al. (2017) ‘A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles’. Cell, 171(6), pp.1437-1452.e17.

  • Zhao, Y. et al. (2023) ‘Decoding Connectivity Map-based drug repurposing for oncotherapy’. Briefings in Bioinformatics, p.bbad142.