What is the JUMP-CP consortium?

The Joint Undertaking for Morphological Profiling-Cell Painting Gallery (JUMP-CP) is a large-scale, high-content imaging reference dataset, created with Cell Painting technology, that captures the morphological effects of tens of thousands of drug and genetic perturbations on cells.

For researchers in pharma, biotech, or academia looking to uncover the mechanisms of action, cytotoxicity, or possible on- and off-target effects of drugs in their development pipeline or potential genes of interest, JUMP-CP is an impressive publicly available resource that acts as a great starting point.

So, let’s dive in and explore what JUMP-CP is and how researchers can use it to get the most value for their large-scale perturbation studies.

 

What’s the goal of the JUMP-Cell Painting consortium?

Established in 2020 as a collaboration between researchers at the Broad Institute of MIT and Harvard, around ten pharmaceutical partners, and two non-profit partners, JUMP-CP aimed to create a new “data-driven approach to drug discovery based on cellular imaging, image analysis, and high-dimensional data analytics” and to “make cell images as computable as genomes and transcriptomes.”

Achieving this would help to overcome the major bottleneck in pharma pipelines of determining the mechanism of action of potential therapeutics, which remains difficult, expensive, and time-consuming. This functional knowledge is crucial to better hit prioritization that limits drug failures further down the pipeline, when unforeseen compound activity, off-target effects, or toxicity might stop drug development in its tracks (Sun et al., 2022).

 

Cell Painting lacked a gallery

The world of transcriptomics and genomics has benefited hugely from high-quality specialist reference datasets like The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) project, the Connectivity Map, and RefSeq, among many others. Scientists routinely use these to compare their self-generated omics data with the reference to inform hypotheses, validate novel findings, benchmark studies for reproducibility, or perform cross-dataset comparisons and meta-analyses to extrapolate findings to other areas. The vast amounts of omics repository data available are crucial for training the next generation of artificial intelligence-based data analytics methods (Ahmed et al., 2024).

Yet, while large image-based perturbation repositories do exist, they rarely use standardized assays like Cell Painting, limiting the potential discoveries made with the technology and likely hindering its use in drug development pipelines. This is especially problematic for researchers taking a multimodal approach combining Cell Painting with transcriptome-wide gene expression profiling, as there’s limited morphological phenotypic reference data available. Enter JUMP-CP.

 

JUMP-Cell Painting Gallery to the rescue

JUMP-CP is an astonishing achievement. The dataset includes Cell Painting image-based profiles for over 116,000 unique compounds, CRISPR-Cas9 knockouts of 7,975 genes, and over-expression of 12,602 genes in the human U2OS osteosarcoma cell line (Chandrasekaran et al., 2023).

It’s a bounty of billions of single-cell profiles with a size of approximately 700 TB, with an increasing number of computational tools being developed to analyze the data (Stossi et al., 2024; Weisbart et al., 2024).

 

Training AI with the JUMP-Cell Painting Gallery

JUMP-CP has already published most of the dataset. For instance, the primary benchmark dataset, CPJUMP1, provides robust training data for the novel artificial intelligence models needed to analyze high-content, high-throughput image-based morphological profiles of chemical and genetic perturbations (Chandrasekaran et al., 2024a). To do this, the study focused on gene and compound pairs whose relationships or mechanisms of action are established well enough to serve as an approximate ‘ground truth’.

This prior knowledge is necessary to develop, refine, and critically assess deep learning models that can make accurate functional predictions for compounds or genes where no ground truth is known. CPJUMP1 is a compendium of approximately three million images and morphological profiles for around 75 million single cells from the U2OS osteosarcoma and A549 lung carcinoma cell lines, captured at two time points for a curated set of 160 genes and 303 compounds.
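
To make the benchmarking idea concrete, here is a minimal sketch of how a query profile might be matched against reference profiles with known annotations. It assumes each perturbation is summarized as a fixed-length feature vector and uses cosine similarity as the comparison metric. The profile values, the perturbation names, and the `rank_matches` helper are all hypothetical, and this is an illustration rather than the pipeline used by the JUMP-CP authors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero profile as matching nothing
    return dot / (norm_a * norm_b)


def rank_matches(query, reference):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in reference.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical, made-up profiles (real profiles have thousands of features).
reference_profiles = {
    "gene_A_knockout": [1.0, 0.2, -0.5],
    "gene_B_knockout": [-0.8, 0.1, 0.9],
    "gene_C_overexpression": [0.1, 1.0, 0.0],
}
query_compound = [0.9, 0.3, -0.4]

ranking = rank_matches(query_compound, reference_profiles)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With ground-truth pairs like those in CPJUMP1, a ranking of this kind can be scored by checking how often the annotated partner of a compound appears near the top of the list.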

 

The JUMP-Cell Painting Gallery genetic perturbation screen

Another recent pre-print from JUMP-CP focused solely on establishing morphological profiles for genetic perturbations of around 75% of the protein-coding genome in the U2OS cell line at a single time point (Chandrasekaran et al., 2024b). The researchers used both CRISPR-Cas9 knockout and overexpression of genes and found morphological phenotypes for 70% of tested knockouts (5,546 genes) and 56% of overexpressed genes (7,031 genes), some revealing previously undetected relationships.

However, many overlapping overexpressed and knocked-out gene pairs didn’t produce inverse relationships as expected, likely due to a variety of technical and biological reasons that would be made clearer if bulk transcriptional information were available for each sample treatment.
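
One way to picture the expected inverse relationship: if knocking out and overexpressing the same gene pushed morphology in opposite directions, the two profiles would be anti-correlated. The sketch below is an illustration under assumed data, not the authors' analysis. It computes the Pearson correlation for each gene's pair of profiles and flags pairs at or below a threshold as inverse. The gene names, feature values, and the -0.5 threshold are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no directional signal
    return cov / math.sqrt(var_a * var_b)


def inverse_pairs(knockout, overexpression, threshold=-0.5):
    """Genes whose knockout and overexpression profiles are anti-correlated."""
    shared = sorted(set(knockout) & set(overexpression))
    return [g for g in shared if pearson(knockout[g], overexpression[g]) <= threshold]


# Hypothetical, made-up profiles for two genes.
knockout_profiles = {
    "gene_X": [1.0, -0.5, 0.3, 0.0],
    "gene_Y": [0.2, 0.8, -0.1, 0.4],
}
overexpression_profiles = {
    "gene_X": [-0.9, 0.6, -0.2, 0.1],  # roughly mirrors the knockout
    "gene_Y": [0.3, 0.7, 0.0, 0.5],    # resembles the knockout instead
}

flagged = inverse_pairs(knockout_profiles, overexpression_profiles)
print(flagged)
```

In this toy data only `gene_X` behaves as expected; `gene_Y` stands in for the many real pairs where the two perturbations did not produce opposite phenotypes.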

 

Conclusion

In conclusion, JUMP-CP represents a groundbreaking advance in high-content imaging and drug discovery. By providing an extensive, standardized dataset of morphological profiles, JUMP-CP addresses a critical gap in the availability of reference data for image-based perturbation studies. This resource not only enhances our understanding of the morphological effects of genetic and chemical perturbations but also serves as a valuable tool for training advanced AI models.

For researchers in the pharmaceutical, biotechnology, and academic sectors, JUMP-CP offers a robust starting point for exploring the mechanisms of action, cytotoxicity, and potential off-target effects of compounds and genes of interest. However, its true potential is unlocked when it is combined with complementary high-throughput technologies, yielding multimodal insights that no single method provides alone. This combination could accelerate the development of novel therapeutics and deliver deeper insights into the complex biological processes underlying disease.

Read our next blog article, where we discuss combining this impressive resource with ultra-high-throughput MERCURIUS™ DRUG-seq for large-scale screening projects.

 

References

  • Ahmed, Z., et al. 2024. Artificial intelligence for omics data analysis. BMC Methods, 1(1), p.4.
  • Chandrasekaran, S.N., et al. 2023. JUMP Cell Painting dataset: morphological impact of 136,000 chemical and genetic perturbations. bioRxiv, pp.2023-03.
  • Chandrasekaran, S.N., et al. 2024a. Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations. Nature Methods, 21(6), pp.1114-1121.
  • Chandrasekaran, S.N., et al. 2024b. Morphological map of under- and over-expression of genes in human cells. bioRxiv, pp.2024-12.
  • Stossi, F., et al. 2024. SPACe: an open-source, single-cell analysis of Cell Painting data. Nature Communications, 15(1), p.10170.
  • Sun, D., et al. 2022. Why 90% of clinical drug development fails and how to improve it? Acta Pharmaceutica Sinica B, 12(7), pp.3049-3062.
  • Weisbart, E., et al. 2024. Cell Painting Gallery: an open resource for image-based profiling. Nature Methods, 21(10), pp.1775-1777.