
Arc Virtual Cell Challenge: ST and SE transformers for context-generalized CRISPR perturbation predictions

AI Impact Summary

Arc Institute's Virtual Cell Challenge offers a benchmark for context generalization in biology: a dataset of ~300k scRNA-seq profiles and a concrete two-model baseline for predicting the transcriptomic response to CRISPR perturbations in unseen cell types. The State Transition (ST) model uses a Llama backbone, with encoders for a covariate-matched set of control cells and for the perturbation, plus a decoder; it is trained with a Maximum Mean Discrepancy (MMD) loss that aligns the predicted perturbed-cell distribution with the observed one. The State Embedding (SE) model is a BERT-like autoencoder that builds gene embeddings from ESM2 protein embeddings and represents each cell by 2048 genes. Together they form a testbed for evaluating cross-cell-type generalization and end-to-end perturbation prediction. This could speed up in silico screening workflows for biotech teams, provided the models are validated against biological variability and batch effects.
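To make the Maximum Mean Discrepancy objective mentioned above more concrete, here is a minimal NumPy sketch of a squared-MMD estimate between two sets of cell vectors. It is an illustration only, not the ST model's training code: the RBF kernel, the fixed bandwidth, the biased estimator, and the names `rbf_kernel` and `mmd2` are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between the rows of x (n, d) and y (m, d)."""
    # Squared Euclidean distance between every row of x and every row of y.
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for two sets of cell embeddings.
    set_a = rng.normal(0.0, 1.0, size=(200, 5))
    set_b = rng.normal(2.0, 1.0, size=(200, 5))
    print("MMD^2 between shifted sets:", mmd2(set_a, set_b))
```

The estimate is near zero when the two sets come from the same distribution and grows as they diverge, which is what makes it usable as a set-level training loss. A real setup would typically use a differentiable framework and may combine several kernel bandwidths.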

Affected Systems

State Transition Model (ST)

Date: not specified
Change type: capability
Severity: info
