OpenMed trains multi-species codon-optimization models for mRNA design; CodonRoBERTa-large-v2 is the top performer
AI Impact Summary
OpenMed built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. Among the models compared, CodonRoBERTa-large-v2 delivered the strongest codon-level modeling performance (perplexity 4.10, Codon Adaptation Index (CAI) 0.40). The team scaled training to 25 species and produced 4 production-ready models in 55 GPU-hours, yielding a species-conditioned system not available in other open-source efforts. The pipeline integrates ESMFold, ProteinMPNN, and a bespoke mRNA codon optimizer, enabling a fast concept-to-expression workflow for therapeutic proteins across species, with clear potential to shorten design-to-synthesis timelines for mRNA vaccines and recombinant proteins.
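For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of their textbook definitions: CAI as the geometric mean of per-codon relative adaptiveness weights, and perplexity as the exponential of the mean negative log-likelihood per token. It is not OpenMed's evaluation code. The usage table `TOY_USAGE`, the function names, and the example sequences are hypothetical and chosen only for illustration; the summary does not state which reference codon tables or tokenization were used.

```python
import math

# Hypothetical toy codon-usage counts (NOT real species data).
# Each amino acid maps its synonymous codons to observed counts.
TOY_USAGE = {
    "K": {"AAA": 30, "AAG": 70},
    "E": {"GAA": 40, "GAG": 60},
}


def relative_adaptiveness(usage):
    """Weight of a codon = its count / count of the most used synonymous codon."""
    weights = {}
    for codons in usage.values():
        top = max(codons.values())
        for codon, count in codons.items():
            weights[codon] = count / top
    return weights


def cai(sequence, weights):
    """Geometric mean of the weights over the codons of `sequence`.

    Codons without a weight are skipped. Weights are assumed to be > 0;
    a zero count in the usage table would need smoothing first.
    """
    usable = len(sequence) - len(sequence) % 3  # drop a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


weights = relative_adaptiveness(TOY_USAGE)
best = cai("AAGGAG", weights)    # only the most frequent codons
worse = cai("AAAGAA", weights)   # only the less frequent codons
uniform4 = perplexity([math.log(0.25)] * 5)  # model equally unsure among 4 options
```

Under these definitions, a sequence built only from the most frequent codon of each amino acid scores a CAI of 1.0, and a model that spreads probability evenly over four choices at every step has a perplexity of 4. Read that way, a perplexity of 4.10 roughly corresponds to the model being as uncertain as a uniform choice among about four candidates per token, assuming the reported figure is computed per token with natural-log likelihoods.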
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info