Hugging Face: Vision-Language Models: CLIP-style contrastive learning and PrefixLM via Hugging Face Transformers | SignalBreak