Hugging Face: Google PaliGemma vision-language models: SigLIP-So400m encoder with Gemma-2B decoder | SignalBreak