Hugging Face: Vision Language Models: new architectures, small models, and MoE decoders shaping multimodal deployment | SignalBreak