Hugging Face: SmolVLM2: On-device video understanding with 2.2B, 500M, and 256M models and MLX-ready APIs | SignalBreak