Hugging Face: SmolVLM introduces 256M and 500M vision-language models with ONNX/transformers support | SignalBreak