SmolLM: New 135M/360M/1.7B small LLMs released with SmolLM-Corpus
AI Impact Summary
SmolLM introduces a family of compact LLMs (135M, 360M, and 1.7B parameters) trained on the curated SmolLM-Corpus, built from Cosmopedia v2, FineWeb-Edu, and educational Python materials, with the goal of preserving performance at smaller footprints. The dataset construction and ablations used several large models (e.g., llama3-70B-Instruct, Mixtral-8x22B-Instruct-v0.1) to evaluate prompts, indicating a careful approach to data quality and generation style. The release enables on-device inference and privacy-preserving deployment at the edge, with potential cost reductions and new integration considerations for pipelines that currently rely on larger models such as the Phi series, Qwen2, or MobileLLM.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info