Falcon2-11B LLM and Falcon2-11B VLM released with 8k context and multilingual support
AI Impact Summary
Falcon2-11B introduces an 11B-parameter LLM and a companion 11B VLM with image understanding, built to improve usability and add multimodal capability while targeting cheaper inference than larger models. It was trained on over 5,000 gigatokens drawn from RefinedWeb and multilingual data spanning 11 languages, using a 60-layer transformer whose context window was extended to 8k tokens in the later training stages, with a 3D parallelism setup (TP=8, DP=128) supported by ZeRO and Flash-Attention 2. This combination enables multilingual, image-assisted chat and downstream tasks such as code generation, with performance comparable to larger models at a smaller size, making it attractive for open-source deployments and cost-conscious projects.
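For orientation, below is a minimal sketch of loading the LLM for multilingual text generation with Hugging Face transformers. The Hub identifier `tiiuae/falcon-11B`, the dtype, and the generation settings are assumptions for illustration, not details confirmed by this summary; check the official release for the exact model card.

```python
# Minimal sketch: running Falcon2-11B for multilingual generation with
# Hugging Face transformers. The model id "tiiuae/falcon-11B" is an
# assumption based on TII's usual Hub naming; verify before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-11B"  # assumed Hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps 11B weights manageable
    device_map="auto",           # place layers across available devices
)

# A non-English prompt, since the model targets 11 languages.
prompt = "Écris une fonction Python qui calcule la suite de Fibonacci."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```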
Affected Systems
- Date: Not specified
- Change type: Capability
- Severity: Info