Falcon-H1-Arabic launches hybrid Mamba-Transformer with 128K-256K context across 3B/7B/34B
AI Impact Summary
Falcon-H1-Arabic introduces a hybrid Mamba-Transformer architecture that runs State Space Model (SSM) and Transformer components in parallel, fusing their outputs within each block. The launch includes 3B, 7B, and 34B variants with extended context windows of 128K tokens for the 3B model and 256K for the larger two, enabling long-document processing, multi-turn conversations, and domain-specific workloads in Arabic that previously required input truncation. Post-training with supervised fine-tuning (SFT) and direct preference optimization (DPO) seeks to improve instruction following, coherence, and factual alignment across dialects and multilingual content.
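The parallel-hybrid idea can be pictured with a minimal sketch: both token-mixing paths receive the same normalized hidden states, and their outputs are fused before the residual connection. The PyTorch block below is an illustration only, not the Falcon-H1 implementation; the class name HybridParallelBlock, the gated depthwise convolution standing in for the Mamba-style SSM path, and the simple additive fusion are all assumptions made to keep the example self-contained.

```python
import torch
import torch.nn as nn


class HybridParallelBlock(nn.Module):
    """Schematic hybrid block: an attention path and an SSM-style path
    process the same normalized input in parallel, and their outputs are
    fused (here by addition) before the residual connection.
    Names and the fusion choice are illustrative, not Falcon-H1 internals."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        # Transformer path: standard multi-head self-attention.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # SSM path: a gated depthwise 1D convolution stands in for a
        # selective state-space (Mamba-style) mixer to keep the sketch runnable.
        self.ssm = nn.Sequential(
            nn.Conv1d(d_model, d_model, kernel_size=4, padding=3, groups=d_model),
            nn.SiLU(),
        )
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        # Conv1d expects (batch, channels, seq); trim causal padding back to seq len.
        ssm_out = self.ssm(h.transpose(1, 2))[..., : x.shape[1]].transpose(1, 2)
        # Fuse the two parallel paths, then apply the residual connection.
        return x + self.proj(attn_out + ssm_out)


if __name__ == "__main__":
    block = HybridParallelBlock(d_model=256)
    tokens = torch.randn(2, 128, 256)  # (batch, seq_len, d_model)
    print(block(tokens).shape)  # torch.Size([2, 128, 256])
```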
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info