DeepSeek-V4 enables 1M-token context for agent workflows with efficient long-context inference
AI Impact Summary
DeepSeek-V4 delivers a 1M-token context with two MoE checkpoints (DeepSeek-V4-Pro and DeepSeek-V4-Flash) and a dual attention scheme (CSA/HCA) that keeps per-token FLOPs and KV-cache usage low. For agent workloads, it explicitly preserves reasoning across tool calls and turns, addressing common failure modes in long-running sessions where context would otherwise be lost or overwritten. The release also includes a new tool-call schema and a sandbox (DSec) that accelerates RL rollouts, making it feasible to train and deploy agents that rely on long-horizon planning and frequent tool interactions, though benchmark results remain competitive rather than leading across all tasks.
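The summary ties the 1M-token claim to reduced KV-cache usage. As a rough illustration only, the sketch below estimates KV-cache memory at a 1M-token context for a full-attention baseline versus a scheme that keeps only a fraction of KV entries resident. All model dimensions (layer count, KV heads, head size) are hypothetical placeholders rather than DeepSeek-V4's published architecture, and the compression ratio is an assumption, not a measured CSA/HCA figure.

```python
# Back-of-the-envelope KV-cache sizing for a 1M-token context.
# All architecture numbers are hypothetical placeholders; they are NOT
# DeepSeek-V4's actual configuration, and the "reduced" ratio is an assumed
# value standing in for whatever savings the CSA/HCA scheme provides.

def kv_cache_bytes(num_tokens: int,
                   num_layers: int,
                   num_kv_heads: int,
                   head_dim: int,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # 2 tensors (K and V) per layer, one vector per KV head per token.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


if __name__ == "__main__":
    tokens = 1_000_000                        # 1M-token context from the summary
    layers, kv_heads, head_dim = 60, 8, 128   # hypothetical dimensions

    full = kv_cache_bytes(tokens, layers, kv_heads, head_dim)
    # Assume the efficient scheme keeps roughly 1/8 of the KV entries resident
    # (illustrative ratio only, not a published figure).
    reduced = full // 8

    gib = 1024 ** 3
    print(f"full-attention KV cache : {full / gib:6.1f} GiB")
    print(f"reduced KV cache (~1/8) : {reduced / gib:6.1f} GiB")
```

Even under these placeholder numbers, a naive full-attention cache at 1M tokens reaches hundreds of GiB per sequence, which is why the summary highlights KV-cache reduction as the change that makes long-context agent inference practical.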
Affected Systems
- Date: Not specified
- Change type: Capability
- Severity: Info