AprielGuard: 8B LLM Safety & Adversarial Guardrail
AI Impact Summary
AprielGuard is an 8B-parameter guardrail model designed to detect threats in modern LLM agent ecosystems. It accepts three input formats (standalone prompts, multi-turn conversations, and agentic workflows) and classifies both safety risks and adversarial attacks. The model is built on a scaled-down Apriel-1.5 Thinker Base and trained on a synthetic dataset generated with Mixtral-8x7B and NVIDIA NeMo Curator. The attacks it targets include prompt injection, jailbreaks, and memory manipulation.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info