InfoCapability

AprielGuard: 8B LLM Safety & Adversarial Guardrail

AI Impact Summary

AprielGuard is a novel 8B parameter model designed to proactively defend against emerging threats in modern LLM agent ecosystems. It operates across diverse input formats – standalone prompts, multi-turn conversations, and agentic workflows – and classifies both safety risks and adversarial attacks. The model’s architecture, built on a scaled-down Apriel-1.5 Thinker Base, combined with a synthetic training dataset generated using Mixtral-8x7B and NVIDIA NeMo Curator, provides a robust defense against a wide range of attacks, including prompt injection, jailbreaks, and memory manipulation.

Affected Systems

Apriel-1.5 Thinker BaseMixtral-8x7B

Date: Date not specified
Change type: capability
Severity: info

AprielGuard: 8B LLM Safety & Adversarial Guardrail

More from Hugging Face

Get alerts for Hugging Face