
Vectara launches HHEM leaderboard using Hugging Face leaderboard template for hallucination evaluation

AI Impact Summary

The article walks through building an open-source leaderboard for Vectara's Hughes Hallucination Evaluation Model (HHEM) on top of the Hugging Face leaderboard template, with a custom backend, evaluation model, and datasets for tracking hallucination metrics. It covers dynamic updates, model submissions, and deployment as a Hugging Face Space, and shows how the same setup can evaluate both open-source and commercial models (e.g., Llama 2, Mistral 7B, GPT-4, Gemini, Claude). It also points to the code locations and workflow steps a technical team would need to implement and maintain, which makes it a reusable blueprint for governance-focused model evaluation. The result is repeatable benchmarking of how prone each model is to hallucination, which supports procurement decisions and transparency about model performance.
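As a rough illustration of the kind of metric such a leaderboard tracks, the sketch below turns per-summary factual-consistency scores into a hallucination rate per model and ranks the models. It is a minimal sketch under stated assumptions, not the article's implementation: the 0.5 threshold, the score range, the function names, and the sample data are all hypothetical and are not taken from the article or from the leaderboard template.

```python
from dataclasses import dataclass

# Assumption: each score is a factual-consistency probability in [0, 1], and a
# summary counts as hallucinated when its score falls below THRESHOLD.
THRESHOLD = 0.5  # hypothetical cutoff, not taken from the article


@dataclass
class LeaderboardRow:
    model: str
    hallucination_rate: float  # percent of summaries flagged as hallucinated
    n_summaries: int


def hallucination_rate(scores, threshold=THRESHOLD):
    """Return the percentage of scores that fall below the threshold."""
    if not scores:
        raise ValueError("at least one score is required")
    flagged = sum(1 for s in scores if s < threshold)
    return 100.0 * flagged / len(scores)


def build_leaderboard(results):
    """Turn {model: [scores]} into rows sorted by lowest hallucination rate."""
    rows = [
        LeaderboardRow(model, hallucination_rate(scores), len(scores))
        for model, scores in results.items()
    ]
    # Ties are broken by model name so the ordering is deterministic.
    return sorted(rows, key=lambda r: (r.hallucination_rate, r.model))


if __name__ == "__main__":
    # Made-up scores for placeholder model names, for illustration only.
    demo = {
        "model-a": [0.9, 0.8, 0.4, 0.95],
        "model-b": [0.3, 0.6, 0.2, 0.7],
    }
    for row in build_leaderboard(demo):
        print(f"{row.model}: {row.hallucination_rate:.1f}% "
              f"over {row.n_summaries} summaries")
```

In a real deployment the scores would come from running the evaluation model over each submitted model's summaries, and the ranked rows would feed the table shown in the Space.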

Affected Systems

Hugging Face Open LLM Leaderboard

Date: not specified
Change type: capability
Severity: info

