Open Leaderboard for Hebrew LLMs launched — benchmarks via Hugging Face Inference Endpoints
AI Impact Summary
The Open Leaderboard for Hebrew LLMs introduces a public evaluation platform tailored to Hebrew, a morphologically rich and comparatively low-resource language. It benchmarks models on four Hebrew tasks — question answering (via the HeQ dataset), sentiment accuracy, the Winograd Schema Challenge, and translation — all evaluated with few-shot prompts. Model deployment is handled by Hugging Face Inference Endpoints, and evaluation is orchestrated by the lighteval library. Built on the Open LLM Leaderboard framework and hosted as a Hugging Face Space, the leaderboard enables a community-driven process for submitting and comparing Hebrew models, accelerating progress and exposing linguistic gaps. This setup lowers the barrier to rigorous Hebrew model evaluation and helps direct research and funding toward improving Hebrew NLP capabilities.
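For illustration, here is a minimal sketch of what few-shot evaluation against a deployed Inference Endpoint might look like. The endpoint URL, the in-context examples, and the toy labeled sample are all hypothetical placeholders, and the actual leaderboard orchestrates this through lighteval with Hebrew prompts rather than an ad-hoc script like this one.

```python
# Sketch: few-shot sentiment evaluation against a Hugging Face Inference
# Endpoint. The endpoint URL, examples, and labels below are hypothetical.
from huggingface_hub import InferenceClient

# Point the client at a deployed Inference Endpoint (placeholder URL).
client = InferenceClient(model="https://<your-endpoint>.endpoints.huggingface.cloud")

# A few in-context examples (shown in English for readability; the
# leaderboard's prompts and data are in Hebrew).
FEW_SHOT = [
    ("The food was wonderful.", "positive"),
    ("Terrible service, never again.", "negative"),
]

def build_prompt(text: str) -> str:
    """Assemble a few-shot prompt from the in-context examples."""
    lines = [f"Review: {t}\nSentiment: {label}" for t, label in FEW_SHOT]
    lines.append(f"Review: {text}\nSentiment:")
    return "\n\n".join(lines)

def classify(text: str) -> str:
    """Query the endpoint and take the first generated word as the label."""
    out = client.text_generation(build_prompt(text), max_new_tokens=3)
    return out.strip().split()[0].lower()

# Toy accuracy computation over a hypothetical labeled sample.
samples = [("An excellent experience.", "positive")]
correct = sum(classify(t) == gold for t, gold in samples)
print(f"accuracy: {correct / len(samples):.2f}")
```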
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info