Open LLM Leaderboard: CO₂ emissions insights across community vs official model fine-tunes
AI Impact Summary
Open LLM Leaderboard now reports CO₂ emissions per inference using a fixed hardware and software setup (8 GPUs per node, Transformers with Accelerate), enabling apples-to-apples comparisons across 2,742 models, including the Gemma/Gemma2, Llama, Mistral, Mixtral, Phi/Phi3, and Qwen families. The data confirms that larger base models incur higher emissions, but emissions do not scale strictly with leaderboard rank; notably, official fine-tunes often consume more energy than their base models, while community fine-tunes achieve similar scores with substantially lower CO₂ in several cases. This suggests that production teams can improve sustainability by favoring community-tuned or smaller models when their performance is acceptable, and by benchmarking energy per task on their own hardware, since CO₂ estimates are hardware-specific.
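The point that CO₂ estimates are hardware-specific can be illustrated with a minimal back-of-the-envelope sketch: measured power draw and runtime are converted to energy, then multiplied by a grid carbon-intensity factor. The function name, the power figures, and the 0.4 kg/kWh intensity below are illustrative assumptions, not leaderboard values.

```python
# Illustrative sketch: estimating kg CO2 per inference from measured energy.
# Assumption: average per-GPU power draw is obtained externally (e.g. from
# nvidia-smi or a power meter); the grid intensity value is a placeholder.

CARBON_INTENSITY_KG_PER_KWH = 0.4  # assumed grid average; varies by region


def co2_per_inference(avg_power_watts: float,
                      seconds_per_inference: float,
                      n_gpus: int = 8) -> float:
    """Return the estimated kg of CO2 emitted by a single inference."""
    # watts * seconds = joules; 3,600,000 J per kWh
    energy_kwh = avg_power_watts * n_gpus * seconds_per_inference / 3_600_000
    return energy_kwh * CARBON_INTENSITY_KG_PER_KWH


# Example: 8 GPUs drawing ~300 W each during a 2-second generation
est = co2_per_inference(300.0, 2.0, n_gpus=8)
print(f"{est:.6f} kg CO2 per inference")
```

Because both the power draw and the grid intensity vary by deployment, the same model can have very different per-inference footprints on different hardware and in different regions, which is why on-site benchmarking matters.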
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info