Google Cloud Compute Engine C4 Emerald Rapids CPUs outperform N2 for embedding and generation workloads
AI Impact Summary
The benchmark compares Google Cloud Compute Engine N2 (Ice Lake) and C4 (Emerald Rapids with AMX) CPU instances, using optimum-benchmark with optimum-intel to measure text embedding and text generation workloads. C4 delivers 10x-24x higher throughput for embedding and 2.3x-3.6x higher throughput for generation; with C4's hourly cost about 1.3x that of N2, this yields a 7x-19x total-cost-of-ownership advantage for embedding and a 1.7x-2.9x advantage for generation across the tested ranges. This suggests CPU-only deployment of lightweight agentic AI stacks is viable at scale, though real-world results depend on model choice (e.g., WhereIsAI/UAE-Large-V1, meta-llama/Llama-3.2-3) and configuration.
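The TCO figures above follow from dividing the measured throughput speedup by the relative hourly cost. A minimal sketch of that arithmetic, using the numbers from the summary (the `tco_advantage` helper name is ours, not part of optimum-benchmark):

```python
def tco_advantage(speedup: float, cost_ratio: float) -> float:
    """Throughput-per-dollar advantage: speedup divided by relative hourly cost."""
    return speedup / cost_ratio

# C4 hourly price relative to N2, per the summary
C4_COST_RATIO = 1.3

# Embedding: 10x-24x throughput -> roughly 7x-19x better TCO
embed_low = tco_advantage(10, C4_COST_RATIO)
embed_high = tco_advantage(24, C4_COST_RATIO)

# Generation: 2.3x-3.6x throughput -> roughly 1.7x-2.9x better TCO
gen_low = tco_advantage(2.3, C4_COST_RATIO)
gen_high = tco_advantage(3.6, C4_COST_RATIO)

print(f"embedding TCO advantage: {embed_low:.1f}x-{embed_high:.1f}x")
print(f"generation TCO advantage: {gen_low:.1f}x-{gen_high:.1f}x")
```

Running this reproduces the quoted ranges (about 7.7x-18.5x for embedding and 1.8x-2.8x for generation), confirming the summary's rounded 7x-19x and 1.7x-2.9x figures are internally consistent.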
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info