Transformers.js v3 adds WebGPU support, per-module dtypes, and 120 architectures
AI Impact Summary
Transformers.js v3 enables in-browser GPU acceleration by integrating WebGPU via ONNX Runtime Web, allowing pipelines such as feature-extraction (mixedbread-ai/mxbai-embed-xsmall-v1), automatic-speech-recognition (onnx-community/whisper-tiny.en), and image-classification (onnx-community/mobilenetv4_conv_small.e2400_r224_in1k) to run on the client GPU with device: 'webgpu'. It also broadens the quantization surface through the dtype option (fp32, fp16, q8/int8/uint8, q4, bnb4, q4f16) and per-module dtypes for encoder/decoder components, enabling finer-grained trade-offs between model size and accuracy. The release expands support to 120 architectures, including Florence-2, Gemma, and Depth Pro, dramatically widening browser-based ML use cases. Note that global WebGPU availability was around 70% as of October 2024, so some users may still need feature flags or fallbacks.
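A minimal sketch of what these options look like in practice, assuming the v3 @huggingface/transformers package and its pipeline() API. The tasks, model IDs, device value, and dtype values are taken from the summary above; the per-module dtype keys (encoder_model, decoder_model_merged) and the pooling options are illustrative assumptions, not confirmed by this summary.

```js
import { pipeline } from '@huggingface/transformers';

// Run feature extraction on the client GPU via WebGPU.
const extractor = await pipeline(
  'feature-extraction',
  'mixedbread-ai/mxbai-embed-xsmall-v1',
  { device: 'webgpu' },
);
const embeddings = await extractor('Transformers.js v3 runs in the browser.', {
  pooling: 'mean',   // assumption: mean-pool token embeddings
  normalize: true,   // assumption: L2-normalize the result
});

// Per-module dtypes: trade size for accuracy per encoder/decoder component.
// Module names below are assumed for illustration.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'onnx-community/whisper-tiny.en',
  {
    device: 'webgpu',
    dtype: {
      encoder_model: 'fp16',        // higher precision for the encoder
      decoder_model_merged: 'q4',   // 4-bit quantization for the decoder
    },
  },
);
```

Any of the listed dtype values (e.g. q8, bnb4, q4f16) could be substituted where the model's ONNX exports support them; passing a single string instead of an object applies one dtype to all modules.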
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info