Enable finetuning of sparse embedding models with Sentence Transformers (SPLADE) using naver/splade-v3
AI Impact Summary
This change describes a workflow for fine-tuning sparse embedding models (SPLADE-style) with Sentence Transformers, adapting encoders to domain data for retrieval and hybrid search. It demonstrates using SparseEncoder with the naver/splade-v3 model, sourcing pretrained encoders from the Hugging Face Hub, and applying the standard training components (datasets, losses, evaluators, trainer). The approach preserves interpretability: decoding the top contributing tokens shows how neural sparse expansion influences matching. Domain teams can gain improved retrieval accuracy and clearer token-level explanations, but will need training data, compute, and governance to manage vocabulary expansion and model drift.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info