Version: 1.1.0 (Latest)

Rerank Models

Watsonx.ai Rerank Models provide document reranking capabilities that improve the relevance of search results by scoring and reordering documents based on their semantic relevance to a query. This is particularly useful in Retrieval-Augmented Generation (RAG) pipelines to ensure the most relevant documents are prioritized.

Overview

Document reranking is the second stage of a typical search or RAG system. Initial retrieval (e.g., vector similarity search) efficiently produces a broad set of candidate documents; reranking then applies a more sophisticated model to score each candidate's relevance to the query precisely.
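The two-stage idea can be sketched in plain Java. This is a conceptual illustration only: the word-overlap scorer below is a toy stand-in for a rerank model, and the RerankSketch class, the Scored record, and the method names are hypothetical and not part of the watsonx.ai integration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Conceptual sketch: a word-overlap scorer stands in for the rerank model.
class RerankSketch {

    // Hypothetical result type: position in the candidate list plus a relevance score.
    record Scored(int index, double score) {}

    // Toy relevance score: fraction of query words that also appear in the document.
    static double overlapScore(String query, String document) {
        Set<String> queryWords = words(query);
        Set<String> documentWords = words(document);
        if (queryWords.isEmpty()) {
            return 0.0;
        }
        long hits = queryWords.stream().filter(documentWords::contains).count();
        return (double) hits / queryWords.size();
    }

    // Lower-cases the text and splits it into a set of non-empty words.
    private static Set<String> words(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toSet());
    }

    // Scores every candidate, then orders the results by score, highest first.
    static List<Scored> rerank(String query, List<String> candidates) {
        List<Scored> scored = new ArrayList<>();
        for (int i = 0; i < candidates.size(); i++) {
            scored.add(new Scored(i, overlapScore(query, candidates.get(i))));
        }
        scored.sort(Comparator.comparingDouble(Scored::score).reversed());
        return scored;
    }
}
```

A real rerank model replaces overlapScore with a learned relevance score; the surrounding shape (score each candidate, then sort descending) stays the same.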

The watsonx.ai rerank integration provides:

  • Standalone Reranking: Direct API access for scoring and reordering documents
  • RAG Integration: DocumentPostProcessor implementation for seamless integration with Spring AI's RAG pipeline

Supported Models

The following rerank models are supported:

  • cross-encoder/ms-marco-minilm-l-12-v2: A cross-encoder model trained on the MS MARCO passage ranking dataset (default)
  • Other reranking models available in your Watsonx.ai instance

Configuration Properties

Rerank Properties

Properties that configure the watsonx.ai rerank model use the prefix spring.ai.watsonx.ai.rerank.

Property | Description | Default
spring.ai.watsonx.ai.rerank.enabled | Enable or disable the rerank auto-configuration | true
spring.ai.watsonx.ai.rerank.rerank-endpoint | The rerank API endpoint | /ml/v1/text/rerank
spring.ai.watsonx.ai.rerank.version | API version date in YYYY-MM-DD format | 2024-05-31
spring.ai.watsonx.ai.rerank.options.model | ID of the model to use for reranking | cross-encoder/ms-marco-minilm-l-12-v2
spring.ai.watsonx.ai.rerank.options.top-n | Limit results to top N documents | -
spring.ai.watsonx.ai.rerank.options.truncate-input-tokens | Maximum tokens before truncation | 512
spring.ai.watsonx.ai.rerank.options.return-inputs | Include original text in response | false
spring.ai.watsonx.ai.rerank.options.return-query | Include query in response | false
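As a sketch, the properties above could be set in application.properties as follows. Every value except top-n repeats the documented default; top-n=5 is an arbitrary illustrative choice.

```properties
# Rerank auto-configuration (documented defaults, shown explicitly)
spring.ai.watsonx.ai.rerank.enabled=true
spring.ai.watsonx.ai.rerank.rerank-endpoint=/ml/v1/text/rerank
spring.ai.watsonx.ai.rerank.version=2024-05-31

# Default request options
spring.ai.watsonx.ai.rerank.options.model=cross-encoder/ms-marco-minilm-l-12-v2
# Illustrative value; there is no default, so all documents are returned when unset
spring.ai.watsonx.ai.rerank.options.top-n=5
spring.ai.watsonx.ai.rerank.options.truncate-input-tokens=512
spring.ai.watsonx.ai.rerank.options.return-inputs=false
spring.ai.watsonx.ai.rerank.options.return-query=false
```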

Runtime Options

The WatsonxAiRerankOptions class provides options for configuring rerank requests.

At startup, the values under spring.ai.watsonx.ai.rerank.options become the defaults. You can override them for an individual request by passing a WatsonxAiRerankOptions instance at runtime:

WatsonxAiRerankOptions options = WatsonxAiRerankOptions.builder()
        .model("cross-encoder/ms-marco-minilm-l-12-v2")
        .topN(5)
        .truncateInputTokens(512)
        .returnInputs(true)
        .build();

List<RerankResult> results = rerankModel.rerank(query, documents, options);

WatsonxAiRerankOptions

The WatsonxAiRerankOptions class supports the following options:

Option | Default | Description
model | cross-encoder/ms-marco-minilm-l-12-v2 | The rerank model to use
topN | null (return all) | Limit results to the top N highest-scoring documents
truncateInputTokens | 512 | Maximum number of tokens per input before truncation
returnInputs | false | Whether to include the original input text in the response
returnQuery | false | Whether to include the query in the response
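The topN semantics described in the table (null returns everything, otherwise only the highest-scoring N results are kept) can be illustrated with a small local model. This is a sketch of the documented behavior, not the library's implementation; TopNSketch, Result, and applyTopN are hypothetical names.

```java
import java.util.List;

// Local illustration of the documented topN behavior.
class TopNSketch {

    // Hypothetical stand-in for a rerank result: input position plus score.
    record Result(int index, double score) {}

    // Assumes sortedResults is already ordered by score, highest first.
    // A null topN means "no limit", matching the documented default.
    static List<Result> applyTopN(List<Result> sortedResults, Integer topN) {
        if (topN == null) {
            return sortedResults;
        }
        // limit() keeps at most topN elements, so a topN larger than
        // the list size simply returns every result.
        return sortedResults.stream().limit(topN).toList();
    }
}
```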

Standalone Reranking

The WatsonxAiRerankModel provides direct access to the reranking API:

@RestController
public class RerankController {

    private final WatsonxAiRerankModel rerankModel;

    public RerankController(WatsonxAiRerankModel rerankModel) {
        this.rerankModel = rerankModel;
    }

    @PostMapping("/ai/rerank")
    public List<RerankResult> rerank(
            @RequestParam String query,
            @RequestBody List<String> documents) {

        return rerankModel.rerank(query, documents);
    }
}

Example Response

The rerank API returns results sorted by relevance score (descending):

List<String> documents = List.of(
        "Machine learning is a subset of artificial intelligence.",
        "Cooking Italian pasta requires fresh ingredients.",
        "Deep learning uses neural networks with many layers."
);

List<RerankResult> results = rerankModel.rerank(
        "What is machine learning?",
        documents
);

// Results are sorted by score (highest first)
for (RerankResult result : results) {
    System.out.println("Index: " + result.index() +
            ", Score: " + result.score());
}
// Example output (scores are illustrative):
// Index: 0, Score: 0.95 (Machine learning document)
// Index: 2, Score: 0.82 (Deep learning document)
// Index: 1, Score: 0.12 (Cooking document)
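In the example above, each result's index refers to the document's position in the input list. When the response does not carry the original text, that index is how you get back to the document. The sketch below shows the lookup in plain Java; IndexLookupSketch and its Result record are hypothetical stand-ins that mirror only the index() and score() accessors used above.

```java
import java.util.ArrayList;
import java.util.List;

// Maps rerank results back to the original input texts.
class IndexLookupSketch {

    // Hypothetical stand-in for RerankResult, limited to index() and score().
    record Result(int index, double score) {}

    // Returns the original texts in reranked order by following each
    // result's index into the input list.
    static List<String> textsInRankOrder(List<String> documents, List<Result> results) {
        List<String> ordered = new ArrayList<>(results.size());
        for (Result result : results) {
            ordered.add(documents.get(result.index()));
        }
        return ordered;
    }
}
```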

RAG Integration with DocumentPostProcessor

The WatsonxAiDocumentReranker implements Spring AI's DocumentPostProcessor interface, enabling seamless integration with RAG pipelines.

Basic RAG Integration

@Configuration
public class RagConfiguration {

    @Bean
    public RetrievalAugmentationAdvisor retrievalAugmentationAdvisor(
            VectorStore vectorStore,
            WatsonxAiDocumentReranker documentReranker) {

        return RetrievalAugmentationAdvisor.builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                        .vectorStore(vectorStore)
                        .similarityThreshold(0.5)
                        .topK(20) // Retrieve more candidates for reranking
                        .build())
                .documentPostProcessors(documentReranker) // Rerank the results
                .build();
    }
}

Using with ChatClient

@Service
public class RagService {

    private final ChatClient chatClient;

    public RagService(
            ChatModel chatModel,
            VectorStore vectorStore,
            WatsonxAiDocumentReranker documentReranker) {

        RetrievalAugmentationAdvisor advisor = RetrievalAugmentationAdvisor.builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                        .vectorStore(vectorStore)
                        .topK(20)
                        .build())
                .documentPostProcessors(documentReranker)
                .build();

        this.chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(advisor)
                .build();
    }

    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}

Custom Reranker Configuration

You can create a reranker with custom options:

@Bean
public WatsonxAiDocumentReranker customDocumentReranker(
        WatsonxAiRerankModel rerankModel) {

    WatsonxAiRerankOptions options = WatsonxAiRerankOptions.builder()
            .topN(5) // Return only top 5 documents
            .truncateInputTokens(1024)
            .build();

    return new WatsonxAiDocumentReranker(rerankModel, options);
}

Accessing Rerank Scores

When documents are reranked, the rerank score is stored in the document's metadata:

// After reranking, documents have the score in metadata
List<Document> rerankedDocs = documentReranker.process(query, documents);

for (Document doc : rerankedDocs) {
    Double rerankScore = (Double) doc.getMetadata()
            .get(WatsonxAiDocumentReranker.RERANK_SCORE_METADATA_KEY);

    System.out.println("Document: " + doc.getId() +
            ", Rerank Score: " + rerankScore);
}
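One common use of the stored score is to drop weakly relevant documents before they reach the prompt. The sketch below shows the idea on plain metadata maps so that it runs without Spring AI on the classpath. The SCORE_KEY value is a hypothetical placeholder (real code would use WatsonxAiDocumentReranker.RERANK_SCORE_METADATA_KEY), and the threshold is an assumption you would tune for your model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Filters document metadata by a minimum rerank score.
class ScoreFilterSketch {

    // Hypothetical key for this sketch only; the actual key is exposed as
    // WatsonxAiDocumentReranker.RERANK_SCORE_METADATA_KEY.
    static final String SCORE_KEY = "rerank_score";

    // Keeps only the metadata maps whose score is numeric and >= minScore.
    // Entries with a missing or non-numeric score are dropped.
    static List<Map<String, Object>> keepAtOrAbove(
            List<Map<String, Object>> metadataList, double minScore) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> metadata : metadataList) {
            Object raw = metadata.get(SCORE_KEY);
            // Checking against Number avoids a ClassCastException if the
            // score is stored as a Float rather than a Double.
            if (raw instanceof Number number && number.doubleValue() >= minScore) {
                kept.add(metadata);
            }
        }
        return kept;
    }
}
```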

For RAG integration, also add the Spring AI RAG dependency:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-rag</artifactId>
</dependency>