Assignment: Post-Retrieval Processing

Assignment Metadata

| Field | Description |
| --- | --- |
| Assignment Name | Re-ranking with Cross-Encoder and Maximal Marginal Relevance |
| Course | RAG and Optimization |
| Project Name | post-retrieval-rag |
| Estimated Time | 90 minutes |
| Framework | Python 3.10+, LangChain, Sentence-Transformers, Cross-Encoder models |


Learning Objectives

By completing this assignment, you will be able to:

  • Implement Cross-Encoder re-ranking to improve retrieval precision

  • Apply Maximal Marginal Relevance (MMR) to ensure result diversity

  • Compare Bi-Encoder and Cross-Encoder architectures for re-ranking

  • Configure the funnel strategy: retrieve many, re-rank few

  • Evaluate the trade-offs between relevance and diversity in retrieval


Problem Description

Your RAG system retrieves the top-K documents using vector similarity. However, users report two issues:

  1. Precision problems: Sometimes highly relevant documents are ranked lower than less relevant ones

  2. Redundancy problems: Retrieved documents often contain duplicate or overlapping information

Your task is to implement Cross-Encoder re-ranking and MMR as post-retrieval processing steps.


Technical Requirements

Environment Setup

  • Python 3.10 or higher

  • Required packages:

    • langchain >= 0.1.0

    • sentence-transformers >= 2.2.0

    • chromadb >= 0.4.0

    • numpy >= 1.24.0

Models

  • Bi-Encoder: sentence-transformers/all-MiniLM-L6-v2

  • Cross-Encoder: cross-encoder/ms-marco-MiniLM-L-6-v2


Tasks

Task 1: Implement Cross-Encoder Re-ranking (35 points)

  1. Build a re-ranking pipeline that:

    • Takes top-50 results from Bi-Encoder retrieval

    • Scores each (query, document) pair using Cross-Encoder

    • Returns re-ranked top-K documents

  2. Implement the funnel strategy:

    • Stage 1: Retrieve top-50 with Bi-Encoder (fast)

    • Stage 2: Re-rank to top-5 with Cross-Encoder (accurate)

  3. Measure performance:

    • Re-ranking latency per query

    • Memory usage comparison (Bi-Encoder vs Cross-Encoder)
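Stage 2 of the funnel can be sketched as a small helper that takes any scoring callable, such as `CrossEncoder.predict` from sentence-transformers. This is a minimal sketch, not the required implementation; the helper name and argument layout are illustrative:

```python
# Sketch of the funnel's Stage 2: score every (query, document) pair and
# keep only the final_k highest-scoring candidates.
def rerank(query, candidates, scorer, final_k=5):
    """Re-rank Bi-Encoder candidates with a scoring callable.

    `scorer` takes a list of (query, document) pairs and returns one
    relevance score per pair, e.g. CrossEncoder(...).predict.
    """
    scores = scorer([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda p: float(p[1]), reverse=True)
    return [doc for doc, _ in ranked[:final_k]]

# Usage with the assignment's model (downloads weights on first run):
# from sentence_transformers import CrossEncoder
# scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict
# top5 = rerank(query, top50_candidates, scorer, final_k=5)
```

Keeping the scorer as a parameter makes it easy to time Stage 2 in isolation for the latency measurements above.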

Task 2: Implement MMR (35 points)

  1. Implement the MMR algorithm:

    MMR = argmax_{d ∈ R \ S} [ λ · sim(d, query) − (1 − λ) · max_{d' ∈ S} sim(d, d') ]

    where R is the retrieved candidate set and S is the set of already-selected documents

    • Start with the most relevant document

    • Iteratively select documents balancing relevance and diversity

    • Use configurable λ parameter (default: 0.5)

  2. Test with different λ values:

    • λ = 1.0 (pure relevance, no diversity)

    • λ = 0.5 (balanced)

    • λ = 0.3 (prioritize diversity)

  3. Create demonstration examples showing:

    • Without MMR: redundant information in top-5

    • With MMR: diverse information coverage
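The selection loop described above can be sketched over precomputed similarity matrices. This is a minimal NumPy-only sketch under the assumption that query and document similarities are already computed; the function name and defaults are illustrative (λ defaults to the assignment's 0.5):

```python
import numpy as np

def mmr_select(query_sims, doc_sims, k=5, lam=0.5):
    # query_sims[i] = sim(doc_i, query); doc_sims[i][j] = sim(doc_i, doc_j).
    query_sims = np.asarray(query_sims, dtype=float)
    doc_sims = np.asarray(doc_sims, dtype=float)
    remaining = list(range(len(query_sims)))
    # Step 1: start with the single most relevant document.
    first = int(np.argmax(query_sims))
    selected = [first]
    remaining.remove(first)
    # Step 2: iteratively trade off relevance against redundancy
    # with respect to the documents selected so far.
    while remaining and len(selected) < k:
        scores = [
            lam * query_sims[i]
            - (1.0 - lam) * max(doc_sims[i, j] for j in selected)
            for i in remaining
        ]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected  # document indices in selection order
```

With λ = 1.0 this degenerates to plain relevance ranking, which is a useful sanity check for the experiments above.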

Task 3: Combined Pipeline and Evaluation (30 points)

  1. Build a combined post-retrieval pipeline:

    • Option A: Cross-Encoder first, then MMR

    • Option B: MMR first, then Cross-Encoder

    • Compare which order produces better results

  2. Create a test set with 10 queries including:

    • Queries prone to redundant results (biographical, product features)

    • Queries requiring precise matching (technical, factual)

  3. Evaluation metrics:

| Query ID | Baseline nDCG@5 | Cross-Encoder nDCG@5 | MMR Diversity Score | Combined nDCG@5 |
| --- | --- | --- | --- | --- |
| Q1 | | | | |
| Q2 | | | | |
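Since both post-retrieval steps map a ranked candidate list to a (usually shorter) ranked list, Option A and Option B can be expressed as the same pipeline run with the step order swapped. A minimal sketch, with hypothetical step names:

```python
# Each step is a callable from a ranked document list to a ranked document
# list, so the two orderings differ only in the order of the steps.
def run_pipeline(candidates, steps):
    for step in steps:
        candidates = step(candidates)
    return candidates

# Option A: Cross-Encoder first, then MMR:
#   run_pipeline(hits, [cross_encoder_step, mmr_step])
# Option B: MMR first, then Cross-Encoder:
#   run_pipeline(hits, [mmr_step, cross_encoder_step])
```

Structuring the comparison this way lets the same evaluation code score both orderings on the 10-query test set.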


Submission Requirements

Required Deliverables

  • Source code (Jupyter notebook or Python scripts)

  • README.md with setup and usage instructions

  • Performance benchmarks (latency, memory)

  • Evaluation results table

  • Example outputs showing before/after re-ranking and MMR

Submission Checklist

  • Cross-Encoder re-ranking improves precision

  • MMR produces diverse result sets

  • Combined pipeline is properly implemented

  • Performance trade-offs are documented

  • Code includes clear comments and documentation


Evaluation Criteria

| Criteria | Points |
| --- | --- |
| Cross-Encoder implementation | 20 |
| Funnel strategy implementation | 15 |
| MMR algorithm correctness | 20 |
| λ parameter experimentation | 10 |
| Combined pipeline design | 15 |
| Evaluation quality | 10 |
| Code quality and documentation | 10 |
| **Total** | **100** |


Hints

  • Use sentence_transformers.CrossEncoder for a straightforward re-ranking implementation

  • For MMR, cache document-document similarities to avoid recomputation

  • Consider batch processing for Cross-Encoder to improve throughput

  • Test your MMR implementation with a small set first (5-10 documents)

  • The diversity score can be computed as the average pairwise distance between selected documents
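For the last hint, one concrete choice (an assumption, not prescribed by the assignment) is cosine distance, 1 − cosine similarity, averaged over all distinct document pairs:

```python
import numpy as np

def diversity_score(embeddings):
    # Average pairwise cosine distance (1 - cosine similarity) between the
    # embeddings of the selected documents; higher means more diverse.
    emb = np.asarray(embeddings, dtype=float)
    if len(emb) < 2:
        return 0.0  # no pairs to compare with fewer than two documents
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize rows
    sims = emb @ emb.T                                      # cosine similarity matrix
    upper = np.triu_indices(len(emb), k=1)                  # count each pair once
    return float(np.mean(1.0 - sims[upper]))
```

Identical documents score 0 and mutually orthogonal embeddings score 1, which gives a quick sanity check for the MMR experiments.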