# Assignment: Post-Retrieval Processing
## Assignment Metadata

| Field | Description |
|---|---|
| Assignment Name | Re-ranking with Cross-Encoder and Maximal Marginal Relevance |
| Course | RAG and Optimization |
| Project Name | |
| Estimated Time | 90 minutes |
| Framework | Python 3.10+, LangChain, Sentence-Transformers, Cross-Encoder models |
## Learning Objectives

By completing this assignment, you will be able to:

- Implement Cross-Encoder re-ranking to improve retrieval precision
- Apply Maximal Marginal Relevance (MMR) to ensure result diversity
- Compare Bi-Encoder and Cross-Encoder architectures for re-ranking
- Configure the funnel strategy: retrieve many, re-rank few
- Evaluate the trade-offs between relevance and diversity in retrieval
## Problem Description

Your RAG system retrieves the top-K documents using vector similarity. However, users report two issues:

1. Precision problems: highly relevant documents are sometimes ranked below less relevant ones
2. Redundancy problems: retrieved documents often contain duplicate or overlapping information

Your task is to implement Cross-Encoder re-ranking and MMR as post-retrieval processing steps.
## Technical Requirements

### Environment Setup

- Python 3.10 or higher
- Required packages:

```text
langchain>=0.1.0
sentence-transformers>=2.2.0
chromadb>=0.4.0
numpy>=1.24.0
```

### Models

- Bi-Encoder: `sentence-transformers/all-MiniLM-L6-v2`
- Cross-Encoder: `cross-encoder/ms-marco-MiniLM-L-6-v2`
## Tasks

### Task 1: Implement Cross-Encoder Re-ranking (35 points)

Build a re-ranking pipeline that:

- Takes the top-50 results from Bi-Encoder retrieval
- Scores each (query, document) pair with the Cross-Encoder
- Returns the re-ranked top-K documents

Implement the funnel strategy:

- Stage 1: Retrieve top-50 with the Bi-Encoder (fast)
- Stage 2: Re-rank to top-5 with the Cross-Encoder (accurate)

Measure performance:

- Re-ranking latency per query
- Memory usage comparison (Bi-Encoder vs. Cross-Encoder)
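The two-stage funnel can be sketched as follows. The toy scorers below are stand-ins for the real models, so the control flow is clear without downloading model weights; in practice, stage 2 would call `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict([(query, doc), ...])` from sentence-transformers.

```python
# Sketch of the two-stage funnel. The toy scorers stand in for the real
# models (a Bi-Encoder's cosine similarity and a Cross-Encoder's pair score).

def bi_encoder_scores(query, docs):
    # Stand-in for cosine similarity over precomputed embeddings:
    # here, the fraction of query words that appear in each document.
    q_words = set(query.lower().split())
    return [len(q_words & set(d.lower().split())) / len(q_words) for d in docs]

def cross_encoder_scores(query, docs):
    # Stand-in for CrossEncoder.predict([(query, doc), ...]): the real model
    # scores each (query, document) pair jointly with full attention.
    base = bi_encoder_scores(query, docs)
    return [s / (1.0 + 0.01 * len(d.split())) for s, d in zip(base, docs)]

def funnel_retrieve(query, corpus, stage1_k=50, stage2_k=5):
    # Stage 1: fast approximate ranking over the whole corpus (Bi-Encoder).
    scored = sorted(zip(corpus, bi_encoder_scores(query, corpus)),
                    key=lambda p: p[1], reverse=True)
    candidates = [doc for doc, _ in scored[:stage1_k]]
    # Stage 2: expensive, accurate re-ranking of the shortlist only.
    reranked = sorted(zip(candidates, cross_encoder_scores(query, candidates)),
                      key=lambda p: p[1], reverse=True)
    return [doc for doc, _ in reranked[:stage2_k]]
```

The key design point is that the Cross-Encoder only ever sees `stage1_k` candidates, so its per-query cost is bounded regardless of corpus size.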
### Task 2: Implement MMR (35 points)

Implement the MMR algorithm. At each step, select from the retrieved candidates R the document that maximizes the marginal-relevance score over the already-selected set S:

`MMR = argmax_{d ∈ R \ S} [ λ · sim(d, query) − (1 − λ) · max_{d' ∈ S} sim(d, d') ]`

- Start with the most relevant document
- Iteratively select documents, balancing relevance against diversity
- Use a configurable λ parameter (default: 0.5)
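The selection loop can be sketched with NumPy, assuming `query_vec` and the rows of `doc_vecs` are unit-normalized embeddings (so dot products are cosine similarities):

```python
import numpy as np

def mmr(query_vec, doc_vecs, lam=0.5, k=5):
    # query_vec: (d,) unit vector; doc_vecs: (n, d) unit-normalized rows.
    # Returns indices into doc_vecs, in selection order.
    relevance = doc_vecs @ query_vec        # sim(doc, query) for every doc
    doc_sims = doc_vecs @ doc_vecs.T        # cached doc-doc similarities
    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < min(k, len(doc_vecs)):
        remaining = [i for i in range(len(doc_vecs)) if i not in selected]
        # MMR score: relevance minus a penalty for being similar to any
        # already-selected document, traded off by lambda.
        scores = [lam * relevance[i]
                  - (1.0 - lam) * max(doc_sims[i][j] for j in selected)
                  for i in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

Note that the document-document similarity matrix is computed once up front, matching the caching hint below in this assignment.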
Test with different λ values:

- λ = 1.0 (pure relevance, no diversity)
- λ = 0.5 (balanced)
- λ = 0.3 (prioritize diversity)

Create demonstration examples showing:

- Without MMR: redundant information in the top-5
- With MMR: diverse information coverage
### Task 3: Combined Pipeline and Evaluation (30 points)

Build a combined post-retrieval pipeline:

- Option A: Cross-Encoder first, then MMR
- Option B: MMR first, then Cross-Encoder
- Compare which order produces better results
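The two orderings can be expressed as function composition. A minimal sketch, where `rerank` and `diversify` are any callables mapping `(query, docs)` to a ranked list of docs; the function names and intermediate cut-offs are illustrative, not from a specific library:

```python
def pipeline_a(query, docs, rerank, diversify, mid_k=10, final_k=5):
    # Option A: re-rank first (narrow by relevance), then diversify with MMR.
    return diversify(query, rerank(query, docs)[:mid_k])[:final_k]

def pipeline_b(query, docs, rerank, diversify, mid_k=10, final_k=5):
    # Option B: diversify with MMR first, then re-rank the diverse subset.
    return rerank(query, diversify(query, docs)[:mid_k])[:final_k]
```

The intermediate cut-off `mid_k` matters: Option A can drop diverse documents before MMR ever sees them, while Option B can pass weakly relevant documents to the expensive Cross-Encoder, which is exactly the trade-off your comparison should surface.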
Create a test set of 10 queries including:

- Queries prone to redundant results (biographical, product features)
- Queries requiring precise matching (technical, factual)
Evaluation metrics:

| Query ID | Baseline nDCG@5 | Cross-Encoder nDCG@5 | MMR Diversity Score | Combined nDCG@5 |
|---|---|---|---|---|
| Q1 | | | | |
| Q2 | | | | |
| … | | | | |
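For the nDCG@5 columns, a minimal sketch of the metric from graded relevance labels (simplified: the ideal DCG here is taken over the returned list's own labels rather than over all judged documents for the query):

```python
import math

def ndcg_at_k(relevances, k=5):
    # relevances: graded relevance label of each returned document, in rank
    # order. DCG discounts each gain by log2(rank + 1).
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```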
## Submission Requirements

### Required Deliverables

- Source code (Jupyter notebook or Python scripts)
- README.md with setup and usage instructions
- Performance benchmarks (latency, memory)
- Evaluation results table
- Example outputs showing before/after re-ranking and MMR
### Submission Checklist

- [ ] Cross-Encoder re-ranking improves precision
- [ ] MMR produces diverse result sets
- [ ] Combined pipeline is properly implemented
- [ ] Performance trade-offs are documented
- [ ] Code includes clear comments and documentation
## Evaluation Criteria

| Criteria | Points |
|---|---|
| Cross-Encoder implementation | 20 |
| Funnel strategy implementation | 15 |
| MMR algorithm correctness | 20 |
| λ parameter experimentation | 10 |
| Combined pipeline design | 15 |
| Evaluation quality | 10 |
| Code quality and documentation | 10 |
| **Total** | **100** |
## Hints

- Use `sentence_transformers.CrossEncoder` for a straightforward re-ranking implementation
- For MMR, cache document-document similarities to avoid recomputation
- Consider batch processing for the Cross-Encoder to improve throughput
- Test your MMR implementation on a small set first (5-10 documents)
- The diversity score can be computed as the average pairwise distance between selected documents
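The last hint can be sketched directly, assuming unit-normalized embedding rows so that `1 - dot product` is cosine distance:

```python
import itertools
import numpy as np

def diversity_score(doc_vecs):
    # Average pairwise cosine distance (1 - cosine similarity) between the
    # selected documents; assumes rows are unit-normalized embeddings.
    # 0.0 = all documents identical; higher = more diverse result set.
    pairs = list(itertools.combinations(range(len(doc_vecs)), 2))
    if not pairs:
        return 0.0
    return float(np.mean([1.0 - float(doc_vecs[i] @ doc_vecs[j])
                          for i, j in pairs]))
```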