Exam Theory: RAG and Optimization
This exam assesses advanced topics in Retrieval-Augmented Generation (RAG) and its optimization techniques, drawn from five lectures: Advanced Indexing, Hybrid Search, Query Transformation, Post-Retrieval Processing, and GraphRAG Implementations.
| No. | Training Unit | Lecture | Training content | Question | Level | Mark | Answer | Answer Option A | Answer Option B | Answer Option C | Answer Option D | Explanation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What is a major disadvantage of fixed-size chunking when applied to large document collections? | Easy | 1 | A | It causes a loss of semantics by breaking ideas arbitrarily. | It is too computationally expensive. | It prevents vector search from indexing numbers. | It requires advanced linguistic models to parse. | Mechanical chunking breaks the flow of the text, so the LLM cannot understand the context when an idea is split arbitrarily. |
| 2 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | Why does brute-force Flat Indexing become a serious problem as a system scales? | Easy | 1 | B | It consumes too much disk space. | It causes high latency when sequentially scanning millions of vectors. | It is incompatible with neural network architectures. | It only supports English text. | Sequentially scanning millions of vectors in a Flat Index is too slow to meet real-time requirements. |
| 3 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What is the core idea driving Semantic Chunking? | Medium | 1 | C | To chunk text strictly by paragraph breaks. | To split texts after exactly 1000 characters. | To detect shifts to a new topic and break precisely at the boundary between two topics. | To summarize the text before splitting it. | Semantic Chunking detects when the content shifts to a new topic (the vector direction changes abruptly) and performs the break there. |
| 4 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What metric is typically calculated between consecutive sentences during Semantic Chunking? | Medium | 1 | A | Cosine similarity | Word count ratio | Token frequency | Character limits | In Semantic Chunking, similarity (for example, cosine similarity) is calculated between the current sentence and the next one. |
| 5 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | In Semantic Chunking, when does the algorithm decide to split the text? | Medium | 1 | D | When similarity is above 90%. | After a fixed number of punctuation marks. | When the sentence length exceeds the threshold. | When similarity drops significantly below a threshold. | If similarity drops significantly below the threshold, the topic has changed, so the chunk is broken there (see the Semantic Chunking sketch after this table). |
| 6 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What is a notable advantage of Semantic Chunking over Recursive Chunking? | Medium | 1 | B | It runs extremely fast. | It preserves ideas fully and perfectly follows the flow of the text. | It does not consume any computational resources. | It is specifically designed for codebases. | Semantic Chunking preserves ideas fully, follows the flow of the text closely, and increases search accuracy. |
| 7 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What is a major disadvantage of Semantic Chunking? | Easy | 1 | C | It cuts through important ideas frequently. | It returns very noisy contexts. | It consumes computational resources because a model must be run to compare each sentence. | It only works for legal or contract documents. | Because it must run an embedding model to compare the similarity of every pair of consecutive sentences, it consumes computational resources. |
| 8 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What does HNSW stand for in the context of vector databases? | Easy | 1 | A | Hierarchical Navigable Small World | High Neural State Weights | Heuristic Node Searching Window | Hierarchical Numeric Sequence Word | HNSW stands for Hierarchical Navigable Small World, an algorithm that effectively balances retrieval speed and accuracy. |
| 9 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What kind of data structure does HNSW organize data into? | Medium | 1 | C | A flat SQL table | A chronological file system | A multi-layered graph structure | A raw byte stream | HNSW organizes data as a multi-layered graph structure that uses both short and long shortcut links. |
| 10 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | In HNSW, what is the role of Layer 0? | Medium | 1 | D | It contains the shortest summary of the dataset. | It stores the sparse shortcut links. | It is empty and serves as a placeholder. | It contains all data points and the most detailed links between them. | Layer 0 contains all data points and the most detailed links; it holds the most complete information for finding the exact target. |
| 11 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | What does the parameter M specify in HNSW? | Hard | 1 | A | The maximum number of links a node can create with neighbor nodes. | The memory limit in megabytes. | The number of documents returned. | The margin of error allowed. | M specifies the maximum number of links a node can create with neighboring nodes; the larger M is, the denser the network. |
| 12 | Unit 1: RAG and Optimization | Lec 1 | Advanced Indexing | How should the query-time search breadth (ef_search) be set in HNSW when latency is the priority? | Hard | 1 | B | It should be set to 0. | It should be kept at a low level (e.g., 50-100) to optimize latency. | It should be set to the maximum allowed bounds. | It should equal the total number of documents. | Keeping ef_search at a low level (e.g., 50-100) keeps query latency low (see the HNSW sketch after this table). |
| 13 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What is an inherent weakness of standard Vector Search? | Easy | 1 | C | It lacks speed when processing basic synonyms. | It struggles with multilingual queries. | It shows weaknesses on queries that require absolute accuracy in wording. | It ignores document meaning entirely. | Vector Search reveals weaknesses on queries requiring exact wording (e.g., proper names, error codes). |
| 14 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What exactly constitutes a Hybrid Search mechanism? | Easy | 1 | A | Combining the power of semantic vector search with traditional keyword search. | Merging structured and unstructured relational databases. | Running two identical LLMs simultaneously. | Compiling queries in both Python and Java. | Hybrid Search combines semantic search (vectors) with traditional keyword search (BM25). |
| 15 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | Which keyword-frequency-based statistical algorithm is the standard for Hybrid Search? | Easy | 1 | D | BERT | HNSW | HyDE | BM25 | BM25 is the gold-standard traditional keyword retrieval algorithm used in Hybrid Search. |
| 16 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | How does BM25 solve the keyword-spamming problem found in TF-IDF? | Medium | 1 | B | By manually blacklisting frequent spammers. | By applying a saturation mechanism in which the score asymptotes after several keyword occurrences. | By analyzing the semantic meaning of repetitive words. | By deleting any document that repeats a word. | BM25 applies a saturation mechanism so that the 101st occurrence of a keyword adds hardly any more score than the 10th (see the BM25 sketch after this table). |
| 17 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What does Inverse Document Frequency (IDF) do in the BM25 formula? | Medium | 1 | A | It penalizes common words and heavily rewards rare words. | It ranks shorter documents higher than longer ones. | It limits the number of query words sent to the server. | It inverts the vectors created by the model. | IDF penalizes common words heavily while giving rare words much more importance and score weight. |
| 18 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | Why is length normalization an important feature of BM25? | Medium | 1 | C | It forces all documents to be exactly 1000 characters. | It compresses long queries to save bandwidth. | A single keyword in a short paragraph is rated higher than the same keyword diluted in a long novel. | It converts all characters to lowercase. | BM25 scales the score by document length so that long documents do not unfairly dominate concise ones. |
| 19 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | In a typical Hybrid Search pipeline, how are the two algorithms executed? | Medium | 1 | D | Vector search completes first, then BM25 is run on the results. | BM25 runs entirely locally before the vector search runs remotely. | Only one is executed, depending on a query classifier. | They are executed in parallel simultaneously. | The system sends the query to both search engines at the same time (parallel execution). |
| 20 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | Why can't we simply add the BM25 score and the Vector Search score together? | Hard | 1 | B | Vector search scores are negative integers. | The scoring scales are fundamentally different (vector search uses [0, 1] cosine similarity; BM25 produces arbitrary positive numbers). | They are processed on different neural network architectures. | BM25 produces alphabetical grading ranges. | The scoring scales of the two algorithms are completely different and cannot be combined directly. |
| 21 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What algorithm solves the score-compatibility issue in Hybrid Search? | Medium | 1 | C | GraphRAG Convolution | Maximal Marginal Relevance | Reciprocal Rank Fusion (RRF) | TF-IDF Smoothing | Reciprocal Rank Fusion (RRF) merges the two ranked lists effectively. |
| 22 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | On what theoretical basis does Reciprocal Rank Fusion (RRF) operate? | Hard | 1 | A | Instead of scores, it assumes that a document appearing at a high rank in both lists is almost certainly important. | It averages the raw text chunks of both documents. | It only accounts for the longest document. | It uses an LLM to assign arbitrary ranks. | RRF cares about rank rather than score; strong rank consensus across different algorithms signals an important document (see the RRF sketch after this table). |
| 23 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What is the purpose of the smoothing constant \(k\) in the RRF formula? | Hard | 1 | D | It identifies the number of total documents in the database. | It sets the maximum allowed token count. | It determines the strictness of exact keyword matching. | It helps reduce the score disparity between very high ranks, ensuring fairness. | The constant \(k\) (usually 60) reduces the large score disparity between adjacent high ranks (e.g., Top 1 vs. Top 2), giving a smoother gradient of rank scores. |
| 24 | Unit 1: RAG and Optimization | Lec 2 | Hybrid Search | What does Hybrid Search primarily sacrifice to gain balanced context and keyword accuracy? | Easy | 1 | B | Security and privacy | System resources, as it is complex to deploy and consumes resources running two parallel streams. | API documentation clarity | Multilingual support | Hybrid Search is more complex to deploy and consumes more resources because it runs two retrieval streams in parallel. |
| 25 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | Why do raw user questions often yield poor Vector Search results? | Easy | 1 | C | LLMs cannot read unformatted text. | Vector databases reject single words. | Questions are short and interrogative, lacking context compared to long descriptive documents. | Search algorithms intentionally delay short queries. | Vector Search faces semantic asymmetry: questions are short and interrogative while documents are long and descriptive. |
| 26 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | What is the core idea of Query Transformation? | Easy | 1 | A | Using an LLM to rewrite, expand, or break down the user's question into better versions before searching. | Encrypting user queries before transmission. | Replacing semantic searches with strict SQL SELECT queries. | Running the user's prompt through a grammar checker. | An LLM intelligently edits, expands, or rewrites poor raw queries before they are sent to the retrieval stage. |
| 27 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | What does HyDE stand for in Query Transformation? | Medium | 1 | B | Heavy Yield Database Execution | Hypothetical Document Embeddings | Hybrid Y-axis Dense Encapsulation | Hex-layered Data Encryption | HyDE stands for Hypothetical Document Embeddings. |
| 28 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | What happens during the "Generate" phase of the HyDE strategy? | Medium | 1 | D | It generates Python scripts. | It generates a dense vector representing the question. | It generates an index mapping inside the SQL table. | The system asks the LLM to write a hypothetical answer paragraph for the user's question. | The LLM drafts a hypothetical ("fake") answer to the question so that it matches the vocabulary expected in real documents. |
| 29 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | Does the hypothetical "fake answer" drafted in HyDE need to be factually correct? | Hard | 1 | B | Yes, exact factual accuracy guarantees precise matches. | No, but the writing style and technical vocabulary should resemble the actual documents. | Yes, the model refuses to output hallucinated responses. | No, it just generates a sequence of random numbers. | The information in the paragraph may be factually incorrect, but its style and technical vocabulary mimic real documents, which enables better semantic matching. |
| 30 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | Why is the vector generated from the "fake answer" in HyDE more useful than the user's question vector? | Medium | 1 | A | The fake-answer vector is semantically closer to the real document vector than the short interrogative question vector. | It consumes 0 RAM. | It maps perfectly to sparse BM25 arrays. | The user's query vector is permanently deleted. | The drafted answer contains sentence structures and terminology similar to real documents, closing the asymmetric semantic gap (see the HyDE sketch after this table). |
| 31 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | When is the Query Decomposition strategy particularly useful? | Medium | 1 | C | When querying single words. | When parsing simple FAQ menus. | When a question requires comparing or aggregating information from multiple independent, scattered sources. | When reading codebases in completely unknown programming languages. | It handles complex multi-intent questions that compare or gather data from multiple sources, where no single text snippet contains the whole answer. |
| 32 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | What happens during the first phase (Breakdown) of Query Decomposition? | Medium | 1 | A | The LLM analyzes the original question and splits it into a sequence of separate, independent sub-questions. | The system shreds the database documents into chunks. | The LLM provides the final answer immediately without searching. | The database is partitioned across multiple distinct servers. | The system identifies multi-intent questions and logically breaks them into single-intent, targeted sub-questions. |
| 33 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | How does Query Decomposition run searches for multiple sub-questions? | Medium | 1 | B | It merges all sub-questions back into one query. | It performs standard document searches individually for each separate sub-question. | It relies exclusively on cached external queries. | It skips queries containing conjunctions. | It executes a distinct, targeted retrieval query for every identified independent sub-question. |
| 34 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | Which phase of Query Decomposition requires the LLM to process the text found by all separate sub-searches? | Easy | 1 | C | Breakdown | Encapsulation | Synthesis | Verification | In Synthesis, the text segments found in all previous steps are aggregated and fed into the LLM to form a complete final answer (see the Query Decomposition sketch after this table). |
| 35 | Unit 1: RAG and Optimization | Lec 3 | Query Transformation | In summary, what role does Query Transformation play? | Easy | 1 | D | An internet firewall proxy. | A database administrator deleting old records. | A compiler translating queries to binary. | An intelligent editor that reorients questions so the system correctly understands the true intent. | It performs intelligent preprocessing (drafting or splitting) so that terse or poorly phrased user queries run effectively against the index. |
| 36 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | Why is the Top-K list returned directly from standard retrievers often suboptimal for an LLM? | Medium | 1 | A | Standard embedding models trade deep semantic accuracy for retrieval speed and may return contextually incorrect, "noisy" keyword matches. | The returned list is usually empty. | The standard Top-K size is too large for modern hardware. | The returned documents are always translated to a random language. | Embedding models prioritize retrieval speed over deep relationship comprehension, often returning documents with matching keywords but the wrong contextual intent. |
| 37 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What is the main goal of Re-ranking in a RAG pipeline? | Easy | 1 | C | To randomly shuffle the document list. | To format the output HTML for the frontend. | To act as a final filter that processes a small pool of candidates and picks the very best ones. | To permanently alter the dataset ordering. | Re-ranking takes a small pool (e.g., 50 candidates) and spends extra computation reading them carefully to pick the top 5 highest-quality documents. |
| 38 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What architectural method do standard embedding models use during the Retrieval step? | Medium | 1 | B | Graph-Encoder | Bi-Encoder | Cross-Encoder | Recursive-Encoder | Retrieval embedding models process questions and documents separately via a Bi-Encoder. |
| 39 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What are the main pros and cons of the Bi-Encoder architecture? | Hard | 1 | A | Fast speed (via pre-computation), but it loses detailed, nuanced interaction information between question and document words. | Extreme accuracy, but it consumes too much API quota. | It perfectly handles complex negations, but fails at simple keywords. | It guarantees data privacy, but prevents external web searches. | Because the vectors are calculated independently ahead of time, it runs fast but misses deeper interactions between the texts (such as negations and their subjects). |
| 40 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | How does a Cross-Encoder fundamentally differ from a Bi-Encoder? | Hard | 1 | D | It translates everything into Spanish. | It maps vectors onto a graph database exclusively. | It bypasses the attention mechanism entirely. | The question and document are concatenated into a single text sequence and processed simultaneously via a full Self-Attention mechanism. | Instead of producing separate outputs, a Cross-Encoder reads both strings together and can capture complex logic, negation, and interactions between all words simultaneously. |
| 41 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | If Cross-Encoders are so accurate, why don't we use them to search the entire database? | Medium | 1 | C | They cannot run on GPUs. | They only output integers. | They are very slow and resource-intensive to run across millions of documents. | They are blocked by vector database protocols. | Running millions of documents through full Self-Attention is far too computationally slow. |
| 42 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What describes the Funnel Strategy in Post-Retrieval processing? | Medium | 1 | B | Running the Bi-Encoder and Cross-Encoder on entirely separate clusters. | Using a Bi-Encoder to quickly retrieve a Top 50, then using a Cross-Encoder to slowly re-score those 50 down to a Top 5. | Splitting documents into smaller funnels based on character limits. | Re-ranking the vector database before queries arrive. | The funnel strategy takes speed from the Bi-Encoder (finding the Top 50) and precision from the Cross-Encoder (filtering down to the Top 5); see the funnel re-ranking sketch after this table. |
| 43 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | In scenarios involving negation in a biological context (e.g., "What does a python NOT eat?"), why does a Cross-Encoder succeed where a Bi-Encoder fails? | Hard | 1 | A | The Cross-Encoder recognizes the negation structure and the biological context because it reads the query and document together. | The Cross-Encoder has a specialized biology database pre-installed. | The Bi-Encoder deletes the word "NOT". | The Cross-Encoder ignores keywords entirely. | Bi-Encoders mistakenly link the keywords "python" and "eat", while Cross-Encoders correctly recognize the negation and map it to the biological meaning. |
| 44 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What does MMR stand for in the context of Post-Retrieval processing? | Medium | 1 | D | Minimum Marginal Rating | Multi-Model Retrieval | Memory Mapping Resolution | Maximal Marginal Relevance | MMR stands for Maximal Marginal Relevance, an algorithm used to diversify query results. |
| 45 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | What twofold problem does MMR aim to solve when selecting the final documents? | Medium | 1 | B | Size vs. compression | Relevance to the query vs. diversity, to prevent identical redundant documents. | API latency vs. local storage | Token allowance vs. security constraints | When pure similarity returns five nearly identical paragraphs, MMR resolves the redundancy by ensuring the selected documents are relevant but distinctly diverse. |
| 46 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | In the MMR algorithm, what occurs after picking the most similar document (Step 1)? | Hard | 1 | C | The system clears the cache. | The system returns immediately. | It finds the next document that is similar to the query but least similar to the previously selected documents. | It picks the document that is completely irrelevant to the query. | Step 2 balances relevance and diversity by choosing the next document that answers the query but differs strongly from the document already selected. |
| 47 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | In the MMR optimization formula, what does lowering lambda (\(\lambda\)) do? | Hard | 1 | A | It prioritizes diversity by increasing the penalty for selecting text similar to already-selected documents. | It causes the system to crash. | It forces exact keyword matching. | It elevates relevance entirely over diversity. | Decreasing lambda gives more weight to the diversity-penalty term of the MMR formula, forcing more varied information (see the MMR sketch after this table). |
| 48 | Unit 1: RAG and Optimization | Lec 4 | Post-Retrieval | If a user asks a broad question ("Features of the VF8 car") and wants comprehensive overall coverage, which re-ranker is optimal? | Medium | 1 | C | Flat Indexing | Recursive Chunking | Maximal Marginal Relevance (MMR) | Simple Bi-Encoder similarity | MMR guarantees diverse, non-redundant documents, giving the LLM text covering many different vehicle features rather than repeated text about the engine. |
| 49 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | What does GraphRAG combine to create a comprehensive knowledge representation system? | Easy | 1 | B | Cloud storage and edge devices | Structured graph databases with vector-based retrieval | Dense and sparse chunking limits | Hybrid APIs and NoSQL mappings | GraphRAG merges structured graph databases (such as Neo4j) with vector-based retrieval. |
| 50 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | Which popular graph database is used for storing GraphRAG entities in the implementation example? | Easy | 1 | A | Neo4j | PostgreSQL | ElasticSearch | MongoDB | Neo4j is used to construct and store the nodes and their relationships. |
| 51 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | What is the purpose of Pydantic models in the implementation pipeline? | Medium | 1 | D | To render the Neo4j visualization frontend. | To manage API timeout failures. | To download PDF files correctly. | To enforce validation schemas for the structured entity/relationship output from the LLM. | Pydantic classes define strict validation schemas so that the LLM's entity and relationship output can be parsed into structured objects reliably. |
| 52 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | According to the implementation's extraction rules, what constitutes a "commitment"? | Medium | 1 | C | Simple definitions and jargon. | Any sentence ending in a period. | A clear promise, obligation, or prohibition found in the text. | A numeric calculation executed by the CPU. | The LLM is instructed to identify clear promises, obligations, or prohibitions as Commitments. |
| 53 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | How are measurable numeric limits inside obligations handled during extraction? | Hard | 1 | D | They are discarded mathematically. | They are summed together. | They are sent to a calculator API. | They are explicitly extracted as Constraint parameters. | If a commitment contains numeric limits, the agent extracts them as linked Constraints. |
| 54 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | What does the structured-output mechanism (binding the LLM to the Pydantic schema) do in the implementation? | Medium | 1 | A | It forces the LLM to reply with JSON that adheres precisely to the Pydantic schema class. | It translates the output into Neo4j graph visualizations natively. | It prevents the model from reading files. | It outputs Python code running in a sandbox. | It guarantees that the unstructured text processed by the ChatGPT API is accurately deserialized back into structured Pydantic objects (see the GraphRAG ingestion sketch after this table). |
| 55 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | In the designed graph schema, what do PolicyClause nodes represent? | Easy | 1 | C | The user identities processing the data. | The hardware metrics. | The overarching policy topics/units from the chunked texts. | The exact numeric values from commitments. | PolicyClause nodes store the chunked policy texts/topics and serve as central nodes linking the other entities. |
| 56 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | In Cypher (Neo4j), which operation ensures duplicate nodes are not created during ingestion? | Medium | 1 | B | INSERT IGNORE | MERGE | UPSERT | ADD DISTINCT | The MERGE clause matches an existing node if one already exists instead of creating a duplicate. |
| 57 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | How are Stakeholder nodes linked to the rest of the graph? | Hard | 1 | A | Via a relationship linking them to the relevant PolicyClause nodes. | Via a standalone … | Via … | They are completely unlinked. | Stakeholder nodes represent the affected parties and are mapped to the relevant PolicyClause nodes through an explicit relationship. |
| 58 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | What represents a distinct advantage of GraphRAG over standard vector similarity search? | Medium | 1 | B | It consumes zero system memory. | Relationships explicitly define how entities connect, solving queries that need context-aware traversal. | It requires no chunking. | It automatically resolves grammatical mistakes. | Graph traversal exposes exactly how discrete entities are connected, answering intricate logical queries that vector distances alone cannot resolve. |
| 59 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | Which LangChain module converts natural language into Cypher queries for the LLM? | Medium | 1 | A | GraphCypherQAChain | VectorDBQAChain | PydanticOutputParser | DocumentConverter | GraphCypherQAChain uses the LLM to turn a natural-language question into a Cypher query, runs it against the graph, and answers from the result (see the GraphCypherQAChain sketch after this table). |
| 60 | Unit 1: RAG and Optimization | Lec 5 | GraphRAG | What is noted as a core limitation or consideration when implementing GraphRAG? | Medium | 1 | D | It deletes all prior indexes upon restart. | It requires user authentication before every search. | The LLM must be hosted locally. | It relies heavily on specific types of structured data linking to form an effective knowledge base. | GraphRAG's power comes from highly structured data mappings; mapping unstructured, erratic data yields poor relationships. |
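The sketches below illustrate selected mechanisms referenced in the table; none of them is the lecture's reference code. Questions 3-5 describe Semantic Chunking: compute similarity between consecutive sentence embeddings and break the chunk when similarity drops below a threshold. This is a minimal sketch; the `embed` callable stands in for any sentence-embedding model, and the threshold value and toy vocabulary are arbitrary choices for demonstration.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def semantic_chunks(sentences: list[str], embed, threshold: float = 0.7) -> list[list[str]]:
    """Group consecutive sentences; start a new chunk when the similarity
    between the current sentence and the next one drops below `threshold`."""
    vectors = [np.asarray(embed(s), dtype=float) for s in sentences]
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            chunks.append(current)  # topic shift detected, so break here
            current = []
        current.append(sentences[i])
    chunks.append(current)
    return chunks

# Toy bag-of-words embedder standing in for a real embedding model.
VOCAB = ["vector", "index", "graph", "cat", "dog", "pet"]
toy_embed = lambda s: [s.lower().count(w) for w in VOCAB]

sents = ["Vector index layers form a graph.",
         "The graph index keeps vector links.",
         "My dog chased the cat.",
         "A pet dog needs daily walks."]
print(semantic_chunks(sents, toy_embed, threshold=0.1))  # two chunks: indexing vs. pets
```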
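Questions 8-12 cover HNSW, its link-density parameter M, and the query-time search breadth. A minimal sketch assuming the hnswlib library (the lecture may use a different vector database); note that hnswlib names the query-time breadth parameter `ef` and sets it with `set_ef`. The dimensions, dataset size, and parameter values below are illustrative.

```python
import numpy as np
import hnswlib  # pip install hnswlib

dim, n = 128, 10_000
data = np.random.rand(n, dim).astype("float32")

# Build the multi-layered HNSW graph. M bounds how many links each node may
# create to its neighbours; ef_construction is the build-time search breadth.
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n, M=16, ef_construction=200)
index.add_items(data, np.arange(n))

# Query-time breadth: a low value (e.g., 50-100) keeps latency low,
# trading a little recall for speed; raising it does the opposite.
index.set_ef(64)

query = np.random.rand(dim).astype("float32")
labels, distances = index.knn_query(query, k=5)
print(labels, distances)
```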
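Questions 16-18 describe BM25's saturation, IDF, and length normalization. The standard formula is \( \mathrm{score}(D,Q)=\sum_{q\in Q}\mathrm{IDF}(q)\cdot\frac{f(q,D)\,(k_1+1)}{f(q,D)+k_1\,(1-b+b\,|D|/\mathrm{avgdl})} \). A compact pure-Python sketch with the common defaults \(k_1=1.5\), \(b=0.75\) (parameter values are assumptions, not the lecture's):

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency per term, used by the IDF component.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF: common words are penalized, rare words rewarded.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Saturation + length normalization: repeating a term has
            # diminishing returns, and long documents are scaled down.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

docs = [["error", "code", "e404", "not", "found"],
        ["the", "cat", "sat", "on", "the", "mat"],
        ["e404"] * 50]  # keyword spamming: the score saturates instead of exploding
print(bm25_scores(["e404"], docs))
```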
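Questions 20-23 describe Reciprocal Rank Fusion. RRF ignores raw scores and combines ranks: \( \mathrm{RRF}(d)=\sum_{i}\frac{1}{k+\mathrm{rank}_i(d)} \), with the smoothing constant \(k\) typically set to 60. A minimal sketch (the document IDs are illustrative):

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked result lists by summing 1 / (k + rank) per document.
    Raw BM25 and cosine scores are never compared directly; only ranks matter."""
    fused: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

bm25_top = ["doc7", "doc2", "doc9"]      # keyword ranking
vector_top = ["doc2", "doc7", "doc4"]    # semantic ranking
print(rrf_fuse([bm25_top, vector_top]))  # doc2 and doc7, ranked highly by both, rise to the top
```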
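Questions 27-30 outline HyDE's Generate, Embed, Search flow. A minimal sketch parameterized over the LLM call and the embedder, so no particular provider is assumed; `generate` and `embed` are placeholders for whatever the pipeline actually uses.

```python
import numpy as np

def hyde_search(question: str, generate, embed,
                doc_texts: list[str], top_k: int = 3) -> list[str]:
    """HyDE: draft a hypothetical answer, embed *that* instead of the question,
    and rank documents by cosine similarity to the drafted answer."""
    # 1) Generate: the draft may be factually wrong, but it mimics the style
    #    and vocabulary of a real document, closing the question/document gap.
    fake_answer = generate(
        f"Write a short, plausible documentation paragraph answering: {question}"
    )
    # 2) Embed the hypothetical document rather than the short question.
    q_vec = np.asarray(embed(fake_answer), dtype=float)
    d_vecs = np.asarray([embed(d) for d in doc_texts], dtype=float)
    # 3) Search: cosine similarity against the corpus, highest first.
    sims = d_vecs @ q_vec / (np.linalg.norm(d_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-12)
    return [doc_texts[i] for i in np.argsort(-sims)[:top_k]]

# Usage: hyde_search("What is the refund window?", my_llm_call, my_embedder, corpus)
```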
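Questions 31-34 describe Query Decomposition's Breakdown, per-sub-question retrieval, and Synthesis phases. A sketch with the LLM and retriever passed in as callables (both placeholders; the prompts are illustrative assumptions):

```python
def decompose_and_answer(question: str, llm, retrieve) -> str:
    """Breakdown -> independent searches -> Synthesis for multi-intent questions."""
    # 1) Breakdown: split the multi-intent question into single-intent sub-questions.
    sub_questions = llm(
        f"Split this question into independent sub-questions, one per line:\n{question}"
    ).splitlines()
    # 2) Retrieval: run a separate, targeted search for every sub-question.
    contexts = {sq: retrieve(sq) for sq in sub_questions if sq.strip()}
    # 3) Synthesis: feed all retrieved snippets back to the LLM for one final answer.
    evidence = "\n\n".join(f"Q: {sq}\nContext: {ctx}" for sq, ctx in contexts.items())
    return llm(f"Using the evidence below, answer: {question}\n\n{evidence}")
```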
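Questions 36-43 describe the funnel strategy: a fast Bi-Encoder shortlist followed by a slow, accurate Cross-Encoder re-scoring. A sketch assuming the sentence-transformers library and two common public checkpoints (the lecture may use different models; shortlist and final sizes are the 50/5 figures from the table):

```python
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder  # pip install sentence-transformers

bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")                   # fast, vectors can be pre-computed
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")   # slow, reads query+doc together

def funnel_search(query: str, corpus: list[str], shortlist: int = 50, final: int = 5) -> list[str]:
    # Stage 1 (Bi-Encoder): cheap cosine similarity over the whole corpus.
    doc_vecs = bi_encoder.encode(corpus, normalize_embeddings=True)
    q_vec = bi_encoder.encode(query, normalize_embeddings=True)
    top = np.argsort(-doc_vecs @ q_vec)[:shortlist]
    # Stage 2 (Cross-Encoder): re-score only the shortlist with full attention
    # over each concatenated (query, document) pair.
    pairs = [(query, corpus[i]) for i in top]
    ce_scores = cross_encoder.predict(pairs)
    return [corpus[i] for i in top[np.argsort(-ce_scores)][:final]]
```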
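Questions 44-48 cover MMR. The selection rule is \( \mathrm{MMR}=\arg\max_{d\in R\setminus S}\big[\lambda\,\mathrm{sim}(d,q)-(1-\lambda)\max_{s\in S}\mathrm{sim}(d,s)\big] \); lowering \(\lambda\) weights the diversity penalty more heavily. A greedy sketch over pre-computed similarity values (the default \(\lambda=0.5\) is an arbitrary choice):

```python
import numpy as np

def mmr_select(query_sims: np.ndarray, doc_sims: np.ndarray,
               k: int = 5, lam: float = 0.5) -> list[int]:
    """Greedy MMR: each step picks the candidate most similar to the query
    but least similar to the documents already selected.
    query_sims[i] = sim(doc_i, query); doc_sims[i, j] = sim(doc_i, doc_j)."""
    selected: list[int] = []
    candidates = list(range(len(query_sims)))
    while candidates and len(selected) < k:
        def mmr_score(i: int) -> float:
            redundancy = max(doc_sims[i, j] for j in selected) if selected else 0.0
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```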
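Questions 51-57 walk through the GraphRAG ingestion pipeline: a Pydantic schema, structured LLM output, and deduplicated Cypher MERGE writes. A compressed sketch of the same shape, assuming LangChain's `with_structured_output` on a chat model and the official neo4j Python driver; the class and relationship names, model name, prompt, and connection credentials are illustrative assumptions, not the lecture's exact schema.

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI  # assumes an OpenAI API key in the environment
from neo4j import GraphDatabase          # pip install neo4j

class Commitment(BaseModel):
    """A clear promise, obligation, or prohibition found in the policy text."""
    text: str
    constraints: list[str] = Field(default_factory=list,
                                   description="Numeric limits, e.g. 'within 30 days'")

# Binding the LLM to the Pydantic schema forces JSON that deserializes back
# into validated Commitment objects instead of free-form prose.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
extractor = llm.with_structured_output(Commitment)
commitment = extractor.invoke("Extract the commitment: 'Refunds must be issued within 30 days.'")

# MERGE matches an existing node instead of creating a duplicate on re-ingestion.
CYPHER = """
MERGE (c:Commitment {text: $text})
WITH c UNWIND $constraints AS val
MERGE (k:Constraint {value: val})
MERGE (c)-[:HAS_CONSTRAINT]->(k)
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))  # illustrative credentials
with driver.session() as session:
    session.run(CYPHER, text=commitment.text, constraints=commitment.constraints)
driver.close()
```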
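Question 59 names GraphCypherQAChain. A minimal usage sketch, assuming the langchain-neo4j integration package, an OpenAI chat model, and a running Neo4j instance; the connection details and model name are illustrative assumptions.

```python
from langchain_openai import ChatOpenAI
from langchain_neo4j import Neo4jGraph, GraphCypherQAChain  # pip install langchain-neo4j

graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")  # illustrative
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# The chain asks the LLM to translate the natural-language question into a
# Cypher query, runs it against the graph, then answers from the query result.
chain = GraphCypherQAChain.from_llm(llm=llm, graph=graph, verbose=True,
                                    allow_dangerous_requests=True)
print(chain.invoke({"query": "Which stakeholders are affected by the refund commitment?"})["result"])
```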