# AI Theory Exams
This page consolidates theory exam question banks from all AI training modules.
## AI Fundamentals Theory

### Basic AI Fundamentals Quiz
| No. | Training Unit | Lecture | Training content | Question | Level | Mark | Answer | Answer Option A | Answer Option B | Answer Option C | Answer Option D | Explanation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Architecture | What is RAG (Retrieval-Augmented Generation), a hybrid AI architecture, designed to do? | Medium | 1 | C | Increase the speed of natural language processing | Reduce the cost of training language models | Enhance the quality and reliability of Large Language Models | Increase the creativity of language models | RAG is designed to enhance the quality and reliability of Large Language Models (LLMs) by integrating an information retrieval step from an external knowledge base before the LLM generates text. |
| 2 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Core Problems | What is one of the core technical problems that RAG solves? | Easy | 1 | A | Reduce hallucination (making up information) | Improve data retrieval speed | Increase data storage capacity | Enhance information security | RAG addresses limitations of traditional LLMs such as hallucination, outdated knowledge, lack of transparency, and difficulty accessing specialized knowledge. |
| 3 | Unit 1: Basic AI Fundamentals | Lec1 | RAG vs Fine-tuning | What is the advantage of RAG over fine-tuning when updating knowledge for LLMs? | Medium | 1 | D | RAG is only suitable for unstructured data | RAG requires greater computing resources | RAG has lower transparency | RAG allows faster knowledge updates | RAG allows quick, nearly instant knowledge updates by updating the vector database, while fine-tuning requires retraining the model, which is expensive and slower. |
| 4 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Use Cases | When should you choose RAG instead of fine-tuning an LLM? | Medium | 1 | A | When you need to add factual knowledge and answer questions based on new data | When you need to reduce model operating costs | When you need to enhance the model's reasoning ability | When you need to adjust the model's behavior and style | RAG is suitable when you need to add factual knowledge and answer questions based on new data, while fine-tuning is appropriate when you need to adjust behavior, style, or learn a new skill. |
| 5 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Pipeline | In the RAG architecture, which phase occurs once or periodically to prepare data? | Easy | 1 | D | Query vectorization phase | Similarity search phase | Retrieval and answer generation phase (Retrieval Generation Online) | Data indexing phase (Indexing Offline) | The Data Indexing phase (Indexing Offline) occurs once or periodically to prepare data for RAG. |
| 6 | Unit 1: Basic AI Fundamentals | Lec1 | Chunking | What is the purpose of dividing data into smaller text chunks in the "Load and Chunk" step? | Easy | 1 | A | To ensure semantics are not lost and optimize for searching | To simplify the vectorization process | To reduce the storage capacity of data | To speed up data loading into the system | Chunking ensures that semantics are not lost and optimizes the chunks for searching. |
| 7 | Unit 1: Basic AI Fundamentals | Lec1 | Vector Similarity | What is the most common method for measuring similarity between query vectors and document vectors in a Vector Database? | Medium | 1 | C | Manhattan distance | Jaccard similarity | Cosine Similarity | Euclidean distance | Cosine Similarity, which measures the cosine of the angle between two vectors, is the most common method. |
| 8 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Online Phase | What happens to the user's question in the first step of the "Retrieval and Answer Generation" phase? | Easy | 1 | D | The question is stored in the database | The question is divided into smaller chunks | The question is translated to another language | The question is vectorized using an Embedding model | The user's question is vectorized using an Embedding model. |
| 9 | Unit 1: Basic AI Fundamentals | Lec1 | Embedding Quality | The quality of which component directly affects the effectiveness of the entire RAG system? | Medium | 1 | A | Embedding model | Similarity search method | Vector database | Prompting technique | The quality of the Embedding model directly affects the effectiveness of the entire system. |
| 10 | Unit 1: Basic AI Fundamentals | Lec1 | Softmax Function | In an LLM, what is the role of the Softmax function? | Hard | 1 | A | Convert scores (logits) into a probability distribution to select the most likely word | Filter out irrelevant sentences or information in text chunks | Calculate scores (logits) for all words in the vocabulary | Search for suitable text chunks | The Softmax function converts scores (logits) into a probability distribution, helping the model select the most likely next word. |
| 11 | Unit 1: Basic AI Fundamentals | Lec1 | HyDE Technique | What is the HyDE (Hypothetical Document Embeddings) technique used for? | Hard | 1 | A | Expand the input query to improve retrieval results | Re-evaluate the relevance of each (question, chunk) pair | Filter out irrelevant information in text chunks | Combine the power of keyword search and vector search | HyDE uses a small LLM to generate a hypothetical document containing the answer, then uses this document's vector for searching, improving retrieval results. |
| 12 | Unit 1: Basic AI Fundamentals | Lec1 | Hybrid Search | What is Hybrid Search? | Medium | 1 | A | A method that combines the power of keyword search and vector search | A method that re-evaluates the relevance of each (question, chunk) pair | A method that transforms questions to improve retrieval results | A method that compresses context before putting it into the prompt | Hybrid Search combines keyword search (e.g., BM25) and vector search to achieve more comprehensive results. |
| 13 | Unit 1: Basic AI Fundamentals | Lec1 | Context Compression | What is the purpose of Context Compression? | Medium | 1 | D | Rearrange potential candidates to select the top quality chunks | Transform input questions to improve retrieval results | Improve the accuracy of information retrieval | Reduce prompt length and help the LLM focus on core information | Context Compression reduces prompt length and helps the LLM focus on core information by filtering out irrelevant content. |
| 14 | Unit 1: Basic AI Fundamentals | Lec1 | Re-ranker | What is the role of a Re-ranker in the RAG process? | Medium | 1 | C | Compress text chunks to reduce prompt length | Transform the original question to improve retrieval results | Re-evaluate the relevance of each (question, chunk) pair and reorder them | Search for text chunks based on keywords | The Re-ranker re-evaluates the relevance of each (question, chunk) pair and reorders them to select the top quality chunks. |
| 15 | Unit 1: Basic AI Fundamentals | Lec1 | Retriever Failure | What happens if the retrieval system (retriever) does not find accurate documents in the RAG system? | Medium | 1 | B | The system will automatically adjust retrieval parameters to find more suitable documents | The Large Language Model (LLM) cannot answer correctly | The Large Language Model (LLM) will search for information from external sources to compensate for missing data | The Large Language Model (LLM) can still generate accurate answers based on prior knowledge | If the retriever does not find the correct documents, no matter how capable the LLM is, it cannot answer correctly. |
| 16 | Unit 1: Basic AI Fundamentals | Lec1 | Lost in the Middle | What does the "Lost in the Middle" syndrome in RAG systems refer to? | Hard | 1 | A | The tendency of LLMs to focus on information at the beginning and end of long contexts, ignoring information in the middle | Text chunks having duplicate information in the middle, causing noise in processing | Difficulty integrating LLMs in the middle of the retrieval and generation process | Delays in information retrieval when relevant documents are in the middle position in the database | When prompts contain long contexts, LLMs tend to focus only on information at the beginning and end, easily overlooking important details in the middle. |
| 17 | Unit 1: Basic AI Fundamentals | Lec1 | Faithfulness Evaluation | What does "Faithfulness" evaluation in RAG systems measure? | Medium | 1 | A | The degree to which the generated answer adheres to the provided context | The speed of processing and generating answers by the system | The relevance of the answer to the user's question | The system's ability to retrieve information from different sources | Faithfulness measures the degree to which the generated answer adheres to the provided context, i.e., whether the system adds information on its own. |
| 18 | Unit 1: Basic AI Fundamentals | Lec1 | Attention Mechanism | What role does the Attention Mechanism play in the Transformer architecture used by RAG systems? | Hard | 1 | C | Improve the model's parallel processing capability, helping to speed up computation | Reduce dependence on fully connected layers in the model | Allow the model to weigh the importance of different words in the input sequence for deep context understanding | Enhance the ability to encode input information into semantic vectors | The Attention Mechanism allows the model to weigh the importance of different words in the input sequence for deep context understanding. |
| 19 | Unit 1: Basic AI Fundamentals | Lec1 | MRR Metric | What does the Mean Reciprocal Rank (MRR) metric measure in Retrieval Evaluation? | Hard | 1 | C | The system's ability to synthesize information from different sources | The relevance between the question and the generated answer | The position of the first correct chunk in the returned result list | The percentage of questions for which the system retrieves at least one chunk containing correct answer information | Mean Reciprocal Rank (MRR) measures the position of the first correct chunk in the returned result list; the higher that position, the higher the MRR score. |
| 20 | Unit 1: Basic AI Fundamentals | Lec1 | Value in RAG | In the attention mechanism of the RAG model, which element represents the actual extracted information? | Medium | 1 | D | Key | Query | Key vector dimension (d_k) | Value | Value represents the actual extracted information. |
| 21 | Unit 1: Basic AI Fundamentals | Lec1 | Multimodal RAG | Which RAG development direction allows retrieving information from different types of data such as images, audio, and text? | Easy | 1 | A | Multimodal RAG | Internal RAG system | Agentic RAG | RAG Chatbot | Multimodal RAG allows retrieving information from different types of data, not just text. |
| 22 | Unit 1: Basic AI Fundamentals | Lec1 | Agentic RAG | Which type of RAG application has the ability to ask sub-questions and interact with external tools to gather information? | Medium | 1 | B | Internal document RAG system | Agentic RAG | Multimodal RAG | RAG Chatbot | Agentic RAG is more proactive in gathering information by asking sub-questions and interacting with external tools. |
| 23 | Unit 1: Basic AI Fundamentals | Lec1 | Enterprise RAG | Which RAG application helps employees search for information in the company's internal documents quickly and accurately? | Easy | 1 | D | Multimodal RAG | Research and specialized analysis assistant | Smart customer support chatbots | Enterprise internal document RAG system | Enterprise internal document RAG systems help employees search for information quickly and accurately. |
| 24 | Unit 1: Basic AI Fundamentals | Lec1 | Interactive Learning | What problem does a RAG (Retrieval-Augmented Generation) application solve in interactive learning? | Medium | 1 | C | Limited access to learning materials | Inaccurate assessment of learning outcomes | Boredom and passivity when learning through textbooks | Lack of updated information in textbooks | RAG creates interactive tools that let students engage with learning materials more actively than reading traditional textbooks. |
| 25 | Unit 1: Basic AI Fundamentals | Lec1 | Financial RAG | In the financial field, how can RAG support analysts? | Medium | 1 | A | Summarize and analyze risks from long financial reports | Manage personal investment portfolios | Predict stock market fluctuations | Automatically create financial reports | RAG can summarize and analyze risks from long financial reports, helping analysts save time and make decisions faster. |
| 26 | Unit 1: Basic AI Fundamentals | Lec1 | E-commerce RAG | How does RAG improve product recommendation systems on e-commerce sites? | Medium | 1 | A | Retrieve information from detailed descriptions, product reviews, and technical specifications | Optimize product prices based on competitors | Provide 24/7 online customer support services | Enhance the ability to predict customer needs | RAG retrieves information from detailed descriptions, product reviews, and technical specifications to provide personalized recommendations, rather than relying solely on click history. |
| 27 | Unit 1: Basic AI Fundamentals | Lec1 | RAG Distinctive Feature | What is the distinctive feature of RAG compared to traditional generative AI systems? | Medium | 1 | D | Integration with cloud platforms to increase scalability | Using the most advanced deep learning algorithms | Ability to automatically adjust parameters to optimize performance | Combining the deep language capabilities of LLMs with the accuracy of external knowledge bases | RAG combines the language capabilities of LLMs with the accuracy and up-to-date nature of external knowledge bases, creating more reliable and transparent AI applications. |
| 28 | Unit 1: Basic AI Fundamentals | Lec1 | Vector Database | What is the primary purpose of a Vector Database in a RAG system? | Easy | 1 | B | Store raw text documents for quick retrieval | Store and efficiently search through vector embeddings | Manage user authentication and access control | Cache frequently asked questions and answers | A Vector Database is specifically designed to store and efficiently search through vector embeddings, enabling fast similarity searches in the RAG pipeline. |
| 29 | Unit 1: Basic AI Fundamentals | Lec1 | Chunking Strategies | Which chunking strategy maintains the logical structure of a document by splitting at natural boundaries? | Medium | 1 | C | Fixed-size chunking | Random chunking | Semantic chunking | Overlapping chunking | Semantic chunking splits documents at natural boundaries (paragraphs, sentences, sections) to maintain logical structure and preserve meaning within each chunk. |
| 30 | Unit 1: Basic AI Fundamentals | Lec1 | Top-K Retrieval | What does the "Top-K" parameter control in RAG retrieval? | Easy | 1 | A | The number of most similar documents to retrieve | The maximum length of each chunk | The threshold for similarity scores | The number of re-ranking iterations | The Top-K parameter controls how many of the most similar documents are retrieved from the vector database to provide context for the LLM. |
| 31 | Unit 1: Basic AI Fundamentals | Lec1 | Prompt Engineering | In RAG systems, what is the role of the system prompt when generating answers? | Medium | 1 | B | To store retrieved documents permanently | To instruct the LLM on how to use the retrieved context to generate answers | To perform the similarity search in the vector database | To convert user queries into embeddings | The system prompt instructs the LLM on how to use the retrieved context to generate accurate, grounded answers and may include formatting guidelines and constraints. |
| 32 | Unit 1: Basic AI Fundamentals | Lec1 | Answer Relevance | What does "Answer Relevance" measure in RAG evaluation? | Medium | 1 | C | How fast the system generates responses | The accuracy of the embedding model | How well the generated answer addresses the user's original question | The number of retrieved documents used | Answer Relevance measures how well the generated answer addresses the user's original question, ensuring the response is pertinent and useful. |
| 33 | Unit 1: Basic AI Fundamentals | Lec1 | Context Window | What limitation does the "context window" impose on RAG systems? | Hard | 1 | D | The maximum number of documents that can be stored | The time limit for generating responses | The minimum similarity score for retrieval | The maximum amount of text that can be processed by the LLM at once | The context window limits the maximum amount of text (retrieved chunks + query + system prompt) that the LLM can process at once, requiring careful management of chunk sizes. |
| 34 | Unit 1: Basic AI Fundamentals | Lec1 | Metadata Filtering | What is the benefit of using metadata filtering in RAG retrieval? | Medium | 1 | A | Narrow down search results based on document attributes before semantic search | Increase the size of the vector database | Speed up the embedding generation process | Reduce the cost of LLM API calls | Metadata filtering narrows down search results based on document attributes (date, source, category) before or during semantic search, improving retrieval precision. |
| 35 | Unit 1: Basic AI Fundamentals | Lec1 | Hallucination Prevention | Which technique helps prevent hallucination in RAG systems by ensuring answers are grounded in retrieved content? | Hard | 1 | B | Increasing the temperature parameter | Instructing the LLM to only use information from the provided context | Using larger embedding dimensions | Reducing the Top-K value to 1 | Instructing the LLM through the system prompt to use only information from the provided context, and to say "I don't know" when the information is not available, prevents hallucination and keeps answers grounded in retrieved content. |
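Several of the retrieval questions above (cosine similarity, query vectorization, Top-K) describe one small computation. The sketch below illustrates it with invented 3-dimensional toy vectors standing in for the output of a real embedding model:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_retrieve(query_vec, doc_vecs, k=2):
    # Score every document vector against the query and keep the Top-K indices
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

# Toy "embeddings" (a real system would use an embedding model)
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.8, 0.2, 0.1]]
query = [1.0, 0.0, 0.0]
print(top_k_retrieve(query, docs, k=2))  # → [0, 2]
```

A production system would replace the toy vectors with real embeddings and search an index (e.g., HNSW) rather than scanning every vector, as the flat-indexing questions in the next section discuss.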
## RAG Optimization Theory
### Exam Theory: RAG and Optimization

This theory exam assesses advanced topics in Retrieval-Augmented Generation (RAG) and its optimization techniques, drawing specifically on Advanced Indexing, Hybrid Search, Query Transformation, Post-Retrieval Processing, and GraphRAG implementations.
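The Semantic Chunking split rule covered in the questions below (break when the similarity between consecutive sentence vectors drops below a threshold) can be sketched minimally; the sentence vectors and the 0.5 threshold here are invented toy values standing in for real embedding output:

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def semantic_chunk(sentence_vectors, threshold=0.5):
    # Compare each sentence vector with the next one; when similarity drops
    # below the threshold, the topic has shifted, so break the chunk there.
    chunks, current = [], [0]
    for i in range(len(sentence_vectors) - 1):
        if cosine(sentence_vectors[i], sentence_vectors[i + 1]) < threshold:
            chunks.append(current)
            current = []
        current.append(i + 1)
    chunks.append(current)
    return chunks  # lists of sentence indices, one list per chunk

# Sentences 0-1 share a topic, 2-3 share another; the 1→2 drop triggers a break
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(semantic_chunk(vecs, threshold=0.5))  # → [[0, 1], [2, 3]]
```

This also makes the cost trade-off in the questions concrete: the model must be run once per sentence to produce the vectors being compared.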
No. |
Training Unit |
Lecture |
Training content |
Question |
Level |
Mark |
Answer |
Answer Option A |
Answer Option B |
Answer Option C |
Answer Option D |
Explanation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What is a major disadvantage of fixed-size chunking when applied to large amounts of documents? |
Easy |
1 |
A |
It causes a loss of semantics by breaking ideas arbitrarily. |
It is too computationally expensive. |
It prevents vector search from indexing numbers. |
It requires advanced linguistic models to parse. |
Mechanical chunking accidentally breaks the flow of the text, making the LLM unable to understand the context when an idea is arbitrarily split. |
2 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
Why does Brute-force Flat Indexing become a serious problem as a system scales? |
Easy |
1 |
B |
It consumes too much disk space. |
It causes high latency when sequentially scanning millions of vectors. |
It is incompatible with neural network architectures. |
It only supports English text. |
Sequentially scanning through millions of vectors in a Flat Index is too slow to meet real-time requirements. |
3 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What is the core idea driving Semantic Chunking? |
Medium |
1 |
C |
To chunk text strictly by paragraph breaks. |
To split texts after exactly 1000 characters. |
To detect shifts to a new topic and perform a break precisely at the intersection of two topics. |
To summarize the text before splitting it. |
Semantic Chunking detects when sentences or content shift to a new topic (when vector direction abruptly changes) to perform a break. |
4 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What metric is typically calculated between consecutive sentences during Semantic Chunking? |
Medium |
1 |
A |
Cosine similarity |
Word count ratio |
Token frequency |
Character limits |
In Semantic Chunking, the similarity (for example cosine similarity) is calculated between the current sentence and the next one. |
5 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
In Semantic Chunking, when does the algorithm decide to split the text? |
Medium |
1 |
D |
When similarity is above 90%. |
After a fixed number of punctuation marks. |
When the sentence length exceeds the threshold. |
When similarity drops significantly below a threshold. |
If similarity drops significantly below the threshold, it means the topic has changed, breaking the chunk there. |
6 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What is a notable advantage of Semantic Chunking over Recursive Chunking? |
Medium |
1 |
B |
It runs extremely fast. |
It preserves ideas fully and perfectly follows the flow of text. |
It does not consume any computational resources. |
It is specifically designed for codebases. |
Semantic Chunking preserves ideas fully, strictly follows the text flow, and increases accuracy when searching. |
7 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What is a major disadvantage of Semantic Chunking? |
Easy |
1 |
C |
It cuts through important ideas frequently. |
It returns very noisy contexts. |
It consumes computational resources due to running a model to compare each sentence. |
It only works for legal or contract documents. |
Because it must run an ML model to compare the similarity of each consecutive sentence, it consumes computational resources. |
8 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What does HNSW stand for in the context of Vector Databases? |
Easy |
1 |
A |
Hierarchical Navigable Small World |
High Neural State Weights |
Heuristic Node Searching Window |
Hierarchical Numeric Sequence Word |
HNSW stands for Hierarchical Navigable Small World, an effective algorithm balancing retrieval speed and accuracy. |
9 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What kind of data structure does HNSW organize data into? |
Medium |
1 |
C |
A flat SQL table |
A chronological file system |
A multi-layered graph structure |
A raw byte stream |
HNSW organizes data in the form of a multi-layered graph structure utilizing short and long shortcut links. |
10 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
In HNSW, what is the role of Layer 0? |
Medium |
1 |
D |
It contains the shortest summary of the dataset. |
It stores the sparse shortcut links. |
It is empty and serves as a placeholder. |
It contains all data points and the most detailed links between them. |
Layer 0 contains all data points, and the most detailed links. It contains the most complete information to find the exact target. |
11 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
What does parameter |
Hard |
1 |
A |
The maximum number of links a node can create with neighbor nodes. |
The memory limit in megabytes. |
The number of documents returned. |
The margin of error allowed. |
M specifies the maximum number of links a node can create with other neighbor nodes. The larger M is, the denser the network. |
12 |
Unit 1: RAG and Optimization |
Lec 1 |
Advanced Indexing |
How should |
Hard |
1 |
B |
It should be set to 0. |
It should be kept at a low level (e.g., 50-100) to optimize latency. |
It should be set to maximum allowed bounds. |
It should equal the total number of documents. |
Keeping |
13 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What is an inherent weakness of standard Vector Search? |
Easy |
1 |
C |
It lacks speed when processing basic synonyms. |
It struggles with multilingual queries. |
It reveals weaknesses when encountering queries requiring absolute accuracy in wording. |
It ignores document meaning entirely. |
Vector Search reveals weaknesses when processing queries requiring absolute accuracy (e.g., proper names, error codes). |
14 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What exactly constitutes a Hybrid Search mechanism? |
Easy |
1 |
A |
Combining the power of semantic vector search with traditional keyword search. |
Merging structured and unstructured relational databases. |
Running two identical LLMs simultaneously. |
Compiling queries in both Python and Java. |
Hybrid search combines semantic search (Vector) and traditional keyword search (BM25). |
15 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
Which keyword frequency-based statistical algorithm is standard for Hybrid Search? |
Easy |
1 |
D |
BERT |
HNSW |
HyDE |
BM25 |
BM25 is the gold standard for traditional keyword retrieval algorithms in Hybrid Search. |
16 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
How does BM25 solve the keyword spamming problem found in TF-IDF? |
Medium |
1 |
B |
By manually blacklisting frequent spammers. |
By applying a saturation mechanism where scoring asymptotes after several keyword occurrences. |
By analyzing the semantic meaning of repetitive words. |
By deleting any document that repeats a word. |
BM25 applies a saturation mechanism so that appearing a 101st time hardly adds more score than the 10th time. |
17 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What does Inverse Document Frequency (IDF) do in the BM25 formula? |
Medium |
1 |
A |
It penalizes common words and massively rewards rare words. |
It ranks shorter documents higher than longer ones. |
It limits the number of query words sent to the server. |
It inverses the vectors created by the model. |
IDF penalizes common words heavily while attributing more importance and score weight to rare words. |
18 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
Why is Length Normalization an important feature of BM25? |
Medium |
1 |
C |
It forces all documents to be exactly 1000 characters. |
It compresses long queries to save bandwidth. |
A single keyword in a short paragraph gets rated higher than the same keyword diluted in a long novel. |
It converts all characters to lowercase. |
BM25 scales the score based on document length to prevent long documents from unfairly dominating over concise information. |
19 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
In a typical Hybrid Search pipeline, how are the two algorithms executed? |
Medium |
1 |
D |
Vector search completes first, then BM25 is run on the results. |
BM25 runs entirely locally before running Vector remotely. |
Only one is executed depending on a query classifier. |
They are executed in parallel simultaneously. |
The system sends the query simultaneously to both search engines (Parallel Execution). |
20 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
Why canât we simply add the BM25 score and the Vector Search score together? |
Hard |
1 |
B |
Vector search scores are negative integers. |
The scoring scales are fundamentally different (Vector uses [0, 1] cosine similarity; BM25 is arbitrary positive numbers). |
They are processed on different neural network architectures. |
BM25 produces alphabetical grading ranges. |
The scoring scales of the two algorithms are completely different and numerically incompatible directly. |
21 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What algorithm solves the score compatibility issue in Hybrid Search? |
Medium |
1 |
C |
GraphRAG Convolution |
Maximal Marginal Relevance |
Reciprocal Rank Fusion (RRF) |
TF-IDF Smoothing |
Reciprocal Rank Fusion (RRF) merges these two lists effectively. |
22 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
Upon what theoretical basis does Reciprocal Rank Fusion (RRF) operate? |
Hard |
1 |
A |
Instead of scores, it assumes that if a document appears at a high rank in both lists, it is certainly important. |
It averages the raw text chunks of both documents. |
It only accounts for the longest document. |
It uses an LLM to assign arbitrary ranks. |
RRF cares about rank rather than score; a high consensus of rank across disparate algorithms signifies an important document. |
23 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What is the purpose of the smoothing constant |
Hard |
1 |
D |
It identifies the number of total documents in the database. |
It sets the maximum allowed token count. |
It determines the strictness of exact keyword matching. |
It helps reduce score disparity between very high ranks, ensuring fairness. |
The constant \(k\) (usually 60) reduces massive score disparities between adjacent high ranks (like Top 1 vs Top 2), ensuring a smoother gradient of rank scoring. |
24 |
Unit 1: RAG and Optimization |
Lec 2 |
Hybrid Search |
What does Hybrid Search primarily sacrifice to gain balanced Context and Keyword accuracy? |
Easy |
1 |
B |
Security and Privacy |
System resources, as it is complex to deploy and consumes resources running 2 parallel streams. |
API documentation clarity |
Multi-lingual support |
Hybrid Search is more complex to deploy and consumes more resources due to running parallel streams simultaneously. |
25 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
Why do raw user questions often yield poor Vector Search results natively? |
Easy |
1 |
C |
LLMs cannot read unformatted text. |
Vector databases reject single words. |
Questions are short/interrogative, lacking context compared to long descriptive documents. |
Search algorithms intentionally delay short queries. |
Vector Search faces semantic asymmetry; questions are short and interrogative while documents are long and descriptive. |
26 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
What is the core idea of Query Transformation? |
Easy |
1 |
A |
Using an LLM to rewrite, expand, or break down the userâs question into better versions before searching. |
Encrypting user queries before transmission. |
Replacing semantic searches with strict SQL SELECT queries. |
Running the userâs prompt through a grammar checker. |
It uses an LLM to intelligently edit, expand, or rewrite poor raw queries before sending them to the lookup department. |
27 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
What does HyDE stand for in Query Transformation? |
Medium |
1 |
B |
Heavy Yield Database Execution |
Hypothetical Document Embeddings |
Hybrid Y-axis Dense Encapsulation |
Hex-layered Data Encryption |
HyDE stands for Hypothetical Document Embeddings. |
28 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
What happens during the âGenerateâ phase of a HyDE strategy? |
Medium |
1 |
D |
It generates Python scripts. |
It generates a dense vector representing the question. |
It generates an index mapping inside the SQL table. |
The system asks the LLM to write a hypothetical answer paragraph for the userâs question. |
The LLM is forced to draft a fake, hypothetical answer for the question so it matches the expected document vocabulary. |
29 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
Does the hypothetical âfake answerâ drafted in HyDE need to be factually correct? |
Hard |
1 |
B |
Yes, exact factual accuracy guarantees precise matches. |
No, but the writing style and technical vocabulary should resemble the actual document. |
Yes, the model refuses to output hallucinated responses. |
No, it just generates a sequence of random numbers. |
The information in the paragraph might be factually incorrect, but its style and technical vocabulary mimic real documents to enable better semantic matching. |
30 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
Why is the vector generated from the âfake answerâ in HyDE more useful than the userâs question vector? |
Medium |
1 |
A |
The fake answer vector is semantically closer to the real document vector than the short interrogative question vector. |
It consumes 0 RAM. |
It maps perfectly to sparse BM25 arrays. |
The user's query vector is permanently deleted. |
The drafted answer contains similar sentence structures/buzzwords to real documents, closing the asymmetric semantic gap. |
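The asymmetric-gap idea can be seen in a toy sketch: the bag-of-words "embedding" and the hard-coded draft below stand in for a real dense encoder and LLM, purely for illustration.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding" standing in for a real dense encoder.
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "RAG pipelines retrieve relevant chunks from a vector database before generation.",
    "The chef seasoned the soup with basil and thyme.",
]
question = "How does RAG find documents?"
# In real HyDE this draft comes from an LLM and may be factually wrong;
# what matters is that its vocabulary resembles real documents.
fake_answer = ("RAG retrieves relevant chunks from a vector database "
               "and feeds them to the generator.")

q_vec, h_vec = embed(question), embed(fake_answer)
scores_q = [cosine(q_vec, embed(d)) for d in docs]
scores_h = [cosine(h_vec, embed(d)) for d in docs]
```

The drafted answer shares far more vocabulary with the relevant document than the short interrogative question does, so its vector scores higher against the right document.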
31 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
When is the Query Decomposition strategy particularly useful? |
Medium |
1 |
C |
When querying single words. |
When parsing simple FAQ menus. |
When a question requires comparing or aggregating information from multiple independent scattered sources. |
When reading codebases in completely unknown programming languages. |
It handles complex multi-intent questions comparing or gathering data from multiple sources where a single text snippet fails to contain the whole answer. |
32 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
What happens during the first phase (Breakdown) of Query Decomposition? |
Medium |
1 |
A |
The LLM analyzes the original question and splits it into a sequence of separate independent sub-questions. |
The system shreds the database documents into chunks. |
The LLM provides the final answer immediately without searching. |
The database is partitioned across multiple distinct servers. |
The system identifies multi-intent questions and logically breaks them into single-intent targeted sub-questions. |
33 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
How does Query Decomposition run searches for multiple sub-questions? |
Medium |
1 |
B |
It merges all sub-questions back into one query. |
It performs standard document searches individually for each separate sub-question. |
It relies exclusively on cached external queries. |
It skips queries containing conjunctions. |
It executes distinct targeted retrieval queries for every identified independent sub-question. |
34 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
Which phase of Query Decomposition requires the LLM to process text found from all separate sub-searches? |
Easy |
1 |
C |
Breakdown |
Encapsulation |
Synthesis |
Verification |
In Synthesis, text segments found in all previous distinct steps are aggregated and fed into the LLM to form a complete final answer. |
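The three phases (breakdown, independent retrieval per sub-question, synthesis) can be sketched with toy stand-ins for the LLM planner and the retriever:

```python
def decompose(question: str) -> list[str]:
    # Toy planner: a real system would ask an LLM to split multi-intent
    # questions; here we split on the conjunction for illustration.
    return [p.strip().rstrip("?") + "?" for p in question.split(" and ")]

def retrieve(sub_q: str, corpus: list[str]) -> str:
    # Toy retriever: pick the document sharing the most words with the query.
    q_words = set(sub_q.lower().rstrip("?").split())
    return max(corpus, key=lambda d: len(q_words & set(d.lower().split())))

corpus = [
    "The VF8 battery offers a 471 km range per charge.",
    "The VF8 warranty lasts 10 years.",
]
question = "What is the VF8 battery range and how long is the warranty?"

sub_questions = decompose(question)                       # Breakdown
evidence = [retrieve(q, corpus) for q in sub_questions]   # Independent searches
# Synthesis: a real pipeline would feed `evidence` plus the original
# question back into the LLM to compose the final combined answer.
```

Each sub-question hits a different document; no single snippet contains the whole answer, which is exactly the case decomposition targets.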
35 |
Unit 1: RAG and Optimization |
Lec 3 |
Query Transformation |
In summary, what role does Query Transformation play? |
Easy |
1 |
D |
An internet firewall proxy. |
A database administrator deleting old records. |
A compiler translating queries to binary. |
An intelligent editor reorienting questions to ensure the system correctly understands true intent. |
It performs intelligent preprocessing (via drafting or splitting) so concise or poor user queries execute properly against the technical index. |
36 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
Why is the Top-K list returned directly from standard retrievers often suboptimal for an LLM? |
Medium |
1 |
A |
Standard embedding models trade deep semantic accuracy for retrieval speed, and may return contextually incorrect ânoisyâ keyword matches. |
The returned list is usually empty. |
The standard top-K size is too large for modern hardware. |
The returned documents are always translated to a random language. |
Embedding models heavily prioritize index speed over complex relationship comprehension, often returning documents with matching keywords but wrong contextual intents. |
37 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What represents the main goal of Re-ranking in a RAG pipeline? |
Easy |
1 |
C |
To randomly shuffle the document list. |
To format the output HTML for the frontend. |
To act as a final filter processing a small pool of candidates to pick the absolutely best ones. |
To permanently alter the dataset ordering. |
Re-ranking takes a small pool (like 50) and spends extra computational time reading them carefully to pick the top 5 highest-quality documents. |
38 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What architectural method do standard Embedding Models use during the Retrieval step? |
Medium |
1 |
B |
Graph-Encoder |
Bi-Encoder |
Cross-Encoder |
Recursive-Encoder |
Retrieval embeddings process questions and documents separately via Bi-Encoders. |
39 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What are the major pros and cons of the Bi-Encoder architecture? |
Hard |
1 |
A |
Fast speed (via pre-computation), but loses detailed nuanced interaction information between question and document words. |
Extreme accuracy, but consumes too much API quota. |
Perfectly handles complex negations, but fails at simple keywords. |
It guarantees data privacy, but prevents external web searches. |
Because the vectors are calculated independently ahead of time, it runs fast but misses deeper interrelated context (like negations vs subjects). |
40 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
How does a Cross-Encoder fundamentally differ from a Bi-Encoder? |
Hard |
1 |
D |
It translates everything into Spanish. |
It maps vectors onto a graph database exclusively. |
It bypasses the attention mechanism entirely. |
The question and document are concatenated into a single text sequence, processed simultaneously via a full Self-Attention mechanism. |
Instead of separated outputs, Cross-Encoders read both strings concurrently to understand complex logic, negation, and interactions between all words simultaneously. |
41 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
If Cross-Encoders are incredibly accurate, why don't we use them to search the entire database? |
Medium |
1 |
C |
They cannot run on GPUs. |
They only output integers. |
They are very slow and resource-consuming to run across millions of documents. |
They are blocked by vector database protocols. |
Processing millions of documents concurrently through strict Self-Attention is too computationally slow. |
42 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What describes the Funnel Strategy in Post-Retrieval? |
Medium |
1 |
B |
Running Bi-Encoder and Cross-Encoder on separate clusters entirely. |
Using Bi-Encoder to fast-retrieve a Top 50, then using Cross-Encoder to slowly re-score those 50 into a Top 5. |
Splitting documents into smaller funnels based on character limits. |
Re-ranking the vector database before queries arrive. |
The funnel strategy accepts speed from Bi-Encoders (for finding 50 items) and precision from Cross-Encoders (for filtering to 5). |
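The funnel can be sketched with two toy scorers standing in for the two model families (a real pipeline would use an embedding model for stage 1 and a cross-encoder, e.g. from `sentence-transformers`, for stage 2):

```python
def funnel(query, documents, cheap_score, expensive_score, wide_k=50, final_k=5):
    # Stage 1 (Bi-Encoder role): rank everything with the fast scorer.
    candidates = sorted(documents, key=lambda d: cheap_score(query, d),
                        reverse=True)[:wide_k]
    # Stage 2 (Cross-Encoder role): re-score only the small candidate
    # pool with the slow, accurate scorer.
    return sorted(candidates, key=lambda d: expensive_score(query, d),
                  reverse=True)[:final_k]

# Toy corpus and scorers for demonstration: the cheap scorer peaks near
# doc-100, while the "true" best match according to the slow scorer is doc-103.
docs = [f"doc-{i}" for i in range(200)]
cheap = lambda q, d: -abs(int(d.split("-")[1]) - 100)
expensive = lambda q, d: -abs(int(d.split("-")[1]) - 103)

top5 = funnel("query", docs, cheap, expensive)
```

The expensive scorer only ever sees 50 candidates instead of 200, which is the whole point of the funnel: pay the Cross-Encoder cost on a small pool.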
43 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
In scenarios dealing with biological negation (e.g., "What does Python NOT eat"), why does a Cross-Encoder succeed where a Bi-Encoder fails? |
Hard |
1 |
A |
The Cross-Encoder recognizes the negation structure and biological context perfectly since it reads the query and document concurrently. |
The Cross-Encoder has a specialized biology database pre-installed. |
The Bi-Encoder deletes the word âNOTâ. |
The Cross-Encoder ignores keywords entirely. |
Bi-Encoders mistakenly link the keywords "Python" and "eat", while Cross-Encoders accurately recognize the negation modifier mapping to the biological logic. |
44 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What does MMR stand for in the context of Post-Retrieval processing? |
Medium |
1 |
D |
Minimum Marginal Rating |
Multi-Model Retrieval |
Memory Mapping Resolution |
Maximal Marginal Relevance |
MMR stands for Maximal Marginal Relevance, an algorithm used to diversify query results. |
45 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
What twofold problem does MMR aim to solve when selecting final documents? |
Medium |
1 |
B |
Size vs Compression |
Relevance to the query vs Diversity to prevent identical redundant documents. |
API Latency vs Local Storage |
Token allowance vs Security constraints |
When similarity returns 5 identical paragraphs of text, MMR resolves the redundancy by ensuring selected documents are relevant but distinctly diverse. |
46 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
In the MMR algorithm, what occurs after picking the most similar document (Step 1)? |
Hard |
1 |
C |
The system clears the cache. |
The system returns immediately. |
It finds the next document similar to the query but least similar to previously selected documents. |
It picks the document that is completely irrelevant to the query. |
Step 2 balances relevance by filtering for the next document containing the query's answer but differing heavily from the document already selected. |
47 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
In the MMR optimization formula, what does lowering lambda (\(\lambda\)) do? |
Hard |
1 |
A |
Prioritizes diversity by increasing the penalty for selecting text similar to already selected documents. |
Causes the system to crash. |
Forces exact keyword matching. |
Elevates relevance entirely over diversity. |
Decreasing lambda gives more mathematical priority to the diversity penalty section of the MMR formula, forcing varied information. |
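A minimal MMR implementation over plain Python lists shows the lambda effect directly: with a high lambda the second pick is the near-duplicate, and lowering lambda makes the diversity penalty win.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query, docs, k=2, lam=0.5):
    selected, remaining = [], list(range(len(docs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected),
                             default=0.0)
            # lam weights relevance; (1 - lam) weights the diversity penalty.
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[1.0, 0.0],      # highly relevant
        [0.99, 0.01],    # near-duplicate of the first
        [0.6, 0.8]]      # less relevant but distinct

high_lam = mmr(query, docs, k=2, lam=0.9)  # relevance-heavy: picks the duplicate
low_lam = mmr(query, docs, k=2, lam=0.3)   # diversity-heavy: picks the distinct doc
```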
48 |
Unit 1: RAG and Optimization |
Lec 4 |
Post-Retrieval |
If a user asks a broad question ("Features of VF8 Car") and wants comprehensive overall coverage, which Re-ranker is optimal? |
Medium |
1 |
C |
Flat Indexing |
Recursive Chunking |
Maximal Marginal Relevance (MMR) |
Simple Bi-Encoder similarity |
MMR guarantees diverse, non-redundant documents giving the LLM text detailing multiple broad vehicle features, not just repeated text about its engine. |
49 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What does GraphRAG combine to create a comprehensive knowledge representation system? |
Easy |
1 |
B |
Cloud storage and Edge devices |
Structured graph databases with vector-based retrieval |
Dense and Sparse chunking limits |
Hybrid APIs and NoSQL mappings |
GraphRAG merges structured graph DBs (like Neo4j) and vector retrieval. |
50 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What popular graph database is used for storing GraphRAG entities in the implementation example? |
Easy |
1 |
A |
Neo4j |
PostgreSQL |
ElasticSearch |
MongoDB |
Neo4j is utilized to construct and store the nodes and relationship graphs. |
51 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What is the purpose of Pydantic models in the implementation pipeline? |
Medium |
1 |
D |
To render the Neo4j visualization frontend. |
To manage API timeout failures. |
To download PDF files correctly. |
To enforce validation schemas for structured entity/relationship output from the LLM. |
Pydantic classes like `Commitment` and `Constraint` enforce the validation schema for the LLM's structured output. |
52 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
According to the implementation extraction rules, what constitutes a "commitment"? |
Medium |
1 |
C |
Simple definitions and jargon. |
Any sentence ending in a period. |
A clear promise, obligation, or prohibition found in the text. |
A numeric calculation executed by the CPU. |
The LLM is instructed to identify clear promises, obligations, or prohibitions as Commitments. |
53 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
How are measurable numeric limits inside obligations handled during extraction? |
Hard |
1 |
D |
They are discarded mathematically. |
They are summed together. |
They are sent to a calculator API. |
They are explicitly extracted as Constraint unit parameters. |
If a commitment contains numeric limits, the agent extracts them strictly as linked Constraints. |
54 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What does binding the Pydantic schema to the LLM's output accomplish in the pipeline? |
Medium |
1 |
A |
Forces the LLM to reply via JSON adhering precisely to the Pydantic schema class. |
Translates the output into Neo4j graph visualizations natively. |
Prevents the model from reading files. |
Outputs Python code running in a sandbox. |
It guarantees the unstructured text processed by the ChatGPT API is accurately deserialized back into structured Pydantic objects. |
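The schema-enforcement idea can be sketched with plain Pydantic; the class and field names below are illustrative assumptions, not the course's exact code.

```python
from pydantic import BaseModel, ValidationError

# Illustrative schema (names are assumptions): a commitment with optional
# numeric constraints, mirroring the extraction rules described above.
class Constraint(BaseModel):
    parameter: str
    value: float
    unit: str

class Commitment(BaseModel):
    text: str
    kind: str  # e.g. "promise", "obligation", or "prohibition"
    constraints: list[Constraint] = []

# A well-formed LLM JSON reply parses into typed objects...
raw = {
    "text": "Refunds must be issued within 14 days.",
    "kind": "obligation",
    "constraints": [{"parameter": "refund_window", "value": 14, "unit": "days"}],
}
commitment = Commitment.model_validate(raw)

# ...while a malformed reply fails loudly instead of silently corrupting the graph.
try:
    Commitment.model_validate({"kind": "obligation"})
    malformed_ok = True
except ValidationError:
    malformed_ok = False
```

Failing fast on malformed output is what makes the downstream Neo4j ingestion reliable: only validated entities ever reach the graph.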
55 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
In the designed graph schema, what do `PolicyClause` nodes represent? |
Easy |
1 |
C |
The user identities processing the data. |
The hardware metrics. |
The overarching policy topics/units from chunked texts. |
The exact numeric values from commitments. |
PolicyClause nodes store the actual chunked policy texts/topics serving as central nodes linking other entities. |
56 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
In Cypher (Neo4j), which operation ensures duplicate nodes are not created during ingestion? |
Medium |
1 |
B |
INSERT IGNORE |
MERGE |
UPSERT |
ADD DISTINCT |
Using the `MERGE` clause ensures Cypher matches an existing node or relationship instead of creating a duplicate. |
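An ingestion query in this style might look like the following; the node labels, property names, and the relationship type are illustrative assumptions, and in a real pipeline the string would be executed with the official `neo4j` Python driver.

```python
# Parameterized Cypher using MERGE: re-running ingestion matches existing
# nodes/relationships instead of creating duplicates. The PolicyClause /
# Stakeholder labels and the AFFECTED_BY relationship are illustrative.
ingest_clause = """
MERGE (c:PolicyClause {id: $clause_id})
SET c.text = $text
MERGE (s:Stakeholder {name: $stakeholder})
MERGE (s)-[:AFFECTED_BY]->(c)
"""

# With the neo4j driver this would run roughly as:
# session.run(ingest_clause, clause_id="refunds-1",
#             text="Refunds must be issued within 14 days.",
#             stakeholder="Customer")
```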
57 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
How are `Stakeholder` nodes connected to policy clauses in the graph? |
Hard |
1 |
A |
Via the |
Via a standalone |
Via |
They are completely unlinked. |
Stakeholder nodes reflect affected parties, mapped to the relevant `PolicyClause` nodes via a dedicated relationship. |
58 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What represents a distinct advantage of GraphRAG over standard vector similarity search? |
Medium |
1 |
B |
It consumes zero system memory. |
Relationships explicitly define how entities connect, solving queries needing context-aware traversal mapping. |
It requires no chunking. |
It automatically resolves grammatical mistakes. |
Graph traversal natively exposes how discrete entities explicitly connect, answering intricate logical queries that vector distances alone cannot deduce. |
59 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
Which LangChain module converts natural language into Cypher queries for the LLM? |
Medium |
1 |
A |
GraphCypherQAChain |
VectorDBQAChain |
PydanticOutputParser |
DocumentConverter |
`GraphCypherQAChain` translates natural-language questions into Cypher queries and runs them against the graph. |
60 |
Unit 1: RAG and Optimization |
Lec 5 |
GraphRAG |
What is noted as a core limitation or consideration when implementing GraphRAG? |
Medium |
1 |
D |
It deletes all prior indexes upon restart. |
It requires user authentication before every search. |
The LLM must be hosted locally. |
It relies heavily on specific types of structured data linking to form an effective knowledge base. |
GraphRAG's power originates strictly from highly structured data mappings; mapping unstructured erratic data yields poor relationships. |
LangGraph and Agentic AI Theory#
Final Exam#
No. |
Training Unit |
Lecture |
Training content |
Question |
Level |
Mark |
Answer |
Answer Option A |
Answer Option B |
Answer Option C |
Answer Option D |
Explanation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 |
LangGraph & Agentic AI |
Lec1 |
State Management |
What is the core field used for ALL input/output from nodes in a LangGraph State? |
Easy |
1 |
C |
|
|
|
|
The `messages` field serves as the single standardized channel for all node input/output in a LangGraph State. |
2 |
LangGraph & Agentic AI |
Lec1 |
State Management |
Which concept allows LangGraph to support complex workflows compared to standard LangChain chains? |
Easy |
1 |
B |
Linear flows only |
Cyclic flows and conditional routing |
Stateless operations |
Basic sequential pipelines |
Extends basic chains with cyclic flows and conditional routing for loops / complex logic. |
3 |
LangGraph & Agentic AI |
Lec1 |
State Management |
What is the role of the `add_messages` reducer in a LangGraph State? |
Easy |
1 |
A |
Appending new messages and handling deduplication |
Deleting old messages automatically |
Summarizing long conversations |
Replacing the current message list with a new one |
`add_messages` appends incoming messages to the existing list and deduplicates by message ID instead of overwriting. |
4 |
LangGraph & Agentic AI |
Lec1 |
State Management |
Which of the following is NOT a standard LangChain message type used in LangGraph? |
Easy |
1 |
D |
|
|
|
|
Standard types are `HumanMessage`, `AIMessage`, `SystemMessage`, and `ToolMessage`. |
5 |
LangGraph & Agentic AI |
Lec1 |
State Management |
In LangGraph's State structure, what should non-conversational context like |
Easy |
1 |
B |
Sent directly to the LLM response |
Storing configuration and metadata |
Replacing the standard message history |
Caching LLM tokens |
Context fields are meant for metadata and configuration, not standard I/O messages. |
6 |
LangGraph & Agentic AI |
Lec1 |
State Management |
Which object serves as the core director engine orchestrating LLM workflows in LangGraph? |
Easy |
1 |
D |
|
|
|
|
`StateGraph` is the director: nodes and edges are registered on it and compiled into a runnable workflow. |
7 |
LangGraph & Agentic AI |
Lec1 |
State Management |
How does LangGraph handle context injection before starting the graph execution? |
Medium |
1 |
C |
By loading it from an external JSON file automatically. |
By sending a special |
By initializing the state with context variables when calling `invoke()`. |
Context cannot be injected; the LLM must generate it. |
Context is provided to the graph's `invoke()` call as part of the initial state dictionary. |
8 |
LangGraph & Agentic AI |
Lec1 |
State Management |
When building a multi-agent system, how do different agents (nodes) share findings with one another in a messages-centric pattern? |
Medium |
1 |
A |
By appending messages to the shared `messages` state so subsequent agents can read them. |
By modifying the global |
By resetting the |
By sending direct peer-to-peer API calls bypassing the state. |
Agents append messages (tagged with the agent's name) to the shared state, letting every other node read prior findings. |
9 |
LangGraph & Agentic AI |
Lec1 |
State Management |
What is the primary purpose of adding nodes and edges to a `StateGraph`? |
Medium |
1 |
D |
To train a new deep learning model. |
To clean the data before input into a LangChain chain. |
To replace the standard LLM reasoning layers. |
To map out functions as nodes and execution paths as edges. |
Nodes represent functions/agents; edges dictate the workflow paths and conditionals. |
10 |
LangGraph & Agentic AI |
Lec1 |
State Management |
If an LLM node returns a new value for a `messages` field that has no reducer annotation, what happens to the State? |
Medium |
1 |
B |
It merges the new message safely. |
It overwrites the existing message list. |
It throws a syntax error. |
It drops the message entirely. |
Without a reducer like `add_messages`, LangGraph's default behavior replaces the field's old value with the new one. |
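The overwrite-vs-reduce difference can be sketched in plain Python; real LangGraph messages are objects, so the dicts with an `id` key below are a simplified stand-in.

```python
def overwrite(old, new):
    # Default behavior for a plain State field: the new value wins outright.
    return new

def add_messages_reducer(old, new):
    # Simplified append-with-dedup, mimicking add_messages: new messages are
    # appended, and a message re-sent with an existing id replaces the
    # original in place instead of duplicating it.
    merged = {m["id"]: m for m in old}
    for m in new:
        merged[m["id"]] = m
    return list(merged.values())

history = [{"id": "1", "content": "Hi"}]
update = [{"id": "2", "content": "Hello! How can I help?"}]
```

With the reducer the history grows to two messages; without it, the single-element update silently wipes the conversation.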
11 |
LangGraph & Agentic AI |
Lec1 |
State Management |
According to LangGraph Best Practices, why should conversational data (I/O) be kept strictly in |
Hard |
1 |
B |
Because LangChain parsers crash if state contains integers. |
It enables robust State Persistence (Checkpointers) which rely on deterministic, append-only message histories. |
It saves tokens directly since context fields are automatically hidden from the LLM. |
Context fields are only valid in the |
Checkpointers reconstruct and replay the state efficiently when conversational history relies on the standardized, append-only messages slice. |
12 |
LangGraph & Agentic AI |
Lec1 |
State Management |
How can conditional routing leverage the State to decide whether to call a tool or end the workflow? |
Hard |
1 |
A |
By inspecting the last message's `tool_calls` attribute in the State. |
By manually polling an external database at every node. |
By counting the number of characters in the previous |
By throwing an exception when the state is exhausted. |
The conditional edge function looks at the last message to see if the LLM populated `tool_calls`; if so it routes to the tool node, otherwise it ends. |
13 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
What does the ReAct pattern stand for in agentic workflows? |
Easy |
1 |
B |
Refresh and Activate |
Reason and Act |
Respond and Acknowledge |
Request and Action |
ReAct combines explicit reasoning (Think) before acting (Tool Use) in a loop. |
14 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
Why is a Multi-Expert pattern generally preferred over a single generic web search tool for complex research? |
Easy |
1 |
A |
It provides specialized domain knowledge and structured reasoning. |
It uses fewer tokens. |
It operates completely offline. |
It requires zero prompt engineering. |
Specialized LLMs acting as tools provide better domain insights and consistent reasoning. |
15 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
What is the purpose of the `ToolNode` in LangGraph? |
Easy |
1 |
D |
To prompt the LLM to generate code. |
To browse the internet using a headless browser. |
To compress message history. |
To automatically handle the parsing and execution of multiple tools. |
`ToolNode` reads `tool_calls` from the latest AI message, executes each requested tool, and appends the results. |
16 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
In a ReAct loop, what is the sequence of steps the coordinator LLM usually follows? |
Easy |
1 |
C |
Act \(\to\) Think \(\to\) Stop |
Observe \(\to\) Act \(\to\) Think |
Think \(\to\) Act \(\to\) Observe |
Stop \(\to\) Observe \(\to\) Think |
The standard ReAct loop is: Think (Reason), Act (Call Tool), Observe (Tool Result), and Repeat. |
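The Think → Act → Observe cycle, including the iteration guard discussed below, can be sketched with a scripted stand-in for the LLM (the dict-based "action" protocol here is an illustrative simplification, not LangGraph's API):

```python
def react_agent(question, llm, tools, max_iterations=5):
    """Minimal ReAct driver: Think -> Act -> Observe, with a loop guard."""
    messages = [("user", question)]
    for _ in range(max_iterations):          # guard against runaway loops
        action = llm(messages)               # Think: model reasons and decides
        if action["type"] == "final":
            return action["content"]
        result = tools[action["tool"]](action["input"])  # Act: run the tool
        messages.append(("tool", result))    # Observe: feed the result back
    return "Stopped: iteration limit reached"

def scripted_llm(messages):
    # Scripted stand-in for a real model: call the calculator once, then
    # answer with the observed tool result.
    if not any(role == "tool" for role, _ in messages):
        return {"type": "tool_call", "tool": "calculator", "input": "6*7"}
    return {"type": "final", "content": messages[-1][1]}

tools = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}  # demo only
answer = react_agent("What is 6*7?", scripted_llm, tools)
```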
17 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
What is a common way to prevent an agent from getting trapped in an infinite ReAct loop? |
Easy |
1 |
B |
Disabling all tools permanently. |
Adding an iteration counter to the State and checking it in the conditional edge. |
Forcing the LLM to answer in 10 words or less. |
Unplugging the server. |
Checking an iteration limit in the conditional edge is best practice to stop runaway loops. |
18 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
How do Multi-Expert Tools differ technically from standard external API tools (like web search) inside a LangGraph setup? |
Easy |
1 |
C |
They don't use the |
They execute JavaScript code. |
They are themselves LLM invocations with specialized system prompts. |
They bypass the |
Expert tools invoke another instance of an LLM primed with a specific expert persona. |
19 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
If an agent is deciding which expert to call during the "Act" phase, what enables the LLM to provide structured function calls automatically? |
Medium |
1 |
B |
Regular Expressions parsing. |
Using `bind_tools()` to attach the tools' schemas to the LLM. |
Writing manual JSON format instructions in the prompt. |
Training a custom fine-tuned router model. |
`bind_tools()` registers each tool's JSON schema with the model, enabling it to emit structured function calls. |
20 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
What is the main architectural upgrade introduced when adding a Planning Agent to a simple ReAct flow? |
Medium |
1 |
A |
The Coordinator is relieved of analyzing the userâs initial message; a separate Planner handles decomposition first. |
Tools are executed synchronously without LLM intervention. |
The agent switches to using a completely different model provider. |
State management is no longer required. |
A Planner separates the complex task of understanding and task decomposition from the execution/coordinator task. |
21 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
During the "Observe" phase of standard ReAct with LangGraph |
Medium |
1 |
D |
|
|
|
|
After executing a tool, the result is appended to the State as a `ToolMessage` for the LLM to observe in the next cycle. |
22 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
What happens if multiple expert tools are called simultaneously by the Coordinator LLM? |
Medium |
1 |
B |
They are ignored and skipped. |
The `ToolNode` executes every requested tool call and appends all results. |
The graph crashes due to a concurrency error. |
Only the first tool is executed. |
Modern models can return multiple tool calls at once, which `ToolNode` executes, appending one result message per call. |
23 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
In a robust production-ready Multi-Expert Research agent, how should tool execution failures be handled? |
Hard |
1 |
D |
By shutting down the LangGraph server. |
By letting the unhandled exception crash the application so developers can debug. |
By automatically switching model providers mid-workflow. |
By catching the exception inside the tool or custom node and returning a descriptive error message as the tool's output. |
Returning the error as a string message allows the Coordinator LLM to "Reason" about the failure and take alternative action. |
24 |
LangGraph & Agentic AI |
Lec2 |
Agentic Patterns |
Why does a Multi-Expert ReAct pattern consume significantly more tokens than a simple linear agent? |
Hard |
1 |
C |
Because it stores all memory in a vector database. |
Because LangGraph adds a large metadata overhead to every variable. |
The complete conversation history (`messages`) is re-sent to the LLM on every loop cycle, and each expert tool is itself an LLM call. |
Because expert LLMs generate longer responses to simple questions. |
In ReAct loops, the context window GROWS each cycle as new AI and tool messages accumulate in the history. |
25 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
What is the main difference between traditional LLM prompts and Tool Calling capabilities? |
Easy |
1 |
D |
Prompts use more tokens. |
Tool Calling avoids external APIs. |
Tool Calling is only available in open-source models. |
Tool Calling enables the model to issue structured JSON parameters to invoke external code automatically. |
Structural return formats from the LLM via defined JSON schemes is the core innovation in Tool Calling. |
26 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
Which terminology specifically refers to OpenAIâs native API parameter for passing a JSON schema? |
Easy |
1 |
A |
Function Calling |
|
|
|
OpenAI specifically categorizes the schema object passing under "Function Calling." |
27 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
Which python decorator is used in LangChain to easily convert a standard Python function into a Tool? |
Easy |
1 |
C |
|
|
`@tool` |
|
The `@tool` decorator wraps a plain Python function, inferring the tool's name, description, and argument schema. |
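A simplified, stdlib-only sketch of what a decorator in the spirit of LangChain's `@tool` derives from a plain function (the attribute names below mirror the concept, not LangChain's exact internals):

```python
import inspect

def tool(fn):
    # Derive what an LLM needs to call the function: a name, a description
    # (the docstring), and a parameter schema for structured JSON arguments.
    sig = inspect.signature(fn)
    fn.name = fn.__name__
    fn.description = inspect.getdoc(fn) or ""
    fn.args_schema = {name: param.annotation.__name__
                      for name, param in sig.parameters.items()}
    return fn

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"
```

The derived name/description/schema is exactly the metadata the model receives, which is why detailed docstrings matter so much for tool selection.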
28 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
What makes Tavily Search specifically optimized for AI applications compared to standard generic web search APIs? |
Easy |
1 |
B |
It is slower but cheaper. |
It pre-formats results for LLMs, filters noise, and provides context for RAG. |
It only searches Wikipedia. |
It bypasses the internet using a local database. |
Tavily removes clutter (HTML/Ads) and extracts clean content structured for immediate LLM context window ingestion. |
29 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
What is a common best practice regarding Tool Descriptions in the code? |
Easy |
1 |
A |
They should be highly detailed so the LLM knows exactly when and how to call the tool. |
They are ignored by the LLM, so they can be left blank. |
They must be written in JSON. |
They should be under 5 words to save tokens. |
High-quality descriptions help the model "Reason" appropriately about when the tool is useful. |
30 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
What is "Tool Chaining"? |
Easy |
1 |
D |
Storing tool outputs in a blockchain. |
Running the same tool 100 times to check consistency. |
Restricting tool execution to an administrator. |
Using the output of one tool as the direct input argument for another tool recursively. |
A common pattern is having one toolâs result guide the parameter execution of the next tool (like extracting a company name, then passing a stock ticker to a finance tool). |
31 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
How should developers securely manage API keys (like |
Medium |
1 |
B |
Hardcoding them at the top of the python script. |
Using Environment Variables or a Secret Management service (like Azure KeyVault). |
Passing them directly inside the user prompt. |
Storing them inside the |
Best practices strongly dictate loading secrets via ENV variables (e.g. with `os.getenv`) or a managed secret store, never hardcoding them. |
32 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
When handling tool execution errors (such as network timeouts or API failures), what is the recommended fallback strategy? |
Medium |
1 |
C |
Raising a fatal exception to stop the script immediately. |
Silently ignoring the error and proceeding with an empty string. |
Catching the exception and returning a descriptive error string as the tool result so the agent can react. |
Switching to an older language model automatically. |
Returning the exception as a string in the tool's output lets the LLM see the failure and choose an alternative action. |
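This pattern can be sketched as a generic wrapper (the wrapper name and error-message wording are illustrative):

```python
def safe_tool(fn):
    """Wrap a tool so failures become text the LLM can reason about."""
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Surface the failure conversationally instead of crashing the graph.
            return f"Tool error: {exc}. Try different arguments or another tool."
    return wrapper

@safe_tool
def flaky_api(query):
    # Stand-in for a real external call that times out.
    raise TimeoutError("network timeout")
```

The agent sees "Tool error: network timeout ..." as an ordinary observation and can retry or pick another tool, instead of the whole run aborting.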
33 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
What optimization technique can significantly reduce duplicate external API calls from tools? |
Medium |
1 |
A |
Implementing a caching layer (e.g. an in-memory cache such as `functools.lru_cache`) for recent tool queries. |
Disabling the |
Limiting the LLM to 1 iteration entirely. |
Removing the system prompt. |
Caching recent tool queries locally drastically saves external latency and cost for repeated inquiries. |
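A minimal version of this caching layer using the standard library (the `web_search` function is a stand-in for a real external API call):

```python
from functools import lru_cache

call_count = {"n": 0}

@lru_cache(maxsize=128)
def web_search(query):
    # Stand-in for an external API call (e.g. a paid search endpoint);
    # lru_cache ensures repeated identical queries never hit it twice.
    call_count["n"] += 1
    return f"results for {query}"

first = web_search("langgraph checkpointers")
second = web_search("langgraph checkpointers")  # served from the cache
```

Note `lru_cache` keys on the exact arguments, so normalizing queries (lowercasing, stripping whitespace) before the call further improves the hit rate.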
34 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
If you want to use a Custom Tool class in LangChain instead of a decorator, which base class must you inherit from? |
Medium |
1 |
D |
|
|
|
`BaseTool` |
Class-based tools need to inherit from `BaseTool` and implement its execution method. |
35 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
How does the Tavily API's `search_depth="advanced"` setting change search behavior? |
Hard |
1 |
C |
It executes SQL queries on the backend instead. |
It forces the agent to ask the user permission. |
It performs a multi-step semantic search to extract comprehensive answers rather than returning simple link snippets. |
It parses local PDF files instead of the web. |
Advanced depth leverages an AI sub-agent during search to synthesize answers and return higher-quality textual analysis. |
36 |
LangGraph & Agentic AI |
Lec3 |
Tool Calling |
When building an architecture where an Orchestrator routes tasks, why would you implement a specific "Web Search Agent" rather than just giving the generic tools directly to the primary assistant? |
Hard |
1 |
B |
Because the primary assistant cannot accept tools format APIs. |
To separate concerns: a specialized agent can execute multi-step tool queries recursively without overloading the main router's prompt context. |
Because Tavily Search restricts execution to sub-nodes by design. |
Web Search agents use zero tokens. |
Sub-agents handle the cognitive load of browsing, reading snippets, and re-searching autonomously, returning only polished synthesis to the main router. |
37 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
What is the main structural advantage of a Hierarchical (Supervisor) multi-agent system? |
Easy |
1 |
A |
A Primary Assistant coordinates user intent and cleanly routes requests to specialized sub-agents. |
Every agent talks to every other agent at the same time. |
It prevents the use of external APIs. |
It runs on a single linear LangChain pipeline. |
Supervisors manage the workflow orchestration cleanly while sub-agents handle specific deep domains. |
38 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
Why would a system designer choose multi-agent architectures over a single sophisticated LLM? |
Easy |
1 |
C |
Single LLMs cannot use Python code. |
A single LLM always hallucinates. |
It promotes specialization, modularity, parallel processing, and avoids prompt overloading. |
Multi-agent systems guarantee faster latency in all scenarios. |
Splitting into separate specialized models (e.g., Architect, Coder, Reviewer) improves accuracy and creates maintainable codebases. |
39 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
What does a Network (Peer-to-Peer) coordination pattern imply? |
Easy |
1 |
C |
Agents are executed manually by humans. |
All agents must report back to a supervisor before interacting. |
Agents can communicate with each other directly without central supervision. |
It is a centralized routing protocol. |
Unlike supervisors, peer-to-peer agents message each other directly to resolve tasks. |
40 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
In a Hierarchical system, how does a Sub-Agent signal that its task is complete and it wishes to return control to the Primary Assistant? |
Easy |
1 |
D |
By crashing the program. |
By calling the end user via SMS. |
By erasing the shared stateâs message list. |
By executing a "CompleteOrEscalate" tool call, signaling the workflow to pop the dialog stack. |
The common pattern relies on returning a specific signal (like a `CompleteOrEscalate` tool call) that routing logic interprets as a hand-back. |
41 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
In multi-agent LangGraph architectures, what prevents agents from losing the overarching conversation context? |
Easy |
1 |
B |
They read the local filesystem. |
They all read and append to a centralized shared `messages` state. |
The developer manually pastes the JSON transcript into each prompt. |
They query a vector database at every step. |
A shared TypedDict State containing the full `messages` history keeps every agent aligned on the conversation. |
42 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
What is the purpose of the `dialog_state` stack in a hierarchical multi-agent State? |
Easy |
1 |
A |
To push and pop agent identifiers corresponding to the current active agent in the conversation tree. |
To log errors to a debugging console. |
To translate different languages. |
To count the number of LLM tokens used. |
The dialog stack (`dialog_state`) tracks which specialized agent currently owns the conversation, supporting nested hand-offs. |
43 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
What is "Context Injection" referring to in multi-agent tool execution? |
Medium |
1 |
D |
Injecting system prompts into the vector database. |
Overriding the userâs internet connection. |
Re-training the model mid-conversation. |
Automatically supplying known session metadata (like a user ID) from the State into tool calls without asking the user. |
Context fields defined in the State are injected into tool invocations automatically rather than requested from the user. |
44 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
How do routing functions (conditional edges) decide to shift execution from the Primary Assistant to a designated Sub-Agent? |
Medium |
1 |
C |
The user types "Route" in the chat window. |
A random hash evaluates to true. |
By inspecting the `tool_calls` in the Assistant's latest message to see which specialized workflow it requested. |
They execute raw SQL queries tracking agent status. |
Standard routers look at the Assistant's final `tool_calls` to determine which specialized sub-agent to enter. |
45 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
Why might an agentic architecture include an âEntry Nodeâ when transitioning to a child agent? |
Medium |
1 |
B |
To charge the user additional credits. |
To silently append a `ToolMessage` confirming the hand-off and giving the incoming sub-agent its localized instructions. |
To block external api requests permanently. |
To delete previous session checkpoints. |
Entry nodes serve as a trampoline, providing localized instructions to the incoming sub-agent without confusing the Primary Assistant's prompt. |
46 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
During multi-agent fallback, what happens when a tool execution fails inside an agent's subgraph? |
Medium |
1 |
A |
A custom fallback handler catches the exception and writes it into state as an error message the agent can react to. |
The |
The system crashes. |
It switches out the open-source LLM for an OpenAI model. |
A structured fallback catcher prevents silent failures or crashes and turns exceptions into conversational events the agent can rectify. |
47 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
In a highly complex Competitive multi-agent arrangement, how do agents ultimately converge on a single answer? |
Hard |
1 |
C |
They execute a random dice roll. |
The graph hangs infinitely until restarted. |
A separate Evaluator/Synthesizer agent compares the outputs of all competing agents and selects or merges the best response into the final message. |
Only the agent that responds first is recorded in state. |
Competitive architectures require downstream synthesis nodes that "Observe" multiple paths and judge the optimal conclusion analytically. |
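The synthesis step described above can be sketched in plain Python. The length-based scorer below is a hypothetical stand-in for a real LLM judge; in practice the Evaluator agent would score candidates with a model call:

```python
def synthesize(candidates, score):
    # Evaluator/Synthesizer node: compare the competing agents' outputs
    # and select the best one according to a scoring function.
    return max(candidates, key=score)

# Hypothetical scorer: prefer the most detailed answer. A production
# evaluator would typically use an LLM judge instead of len().
answers = [
    "Paris.",
    "The capital of France is Paris, located on the Seine.",
]
best = synthesize(answers, score=len)
print(best)  # The capital of France is Paris, located on the Seine.
```

A merge-style synthesizer would instead combine the candidates into one response rather than picking a single winner.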
48 |
LangGraph & Agentic AI |
Lec4 |
Multi-Agent Collab |
Consider the structure: |
Hard |
1 |
B |
It adds a third string to the stack. |
It returns the list to its previous state by removing the last element. |
It deletes the entire stack. |
It loops infinitely within the subgraph. |
The custom reducer pops the last active element off the stack when a "pop" update is received, handing control back to the parent assistant. |
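A reducer with this pop behavior can be sketched as follows; the field and function names are illustrative, modeled on the stack pattern described in the questions above:

```python
from typing import Annotated, Optional, TypedDict

def update_dialog_stack(left: list, right: Optional[str]) -> list:
    # Reducer for the dialog stack: push a new agent id, or pop the
    # last active agent when the update value is "pop".
    if right is None:
        return left
    if right == "pop":
        return left[:-1]
    return left + [right]

class State(TypedDict):
    dialog_state: Annotated[list, update_dialog_stack]

# Simulating the reducer directly:
stack = update_dialog_stack([], "primary_assistant")
stack = update_dialog_stack(stack, "flight_agent")
stack = update_dialog_stack(stack, "pop")   # sub-agent exits
print(stack)  # ['primary_assistant']
```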
49 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
Why is a "Human-in-the-Loop" (HITL) step strongly recommended for applications performing financial transactions? |
Easy |
1 |
A |
They involve irreversible critical actions that require human oversight to prevent costly AI mistakes. |
It accelerates the transaction speed natively. |
Models cannot do math. |
HITL is an obsolete pattern replaced by GPT-4. |
Financial transactions are high-stakes operations requiring human intervention and compliance audit trails before final execution. |
50 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
In LangGraph, what prevents all computation from being lost when an agent pauses to wait for human input? |
Easy |
1 |
C |
Writing logs to a simple text file. |
LangChain's built-in |
LangGraph's native Checkpointing mechanism (e.g., `MemorySaver`), which snapshots the graph state at each step. |
Caching the prompt on the client side. |
Checkpointers serialize the exact state graph, allowing it to rest safely in memory or DB until resumed. |
51 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
How does passing `interrupt_before=["node_name"]` when compiling a graph affect execution? |
Easy |
1 |
B |
It forces the node to timeout after 3 seconds. |
It suspends execution right before the specified node executes, returning control back to the application. |
It skips the node altogether. |
It triggers an infinite loop of human questions. |
`interrupt_before` pauses the run just before the listed node, checkpoints the state, and returns control so a human can inspect or approve before resuming. |
52 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
What is the main drawback of using an in-memory checkpointer such as `MemorySaver`? |
Easy |
1 |
D |
It requires setting up a massive cluster. |
It runs too slowly for modern models. |
It writes to a file that fills up the hard drive instantly. |
Checkpoints disappear completely when the Python process exits or the server restarts. |
In-memory checkpoints are not persistent: anything not written to durable storage is lost on process restart. |
53 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
Which checkpointer is recommended for a scalable, production-grade distributed LangGraph service? |
Easy |
1 |
C |
|
|
|
|
|
54 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
How does LangGraph distinguish parallel user conversations hitting the same graph application simultaneously? |
Easy |
1 |
B |
By creating separate python processes. |
By assigning each conversation a unique `thread_id` in the invocation config. |
By deleting the older usersâ conversations. |
By using separate API keys. |
The `thread_id` passed in the config keys each conversation's checkpoints, so parallel users never collide. |
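The isolation mechanism amounts to keying saved state by thread id. A minimal sketch (a toy store, not the real checkpointer API):

```python
class ThreadScopedStore:
    # Minimal sketch of thread-scoped persistence: every read/write is
    # keyed by the conversation's thread_id, so users never collide.
    def __init__(self):
        self._states = {}

    def save(self, config, state):
        self._states[config["configurable"]["thread_id"]] = dict(state)

    def load(self, config):
        return self._states.get(config["configurable"]["thread_id"], {})

store = ThreadScopedStore()
store.save({"configurable": {"thread_id": "user-a"}}, {"messages": ["hi"]})
store.save({"configurable": {"thread_id": "user-b"}}, {"messages": ["hello"]})
print(store.load({"configurable": {"thread_id": "user-a"}}))  # {'messages': ['hi']}
```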
55 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
What information does LangGraph's `get_state_history()` expose for a thread? |
Medium |
1 |
A |
A complete historical log of all checkpointed states, parent markers, and metadata modifications across a conversation. |
Only the very first checkpoint. |
The system prompt token usage. |
Live streaming characters from the LLM. |
Pulling state history allows time-travel debugging and viewing the explicit step-by-step data modification over the thread's lifespan. |
56 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
Given a graph paused before a "Publishing" node, what code pattern can update the state manually, say, switching a status field, before resuming? |
Medium |
1 |
C |
|
Modifying the global variables inside the python script. |
Calling `graph.update_state(config, {...})` on the paused thread. |
Redefining the TypedDict. |
`update_state` writes a new checkpoint containing the patched values, and the graph resumes from that modified state. |
57 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
Why would a multi-agent framework require separate short-term Checkpointers vs explicit long-term external vector databases? |
Medium |
1 |
D |
Because LangChain deprecates long-term storage natively. |
Short-term databases always truncate after 1 megabyte. |
To prevent open-source models from scraping data. |
Checkpointers handle immediate conversational state securely per thread, while Vector stores aggregate historical knowledge and profiles persistently across unrelated sessions. |
Checkpointers = Thread-scoped conversational state. VectorDB = Global user-scoped background context fetching. |
58 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
How does the `SqliteSaver` handle a state update applied to an earlier checkpoint? |
Medium |
1 |
B |
It overwrites the database completely. |
It creates a new checkpoint whose parent is the selected historical snapshot, leaving history intact. |
It throws a primary key error. |
It switches back to |
The DB schema retains parent-child snapshot ID graphs, effectively allowing true non-destructive time travel. |
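Non-destructive forking can be illustrated with a toy checkpoint store that keeps parent links. This is a hypothetical sketch of the idea, not the real `SqliteSaver` schema:

```python
import copy
import itertools

class CheckpointStore:
    # Toy store keeping every snapshot with a parent pointer, so updates
    # fork new children instead of overwriting history.
    def __init__(self):
        self._ids = itertools.count(1)
        self.checkpoints = {}   # id -> {"parent": id or None, "values": dict}
        self.head = {}          # thread_id -> latest checkpoint id

    def put(self, thread_id, values, parent=None):
        cid = next(self._ids)
        self.checkpoints[cid] = {"parent": parent, "values": copy.deepcopy(values)}
        self.head[thread_id] = cid
        return cid

    def update_state(self, thread_id, patch):
        parent = self.head[thread_id]
        merged = {**self.checkpoints[parent]["values"], **patch}
        return self.put(thread_id, merged, parent=parent)

store = CheckpointStore()
c1 = store.put("t1", {"status": "draft"})
c2 = store.update_state("t1", {"status": "approved"})
# The original snapshot survives, enabling time travel back to c1.
print(store.checkpoints[c1]["values"]["status"], store.checkpoints[c2]["parent"] == c1)
```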
59 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
If an agent architecture has a manual Node simulating an "As-Node" state update (`graph.update_state(config, values, as_node="node_name")`), how does the graph behave afterwards? |
Hard |
1 |
C |
The app skips ahead 10 checkpoints automatically. |
The update is discarded silently because the node was skipped. |
It behaves as if the actual named node had produced the update, so downstream routing proceeds from that point. |
The agent loops forever. |
`as_node` attributes the manual write to the named node, so conditional edges route exactly as if that node had just run. |
60 |
LangGraph & Agentic AI |
Lec5 |
Human-in-the-Loop |
In a scenario where an AI is suggesting medical treatment protocols, how might `interrupt_after` be applied most safely? |
Hard |
1 |
A |
Pausing after the recommendation node completes, so the full proposal is materialized in state for a doctor to review before any execution step. |
Halting the system if the internet disconnects. |
Interrupting the LLM mid-token generation. |
Making the LLM stream results to a text-to-speech engine. |
This allows the state to fully materialize the AI's proposal, giving the human doctor a complete object to assess before continuing. |
LLMOps and Evaluation Theory#
LLMOps and Evaluation Question Bank#
No. |
Training Unit |
Lecture |
Training content |
Question |
Level |
Mark |
Answer |
Answer Option A |
Answer Option B |
Answer Option C |
Answer Option D |
Explanation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
What does the Faithfulness metric measure in RAGAS? |
Easy |
1 |
A |
The truthfulness of the generated answer compared to the retrieved context |
The relevance of the answer to the original question |
The accuracy of the ranking of contexts |
The coverage of the retrieval process |
Faithfulness checks if all statements in the answer can be supported by the retrieved context, avoiding hallucinations. |
2 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Which LLM framework is RAGAS designed to evaluate? |
Easy |
1 |
B |
Agents |
RAG systems |
Fine-tuned models |
Traditional Search Engines |
Ragas is an automated evaluation framework designed specifically for RAG systems. |
3 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
How much manual data annotation do you need when using RAGAS? |
Easy |
1 |
C |
Large scale human annotations |
Only expert domain knowledge |
Nothing, it uses LLMs like GPT-4 to automate evaluation |
Both standard Q&A pairs and ranking queries |
Unlike traditional methods, Ragas uses LLMs to automate the evaluation process without needing heavy human annotations. |
4 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Which dimension is measured by Context Precision? |
Easy |
1 |
C |
Quality of generation |
Semantic similarity to the user query |
Accuracy of the retrieval process |
Coverage of expected facts |
Context Precision measures the accuracy of the retrieval process by assessing the ranking of contexts. |
5 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
What is the main purpose of Answer Relevancy? |
Easy |
1 |
D |
Fact-checking the answer |
Verifying truthfulness |
Guaranteeing context coverage |
Measuring relevance between answer and original question |
It evaluates the relevance between the answer and question to confirm it addresses the problem asked. |
6 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
What value range do Ragas metrics return? |
Easy |
1 |
B |
0 to 100 |
0 to 1 |
-1 to 1 |
1 to 5 |
Each metric gives a value from 0 to 1, with higher values indicating better quality. |
7 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Which metric evaluates if relevant chunks are ranked high in retrieved contexts? |
Easy |
1 |
C |
Faithfulness |
Context Recall |
Context Precision |
Answer Relevancy |
Context Precision checks if relevant chunks are ranked high in the list of retrieved contexts. |
8 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
How many main metrics are covered in the RAGAS documentation? |
Easy |
1 |
A |
4 |
5 |
3 |
6 |
The four main metrics are faithfulness, answer relevancy, context precision, and context recall. |
9 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
If Context Recall is 0, what does that indicate? |
Easy |
1 |
A |
Retriever failed to find necessary context |
Rank 1 is an irrelevant context |
LLM generated hallucination |
The answer is irrelevant to the query |
It indicates the retriever failed to find context containing necessary information to answer the question. |
10 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Which two metrics evaluate "retrieval" performance? |
Easy |
1 |
B |
Faithfulness & Answer Relevancy |
Context Precision & Context Recall |
Answer Relevancy & Context Recall |
Context Precision & Faithfulness |
Context precision and context recall evaluate retrieval performance. |
11 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Describe the calculation process for Faithfulness in Ragas. |
Medium |
2 |
A |
Decompose answer to statements, verify against context, calculate ratio |
Generate questions, embed them, calculate cosine similarity |
Determine context relevance, calculate Precision@k, aggregate |
Decompose reference answer, verify if inferences exist in retrieved context |
The process is: Decomposition (claims), Verification (checked against context), and Scoring (ratio). |
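The scoring step reduces to a simple ratio over the verification verdicts. A minimal sketch; in practice the decomposition into claims and the verification against context are both done by an LLM:

```python
def faithfulness_score(verdicts):
    # verdicts: one boolean per claim extracted from the answer,
    # True if the claim is supported by the retrieved context.
    return sum(verdicts) / len(verdicts)

# An answer decomposed into 3 claims, of which 2 are supported:
print(round(faithfulness_score([True, True, False]), 2))  # 0.67
```

This matches the worked example later in this bank: 2 verified claims out of 3 gives roughly 0.67.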
12 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
How does Answer Relevancy determine its score technically? |
Medium |
2 |
C |
By classifying the answer using a trained classifier |
By matching keywords between answer and question |
By reverse-engineering questions from answer and calculating embedding cosine similarity |
By comparing the character count of answer vs question |
LLM generates N questions from the given answer, converts them to embeddings, and compares cosine similarity with the original question. |
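Numerically, the score is the mean cosine similarity between the embedding of the original question and the embeddings of the N questions reverse-engineered from the answer. A stdlib sketch with toy 2-d embeddings standing in for a real embedding model:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def answer_relevancy(question_emb, generated_question_embs):
    # Mean cosine similarity between the original question embedding and
    # embeddings of the N questions generated back from the answer.
    sims = [cosine(question_emb, g) for g in generated_question_embs]
    return sum(sims) / len(sims)

original = [1.0, 0.0]
generated = [[1.0, 0.0], [0.0, 1.0]]   # one aligned, one orthogonal question
print(answer_relevancy(original, generated))  # 0.5
```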
13 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
A low Context Recall score means what in terms of information availability? |
Medium |
2 |
D |
The information is hallucinated |
The answer has redundant information |
The retrieved information is scattered |
The necessary facts from the reference answer are missing in the retrieved contexts |
It means the necessary information from the reference answer was not found in the retrieved contexts. |
14 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
In Context Precision calculation, what is \(v_k\)? |
Medium |
2 |
C |
Velocity of retrieval |
Volume of chunks |
Relevance indicator at position k |
Value of cosine similarity |
\(v_k \in \{0, 1\}\) is the relevance indicator at position k. |
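With \(v_k\) as the 0/1 relevance indicator, Context Precision averages Precision@k over the relevant positions. A sketch of that computation:

```python
def context_precision(relevance):
    # relevance: list of 0/1 indicators v_k for each ranked context chunk.
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    score = 0.0
    hits = 0
    for k, v_k in enumerate(relevance, start=1):
        hits += v_k
        score += (hits / k) * v_k   # Precision@k counted only at relevant ranks
    return score / total_relevant

# Relevant chunks at ranks 1 and 3: (1/1 + 2/3) / 2
print(round(context_precision([1, 0, 1]), 2))  # 0.83
```

Note how a relevant chunk buried at rank 3 drags the score below 1.0, which is exactly why the metric rewards good ranking.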
15 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Why might an answer score high in Faithfulness but low in Answer Relevancy? |
Medium |
2 |
B |
The answer is hallucinated but relevant |
The answer is entirely true based on context but fails to address the userâs specific question |
The retriever brought back poor context |
The context precision is very low |
It can be completely faithful to retrieved context, but that context (and answer) might not be what the user asked for. |
16 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Why is Faithfulness strictly compared to retrieved context and not world knowledge? |
Medium |
2 |
A |
To prevent LLM hallucinations from being counted as correct if the retriever failed |
Ragas has no access to world knowledge |
The LLM doesnât know facts |
World knowledge costs more tokens |
RAGâs core value is grounding generation on specific private/provided context, so it measures adherence to that context only to prevent unaccounted hallucinations. |
17 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
If LLM splits an answer into 3 statements, and only 2 are verified in context, Faithfulness is? |
Medium |
2 |
B |
0.5 |
0.67 |
0.33 |
1.0 |
Faithfulness relies on the ratio of correct statements: 2 out of 3 makes it ~0.67. |
18 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Given a scenario where a user asks about Einsteinâs death, but the context only contains his birth, and the LLM answers âEinstein died in 1955â using its internal knowledge. What are the RAGAS metric implications? |
Hard |
3 |
B |
High Faithfulness, Low Answer Relevancy |
Low Faithfulness, High Answer Relevancy |
Low Faithfulness, Low Context Recall |
High Context Precision, High Context Recall |
It answers the user (High Relevancy), but the claim isnât in context, making Faithfulness low. |
19 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
To improve Context Precision in a RAG pipeline, what architecture modification would you introduce? |
Hard |
3 |
C |
Increase LLM temperature |
Swap FAISS for ChromaDB |
Add a Cross-encoder reranking step |
Generate multiple answers and average them |
Reranking specifically improves the order/ranking of retrieved chunks, heavily impacting Context Precision metrics. |
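The reranking idea can be sketched as a reordering pass over retrieved chunks. The term-overlap scorer below is a hypothetical stand-in for a real cross-encoder model, which would score each (query, chunk) pair jointly:

```python
def rerank(query, chunks, cross_score):
    # Reorder retrieved chunks so the most relevant rank first,
    # which directly lifts Context Precision.
    return sorted(chunks, key=lambda c: cross_score(query, c), reverse=True)

# Hypothetical scorer: term overlap stands in for a real cross-encoder.
def overlap(query, chunk):
    return len(set(query.lower().split()) & set(chunk.lower().split()))

chunks = [
    "general shipping information",
    "Einstein was born in 1879",
    "refund policy for damaged goods",
]
print(rerank("refund policy", chunks, overlap)[0])  # refund policy for damaged goods
```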
20 |
Unit 1: LLMOps |
Lec2 |
RAGAS Metrics |
Detail the mathematical rationale behind using N reverse-engineered questions for calculating Answer Relevancy. |
Hard |
3 |
A |
Averages out the stochastic nature of LLMs generating questions to provide a stable semantic similarity |
It is required to satisfy vector dimensions |
One question uses up too few tokens |
N acts as a padding token for embeddings |
Generating N questions and averaging their cosine similarities mitigates the variance inherent in LLM generation, ensuring a robust relevancy score. |
21 |
Unit 2: Observability |
Lec6 |
Observability Concepts |
What is Observability in the context of LLM applications? |
Easy |
1 |
A |
The ability to track flows, errors and costs of LLM apps acting as black boxes |
A library for generating UI code |
A vector database |
The algorithm used for chunking texts |
It tracks probabilistic components acting as black boxes, aiding in tracing, tracking costs, and debugging. |
22 |
Unit 2: Observability |
Lec6 |
LangFuse Basics |
Which of these tools is known for being Open Source? |
Easy |
1 |
B |
LangChain |
LangFuse |
LangSmith |
OpenAI |
LangFuse is a popular open-source tool focusing on engineering observability. |
23 |
Unit 2: Observability |
Lec6 |
Observability Challenges |
What makes LLM applications harder to debug than traditional software? |
Easy |
1 |
C |
They use more memory |
They require internet connections |
They involve probabilistic, non-deterministic components |
They use Python |
Traditional software is deterministic: give input, get output. LLMs act as probabilistic black boxes. |
24 |
Unit 2: Observability |
Lec6 |
LangSmith Basics |
Who built LangSmith? |
Easy |
1 |
A |
The LangChain Team |
OpenAI |
Meta |
|
LangSmith is built by the LangChain team for native integration. |
25 |
Unit 2: Observability |
Lec6 |
LangFuse Integration |
In LangFuse, what is used to automatically instrument LangChain chains code? |
Easy |
1 |
C |
System.out.println |
VectorEmbeddings |
CallbackHandler |
FAISS |
LangFuse provides a CallbackHandler that automatically instruments chains. |
26 |
Unit 2: Observability |
Lec6 |
Prompt Management |
Why should you manage prompts in a tool like LangFuse instead of hardcoding in Git? |
Easy |
1 |
A |
To allow non-engineers to tweak them |
Because Git is too slow |
Because Git charges per token |
To hide prompts from developers |
It acts as a CMS for prompts so non-engineers can comfortably inspect and tweak them. |
27 |
Unit 2: Observability |
Lec6 |
Setup |
How do you typically enable LangSmith auto-tracing in a LangChain project? |
Easy |
1 |
D |
Rewrite all code to use LangSmith classes |
Contact support to enable it |
Import |
Just set environment variables |
LangSmith tracing is nearly transparent; you often don't need code changes, just environment variables. |
28 |
Unit 2: Observability |
Lec6 |
Production Best Practices |
What is the recommended tracing sampling rate for Production environments? |
Easy |
1 |
C |
100% |
50% |
1-5% of traffic |
None |
In production, tracing every request is noisy and expensive, so 1-5% or high importance traces are recommended. |
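Head-based sampling at 1-5% can be as simple as a random gate evaluated once per request. A sketch under that assumption; observability platforms typically also offer built-in sampling controls:

```python
import random

SAMPLE_RATE = 0.05  # trace ~5% of production requests

def should_trace(rng=random):
    # Decide per request whether to emit a full trace.
    return rng.random() < SAMPLE_RATE

rng = random.Random(42)                      # seeded for reproducibility
sampled = sum(should_trace(rng) for _ in range(100_000))
print(0.03 < sampled / 100_000 < 0.07)       # True: close to the 5% target
```

High-importance requests (errors, flagged users) are usually traced at 100% regardless of the sample rate.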
29 |
Unit 2: Observability |
Lec6 |
Privacy |
How should you handle PII data privacy before logging to a cloud observability tool? |
Easy |
1 |
B |
Do nothing |
Run PII Masking/Redaction functions |
Encrypt with simple base64 |
Delete all logs |
Never log sensitive data; run PII Masking or use enterprise redacting features. |
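A minimal redaction pass might look like the following. The regex patterns are hypothetical and cover only emails and US-style phone numbers; production systems use dedicated PII/NER tooling:

```python
import re

def mask_pii(text):
    # Replace emails and simple phone numbers before shipping logs/traces.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
    return text

print(mask_pii("Contact jane.doe@example.com or 555-123-4567"))
# Contact [EMAIL] or [PHONE]
```

The key architectural point is that masking runs in your application, before any trace payload leaves the secure perimeter.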
30 |
Unit 2: Observability |
Lec6 |
Alerts |
What is an example of a good alert to set up in observability? |
Easy |
1 |
A |
Error Rate Spike > 10% in 5 min |
âHello Worldâ printed |
CPU temperature |
Single user logged out |
You should alert on things like Error Rate > 10%, Latency Spikes, or Cost Anomalies. |
31 |
Unit 2: Observability |
Lec6 |
LangFuse vs LangSmith |
If self-hosting data privacy is an absolute requirement and budget is zero, which tool is recommended? |
Medium |
2 |
C |
Weights & Biases |
LangSmith |
LangFuse |
CloudWatch |
LangFuse is Open Source (MIT) and offers easy self-hosting (Docker Compose) for free. |
32 |
Unit 2: Observability |
Lec6 |
LangSmith Playground |
What is the "Playground: Edit and Re-run" feature in LangSmith useful for? |
Medium |
2 |
A |
You can take a failed production trace, change the prompt, and test a fix immediately |
Training new models |
Deploying code to AWS |
Chatting with other developers |
It allows you to take failed real-world traces and edit prompts/parameters to instantly see if the issue resolves. |
33 |
Unit 2: Observability |
Lec6 |
Latency Debugging |
If a RAG request takes 10 seconds, how does tracing help? |
Medium |
2 |
B |
It makes the query faster |
It breaks down the latency per component (e.g., Vector DB vs API completion) |
It charges the user for the wait time |
It cancels requests longer than 5 seconds |
Tracing visualizes the execution flow, pinpointing exactly which step (Vector Search vs Generate) is the bottleneck. |
34 |
Unit 2: Observability |
Lec6 |
Cost Tracking |
Why is Cost Tracking a critical feature in LLM Observability compared to traditional app monitoring? |
Medium |
2 |
D |
Because AWS charges are cheap |
Because you donât need servers |
Because LLMs donât cost real money |
Because LLM API calls are charged per-token and single runaway loops can cost hundreds of dollars quickly |
API calls are expensive, requiring real-time tracking to prevent unmanaged financial overruns. |
35 |
Unit 2: Observability |
Lec6 |
Langchain Integration |
What environment variable activates LangSmith tracing? |
Medium |
2 |
B |
LANGCHAIN_DEBUG=1 |
LANGCHAIN_TRACING_V2=true |
LANGCHAIN_LOG=all |
LANGSMITH_ACTIVE=1 |
Setting `LANGCHAIN_TRACING_V2=true` (together with a `LANGCHAIN_API_KEY`) activates LangSmith tracing for LangChain runs. |
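In practice this is pure environment configuration, set before the app starts. A sketch using `os.environ`; the key and project name are placeholders:

```python
import os

# Enabling LangSmith tracing is just environment configuration; the
# values below are placeholders for illustration.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls-placeholder-key"   # your LangSmith API key
os.environ["LANGCHAIN_PROJECT"] = "my-rag-app"           # optional: group traces

print(os.environ["LANGCHAIN_TRACING_V2"])  # true
```

Equivalently, these variables can be exported in the shell or a deployment manifest so no application code changes at all.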
36 |
Unit 2: Observability |
Lec6 |
Prompt CMS |
How do you fetch a production prompt dynamically using LangFuse SDK? |
Medium |
2 |
A |
Using `langfuse.get_prompt("prompt-name")` to fetch the current production-labeled version |
Reading from a local .json file |
Executing a GraphQL query to Github |
Using |
Langfuse acts as a CMS and lets you retrieve prompts at runtime with `get_prompt()`, decoupling prompt changes from code deploys. |
37 |
Unit 2: Observability |
Lec6 |
Alerts & Best Practices |
Why shouldn't you just "stare at dashboards" for production LLM apps? |
Medium |
2 |
A |
You need automated alerts (error spikes, costs) to respond fast to anomalies |
Dashboards are always broken |
It slows down the computer |
Observability doesnât provide dashboards |
Dashboards are passive. Automated alerts are needed to actively manage sudden cost, latency, or error anomalies. |
38 |
Unit 2: Observability |
Lec6 |
Advanced LangChain Integration |
You have a complex application utilizing standard Python code, LangChain agent loops, and custom API calls. Should you prefer LangSmith or LangFuse, and why? |
Hard |
3 |
B |
LangSmith, because it supports Python natively better |
LangFuse, because it is platform-agnostic and instruments cleanly across non-LangChain code too. |
LangSmith, because LangChain is mandatory. |
LangFuse, because it has an "Edit and Re-run" playground. |
LangFuse is platform-agnostic for non-LangChain code, making it better for mixed-stack integrations, while LangSmith is highly specific and native to LangChain execution loops. |
39 |
Unit 2: Observability |
Lec6 |
Debugging Scenarios |
In production, users report the chatbot occasionally ignores their negative feedback instructions. How would you leverage LangSmith to resolve this? |
Hard |
3 |
C |
By deleting the user history and trying again |
Check the VectorDB logs |
Locate the failed traces in LangSmith, transition them to the Playground, adjust the system prompt, and replay to verify compliance |
Re-index the FAISS database |
LangSmithâs Playground allows you to take directly failed traces, manipulate the prompt, and replay the exact trace environment to find the fix. |
40 |
Unit 2: Observability |
Lec6 |
Data Security Architecture |
Explain a robust architectural design for handling HIPAA/PII compliance while using a SaaS LLM Observability platform like LangSmith Enterprise. |
Hard |
3 |
A |
Run an edge/middleware service that performs localized PII Entity masking/redaction before transmitting traces to the LangSmith API |
Avoid observability tools completely |
Share passwords directly via the agent |
Mask PII inside the LangSmith GUI |
PII must not leave the secure perimeter; redaction must happen at the application layer or middleware before data is shipped via logs/traces. |