AI Theory Exams#

This page consolidates theory exam question banks from all AI training modules.


AI Fundamentals Theory#

Basic AI Fundamentals Quiz#

No.

Training Unit

Lecture

Training content

Question

Level

Mark

Answer

Answer Option A

Answer Option B

Answer Option C

Answer Option D

Explanation

1

Unit 1: Basic AI Fundamentals

Lec1

RAG Architecture

What is the hybrid AI architecture RAG (Retrieval-Augmented Generation) designed to do?

Medium

1

C

Increase the speed of natural language processing

Reduce the cost of training language models

Enhance the quality and reliability of Large Language Models

Increase the creativity of language models

RAG is designed to enhance the quality and reliability of Large Language Models (LLMs) by integrating an information retrieval step from an external knowledge base before the LLM generates text.

2

Unit 1: Basic AI Fundamentals

Lec1

RAG Core Problems

What is one of the core technical problems that RAG solves?

Easy

1

A

Reduce hallucination (making up information)

Improve data retrieval speed

Increase data storage capacity

Enhance information security

RAG addresses limitations of traditional LLMs such as hallucination, outdated knowledge, lack of transparency, and difficulty accessing specialized knowledge.

3

Unit 1: Basic AI Fundamentals

Lec1

RAG vs Fine-tuning

What is the advantage of RAG over fine-tuning when updating knowledge for LLMs?

Medium

1

D

RAG is only suitable for unstructured data

RAG requires greater computing resources

RAG has lower transparency

RAG allows faster knowledge updates

RAG allows quick and nearly instant knowledge updates by updating the vector database, while fine-tuning requires retraining the model, which is expensive and slower.

4

Unit 1: Basic AI Fundamentals

Lec1

RAG Use Cases

When should you choose RAG instead of fine-tuning an LLM?

Medium

1

A

When you need to add factual knowledge and answer questions based on new data

When you need to reduce model operating costs

When you need to enhance the model’s reasoning ability

When you need to adjust the model’s behavior and style

RAG is suitable when you need to add factual knowledge and answer questions based on new data, while fine-tuning is appropriate when you need to adjust behavior, style, or learn a new skill.

5

Unit 1: Basic AI Fundamentals

Lec1

RAG Pipeline

In the RAG architecture, which phase occurs once or periodically to prepare data?

Easy

1

D

Query vectorization phase

Similarity search phase

Retrieval and answer generation phase (Retrieval Generation Online)

Data indexing phase (Indexing Offline)

The Data Indexing phase (Indexing Offline) occurs once or periodically to prepare data for RAG.

6

Unit 1: Basic AI Fundamentals

Lec1

Chunking

What is the purpose of dividing data into smaller text chunks in the ‘Load and Chunk’ step?

Easy

1

A

To ensure semantics are not lost and optimize for searching

To simplify the vectorization process

To reduce the storage capacity of data

To speed up data loading into the system

Chunking ensures that semantics are not lost and optimizes for searching.
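The chunking step described above can be sketched minimally. This is an illustrative fixed-size splitter with overlap (the `chunk_size` and `overlap` values are assumptions, not values from the lecture); overlap helps an idea that straddles a boundary survive in at least one chunk.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap so that ideas
    spanning a chunk boundary are not completely lost."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by `overlap` characters
    return chunks
```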

7

Unit 1: Basic AI Fundamentals

Lec1

Vector Similarity

What is the most common method for measuring similarity between query vectors and document vectors in a Vector Database?

Medium

1

C

Manhattan distance

Jaccard similarity

Cosine Similarity

Euclidean distance

Cosine Similarity is the most common method for measuring the cosine angle between two vectors.
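Cosine similarity, as used above, is the dot product of two vectors divided by the product of their magnitudes. A minimal reference implementation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|).
    Returns 1.0 for identical directions, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```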

8

Unit 1: Basic AI Fundamentals

Lec1

RAG Online Phase

What happens to the user’s question in the first step of the ‘Retrieval and Answer Generation’ phase?

Easy

1

D

The question is stored in the database

The question is divided into smaller chunks

The question is translated to another language

The question is vectorized using an Embedding model

The user’s question is vectorized using an Embedding model.

9

Unit 1: Basic AI Fundamentals

Lec1

Embedding Quality

The quality of which component directly affects the effectiveness of the entire RAG system?

Medium

1

A

Embedding model

Similarity search method

Vector database

Prompting technique

The quality of the Embedding model directly affects the effectiveness of the entire system.

10

Unit 1: Basic AI Fundamentals

Lec1

Softmax Function

In the LLM model, what is the role of the Softmax function?

Hard

1

A

Convert scores (logits) into a probability distribution to select the most likely word

Filter out irrelevant sentences or information in text chunks

Calculate scores (logits) for all words in the vocabulary

Search for suitable text chunks

The Softmax function converts scores (logits) into a probability distribution, helping the model select the most likely word to appear.
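The logits-to-probabilities conversion described above can be written directly. Subtracting the maximum logit before exponentiating is a standard numerical-stability trick, not something specific to this lecture:

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores (logits) into a probability distribution
    summing to 1, so the most likely next word can be selected."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```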

11

Unit 1: Basic AI Fundamentals

Lec1

HyDE Technique

What is the HyDE (Hypothetical Document Embeddings) technique used for?

Hard

1

A

Expand the input query to improve retrieval results

Re-evaluate the relevance of each (question, chunk) pair

Filter out irrelevant information in text chunks

Combine the power of keyword search and vector search

HyDE uses a small LLM to generate a hypothetical document containing the answer, then uses this document’s vector for searching, improving retrieval results.

12

Unit 1: Basic AI Fundamentals

Lec1

Hybrid Search

What is Hybrid Search?

Medium

1

A

A method that combines the power of keyword search and vector search

A method that re-evaluates the relevance of each (question, chunk) pair

A method that transforms questions to improve retrieval results

A method that compresses context before putting it into the prompt

Hybrid Search combines keyword search (e.g., BM25) and vector search to achieve more comprehensive results.

13

Unit 1: Basic AI Fundamentals

Lec1

Context Compression

What is the purpose of Context Compression?

Medium

1

D

Rearrange potential candidates to select the top quality chunks

Transform input questions to improve retrieval results

Improve the accuracy of information retrieval

Reduce prompt length and help LLM focus on core information

Context Compression helps reduce prompt length and helps the LLM focus on core information by filtering out irrelevant information.

14

Unit 1: Basic AI Fundamentals

Lec1

Re-ranker

What is the role of a Re-ranker in the RAG process?

Medium

1

C

Compress text chunks to reduce prompt length

Transform the original question to improve retrieval results

Re-evaluate the relevance of each (question, chunk) pair and reorder them

Search for text chunks based on keywords

Re-ranker re-evaluates the relevance of each (question, chunk) pair and reorders them to select the top quality chunks.

15

Unit 1: Basic AI Fundamentals

Lec1

Retriever Failure

What happens if the retrieval system (retriever) does not find accurate documents in the RAG system?

Medium

1

B

The system will automatically adjust retrieval parameters to find more suitable documents

The Large Language Model (LLM) cannot answer correctly

The Large Language Model (LLM) will search for information from external sources to compensate for missing data

The Large Language Model (LLM) can still generate accurate answers based on prior knowledge

If the retriever does not find the correct documents, no matter how smart the LLM is, it cannot answer correctly.

16

Unit 1: Basic AI Fundamentals

Lec1

Lost in the Middle

What does the ‘Lost in the Middle’ syndrome in RAG systems refer to?

Hard

1

A

The tendency of LLMs to focus on information at the beginning and end of long contexts, ignoring information in the middle

Text chunks having duplicate information in the middle, causing noise in processing

Difficulty integrating LLMs in the middle of the retrieval and generation process

Delays in information retrieval when relevant documents are in the middle position in the database

When prompts contain long contexts, LLMs tend to focus only on information at the beginning and end, easily ignoring important details in the middle.

17

Unit 1: Basic AI Fundamentals

Lec1

Faithfulness Evaluation

What does ‘Faithfulness’ evaluation in RAG systems measure?

Medium

1

A

The degree to which the generated answer adheres to the provided context

The speed of processing and generating answers by the system

The relevance of the answer to the user’s question

The system’s ability to retrieve information from different sources

Faithfulness measures the degree to which the generated answer adheres to the provided context, i.e., whether the system adds information of its own.

18

Unit 1: Basic AI Fundamentals

Lec1

Attention Mechanism

What role does the Attention Mechanism play in the Transformer architecture of RAG systems?

Hard

1

C

Improve the model’s parallel processing capability, helping to speed up computation

Reduce dependence on fully connected layers in the model

Allow the model to weigh the importance of different words in the input sequence for deep context understanding

Enhance the ability to encode input information into semantic vectors

The Attention Mechanism allows the model to weigh the importance of different words in the input sequence for deep context understanding.

19

Unit 1: Basic AI Fundamentals

Lec1

MRR Metric

What does the Mean Reciprocal Rank (MRR) metric measure in Retrieval Evaluation?

Hard

1

C

Measure the system’s ability to synthesize information from different sources

Measure the relevance between the question and the generated answer

Measure the position of the first correct chunk in the returned result list

Measure the percentage of questions for which the system retrieves at least one chunk containing correct answer information

Mean Reciprocal Rank (MRR) measures the position of the first correct chunk in the returned result list. The earlier that chunk appears, the higher the MRR score.
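MRR as described above is the average over queries of 1/rank of the first correct chunk. A minimal sketch (chunk IDs and the single-gold-answer setup are simplifying assumptions):

```python
def mean_reciprocal_rank(result_lists: list[list[str]], relevant: list[str]) -> float:
    """MRR: average of 1/rank of the first correct chunk per query;
    a query contributes 0 when no correct chunk is retrieved."""
    total = 0.0
    for results, gold in zip(result_lists, relevant):
        for rank, chunk_id in enumerate(results, start=1):
            if chunk_id == gold:
                total += 1.0 / rank
                break
    return total / len(result_lists)
```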

20

Unit 1: Basic AI Fundamentals

Lec1

Value in RAG

In the attention mechanism used by RAG models, which element represents the actual extracted information?

Medium

1

D

Key

Query

Key vector dimension (d_k)

Value

Value represents the actual extracted information in the attention mechanism, while Query and Key determine which positions to attend to.

21

Unit 1: Basic AI Fundamentals

Lec1

Multimodal RAG

Which RAG development direction allows retrieving information from different types of data such as images, audio, and text?

Easy

1

A

Multimodal RAG

Internal RAG system

Agentic RAG

RAG Chatbot

Multimodal RAG allows retrieving information from different data sources, not just text.

22

Unit 1: Basic AI Fundamentals

Lec1

Agentic RAG

Which type of RAG application has the ability to ask sub-questions and interact with external tools to gather information?

Medium

1

B

Internal document RAG system

Agentic RAG

Multimodal RAG

RAG Chatbot

Agentic RAG is more proactive in gathering information by asking sub-questions and interacting with external tools.

23

Unit 1: Basic AI Fundamentals

Lec1

Enterprise RAG

Which RAG application helps employees search for information in the company’s internal documents quickly and accurately?

Easy

1

D

Multimodal RAG

Research and specialized analysis assistant

Smart customer support chatbots

Enterprise internal document RAG system

Enterprise internal document RAG systems help employees search for information quickly and accurately.

24

Unit 1: Basic AI Fundamentals

Lec1

Interactive Learning

What problem does RAG (Retrieval-Augmented Generation) application solve in interactive learning?

Medium

1

C

Limited access to learning materials

Inaccurate assessment of learning outcomes

Boredom and passivity when learning through textbooks

Lack of updated information in textbooks

RAG creates interactive tools that allow students to interact with learning materials more actively compared to reading traditional textbooks.

25

Unit 1: Basic AI Fundamentals

Lec1

Financial RAG

In the financial field, how can RAG support analysts?

Medium

1

A

Summarize and analyze risks from long financial reports

Manage personal investment portfolios

Predict stock market fluctuations

Automatically create financial reports

RAG can summarize and analyze risks from long financial reports, helping analysts save time and make decisions faster.

26

Unit 1: Basic AI Fundamentals

Lec1

E-commerce RAG

How does RAG improve product recommendation systems on e-commerce sites?

Medium

1

A

Retrieve information from detailed descriptions, product reviews, and technical specifications

Optimize product prices based on competitors

Provide 24/7 online customer support services

Enhance the ability to predict customer needs

RAG retrieves information from detailed descriptions, product reviews, and technical specifications to provide personalized recommendations, rather than relying solely on click history.

27

Unit 1: Basic AI Fundamentals

Lec1

RAG Distinctive Feature

What is the distinctive feature of RAG compared to traditional generative AI systems?

Medium

1

D

Integration with cloud platforms to increase scalability

Using the most advanced deep learning algorithms

Ability to automatically adjust parameters to optimize performance

Combining the deep language capabilities of LLMs with the accuracy of external knowledge bases

RAG combines the language capabilities of LLMs with the accuracy and up-to-date nature of external knowledge bases, creating more reliable and transparent AI applications.

28

Unit 1: Basic AI Fundamentals

Lec1

Vector Database

What is the primary purpose of a Vector Database in a RAG system?

Easy

1

B

Store raw text documents for quick retrieval

Store and efficiently search through vector embeddings

Manage user authentication and access control

Cache frequently asked questions and answers

A Vector Database is specifically designed to store and efficiently search through vector embeddings, enabling fast similarity searches in the RAG pipeline.

29

Unit 1: Basic AI Fundamentals

Lec1

Chunking Strategies

Which chunking strategy maintains the logical structure of a document by splitting at natural boundaries?

Medium

1

C

Fixed-size chunking

Random chunking

Semantic chunking

Overlapping chunking

Semantic chunking splits documents at natural boundaries (paragraphs, sentences, sections) to maintain logical structure and preserve meaning within each chunk.

30

Unit 1: Basic AI Fundamentals

Lec1

Top-K Retrieval

What does the ‘Top-K’ parameter control in RAG retrieval?

Easy

1

A

The number of most similar documents to retrieve

The maximum length of each chunk

The threshold for similarity scores

The number of re-ranking iterations

The Top-K parameter controls how many of the most similar documents are retrieved from the vector database to provide context for the LLM.
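Top-K selection over similarity scores is a simple sort-and-slice. A sketch, assuming each chunk has already been scored (e.g., by cosine similarity):

```python
def top_k(scored_chunks: list[tuple[str, float]], k: int = 3) -> list[str]:
    """Return the k chunks with the highest similarity scores,
    highest first."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```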

31

Unit 1: Basic AI Fundamentals

Lec1

Prompt Engineering

In RAG systems, what is the role of the system prompt when generating answers?

Medium

1

B

To store retrieved documents permanently

To instruct the LLM on how to use the retrieved context to generate answers

To perform the similarity search in the vector database

To convert user queries into embeddings

The system prompt instructs the LLM on how to use the retrieved context to generate accurate, grounded answers and may include formatting guidelines and constraints.

32

Unit 1: Basic AI Fundamentals

Lec1

Answer Relevance

What does ‘Answer Relevance’ measure in RAG evaluation?

Medium

1

C

How fast the system generates responses

The accuracy of the embedding model

How well the generated answer addresses the user’s original question

The number of retrieved documents used

Answer Relevance measures how well the generated answer addresses the user’s original question, ensuring the response is pertinent and useful.

33

Unit 1: Basic AI Fundamentals

Lec1

Context Window

What limitation does the ‘context window’ impose on RAG systems?

Hard

1

D

The maximum number of documents that can be stored

The time limit for generating responses

The minimum similarity score for retrieval

The maximum amount of text that can be processed by the LLM at once

The context window limits the maximum amount of text (retrieved chunks + query + system prompt) that can be processed by the LLM at once, requiring careful management of chunk sizes.
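Managing the context window typically means greedily packing the top-ranked chunks until the budget runs out. This is an illustrative sketch: the `len(text) // 4` token estimate and the `overhead_tokens` reserve are rough assumptions; real systems use the model's actual tokenizer.

```python
def pack_context(chunks: list[str], budget_tokens: int, overhead_tokens: int = 200) -> list[str]:
    """Greedily pack ranked chunks into the LLM context window.
    Token counts are roughly estimated as len(text) // 4 (an assumption;
    use the model's tokenizer in practice)."""
    remaining = budget_tokens - overhead_tokens  # reserve room for query + system prompt
    packed = []
    for chunk in chunks:
        cost = len(chunk) // 4
        if cost > remaining:
            break  # next chunk no longer fits
        packed.append(chunk)
        remaining -= cost
    return packed
```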

34

Unit 1: Basic AI Fundamentals

Lec1

Metadata Filtering

What is the benefit of using metadata filtering in RAG retrieval?

Medium

1

A

Narrow down search results based on document attributes before semantic search

Increase the size of the vector database

Speed up the embedding generation process

Reduce the cost of LLM API calls

Metadata filtering allows narrowing down search results based on document attributes (date, source, category) before or during semantic search, improving retrieval precision.

35

Unit 1: Basic AI Fundamentals

Lec1

Hallucination Prevention

Which technique helps prevent hallucination in RAG systems by ensuring answers are grounded in retrieved content?

Hard

1

B

Increasing the temperature parameter

Instructing the LLM to only use information from the provided context

Using larger embedding dimensions

Reducing the Top-K value to 1

Instructing the LLM through the system prompt to only use information from the provided context and to say “I don’t know” when information is not available helps prevent hallucination and ensures answers are grounded in retrieved content.
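The grounding instruction described above is usually implemented as a prompt template. The wording below is illustrative, not a canonical prompt:

```python
def build_grounded_prompt(context_chunks: list[str], question: str) -> str:
    """Build a prompt instructing the LLM to answer only from the
    retrieved context, and to admit when the answer is not present."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, reply \"I don't know\".\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```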


RAG Optimization Theory#

Exam Theory: RAG and Optimization#

This exam theory focuses on assessing advanced topics within Retrieval-Augmented Generation (RAG) and its optimization techniques, drawing specifically from Advanced Indexing, Hybrid Search, Query Transformation, Post-Retrieval Processing, and GraphRAG Implementations.

No.

Training Unit

Lecture

Training content

Question

Level

Mark

Answer

Answer Option A

Answer Option B

Answer Option C

Answer Option D

Explanation

1

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What is a major disadvantage of fixed-size chunking when applied to large amounts of documents?

Easy

1

A

It causes a loss of semantics by breaking ideas arbitrarily.

It is too computationally expensive.

It prevents vector search from indexing numbers.

It requires advanced linguistic models to parse.

Mechanical chunking arbitrarily breaks the flow of the text, making the LLM unable to understand the context when an idea is split mid-thought.

2

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

Why does Brute-force Flat Indexing become a serious problem as a system scales?

Easy

1

B

It consumes too much disk space.

It causes high latency when sequentially scanning millions of vectors.

It is incompatible with neural network architectures.

It only supports English text.

Sequentially scanning through millions of vectors in a Flat Index is too slow to meet real-time requirements.

3

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What is the core idea driving Semantic Chunking?

Medium

1

C

To chunk text strictly by paragraph breaks.

To split texts after exactly 1000 characters.

To detect shifts to a new topic and perform a break precisely at the intersection of two topics.

To summarize the text before splitting it.

Semantic Chunking detects when sentences or content shift to a new topic (when vector direction abruptly changes) to perform a break.

4

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What metric is typically calculated between consecutive sentences during Semantic Chunking?

Medium

1

A

Cosine similarity

Word count ratio

Token frequency

Character limits

In Semantic Chunking, the similarity (for example cosine similarity) is calculated between the current sentence and the next one.

5

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

In Semantic Chunking, when does the algorithm decide to split the text?

Medium

1

D

When similarity is above 90%.

After a fixed number of punctuation marks.

When the sentence length exceeds the threshold.

When similarity drops significantly below a threshold.

If similarity drops significantly below the threshold, it means the topic has changed, breaking the chunk there.
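The split rule from questions 3-5 can be sketched as follows. Sentence embeddings are assumed to come from an external embedding model, and the `threshold` value is illustrative:

```python
import math

def semantic_chunks(sentences: list[str], embeddings: list[list[float]],
                    threshold: float = 0.5) -> list[list[str]]:
    """Break between consecutive sentences when their cosine similarity
    drops below `threshold`, signalling a topic shift."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        if cos(embeddings[i - 1], embeddings[i]) < threshold:
            chunks.append(current)  # topic changed: break the chunk here
            current = []
        current.append(sentences[i])
    chunks.append(current)
    return chunks
```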

6

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What is a notable advantage of Semantic Chunking over Recursive Chunking?

Medium

1

B

It runs extremely fast.

It preserves ideas fully and perfectly follows the flow of text.

It does not consume any computational resources.

It is specifically designed for codebases.

Semantic Chunking preserves ideas fully, strictly follows the text flow, and increases accuracy when searching.

7

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What is a major disadvantage of Semantic Chunking?

Easy

1

C

It cuts through important ideas frequently.

It returns very noisy contexts.

It consumes computational resources due to running a model to compare each sentence.

It only works for legal or contract documents.

Because it must run an ML model to compare the similarity of each consecutive sentence, it consumes computational resources.

8

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What does HNSW stand for in the context of Vector Databases?

Easy

1

A

Hierarchical Navigable Small World

High Neural State Weights

Heuristic Node Searching Window

Hierarchical Numeric Sequence Word

HNSW stands for Hierarchical Navigable Small World, an effective algorithm balancing retrieval speed and accuracy.

9

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What kind of data structure does HNSW organize data into?

Medium

1

C

A flat SQL table

A chronological file system

A multi-layered graph structure

A raw byte stream

HNSW organizes data in the form of a multi-layered graph structure utilizing short and long shortcut links.

10

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

In HNSW, what is the role of Layer 0?

Medium

1

D

It contains the shortest summary of the dataset.

It stores the sparse shortcut links.

It is empty and serves as a placeholder.

It contains all data points and the most detailed links between them.

Layer 0 contains all data points and the most detailed links between them; it holds the most complete information needed to find the exact target.

11

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

What does parameter M (Max Links per Node) dictate in HNSW?

Hard

1

A

The maximum number of links a node can create with neighbor nodes.

The memory limit in megabytes.

The number of documents returned.

The margin of error allowed.

M specifies the maximum number of links a node can create with other neighbor nodes. The larger M is, the denser the network.

12

Unit 1: RAG and Optimization

Lec 1

Advanced Indexing

How should ef_search be configured for a real-time Chatbot application?

Hard

1

B

It should be set to 0.

It should be kept at a low level (e.g., 50-100) to optimize latency.

It should be set to maximum allowed bounds.

It should equal the total number of documents.

Keeping ef_search at a low level optimizes the system response time for a chatbot where small error margins are acceptable in favor of speed.

13

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What is an inherent weakness of standard Vector Search?

Easy

1

C

It lacks speed when processing basic synonyms.

It struggles with multilingual queries.

It reveals weaknesses when encountering queries requiring absolute accuracy in wording.

It ignores document meaning entirely.

Vector Search reveals weaknesses when processing queries requiring absolute accuracy (e.g., proper names, error codes).

14

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What exactly constitutes a Hybrid Search mechanism?

Easy

1

A

Combining the power of semantic vector search with traditional keyword search.

Merging structured and unstructured relational databases.

Running two identical LLMs simultaneously.

Compiling queries in both Python and Java.

Hybrid search combines semantic search (Vector) and traditional keyword search (BM25).

15

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

Which keyword frequency-based statistical algorithm is standard for Hybrid Search?

Easy

1

D

BERT

HNSW

HyDE

BM25

BM25 is the gold standard for traditional keyword retrieval algorithms in Hybrid Search.

16

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

How does BM25 solve the keyword spamming problem found in TF-IDF?

Medium

1

B

By manually blacklisting frequent spammers.

By applying a saturation mechanism where scoring asymptotes after several keyword occurrences.

By analyzing the semantic meaning of repetitive words.

By deleting any document that repeats a word.

BM25 applies a saturation mechanism: a keyword's 101st occurrence adds barely more score than its 10th.

17

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What does Inverse Document Frequency (IDF) do in the BM25 formula?

Medium

1

A

It penalizes common words and massively rewards rare words.

It ranks shorter documents higher than longer ones.

It limits the number of query words sent to the server.

It inverses the vectors created by the model.

IDF penalizes common words heavily while attributing more importance and score weight to rare words.

18

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

Why is Length Normalization an important feature of BM25?

Medium

1

C

It forces all documents to be exactly 1000 characters.

It compresses long queries to save bandwidth.

A single keyword in a short paragraph gets rated higher than the same keyword diluted in a long novel.

It converts all characters to lowercase.

BM25 scales the score based on document length to prevent long documents from unfairly dominating over concise information.
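The three BM25 ingredients from questions 16-18 (IDF, term-frequency saturation via k1, and length normalization via b) combine in one scoring formula. A minimal sketch for scoring a single document; the k1 and b defaults are common conventions, and the term-list representation is a simplifying assumption:

```python
import math

def bm25_score(query_terms, doc_terms, doc_freq, n_docs, avg_len,
               k1=1.5, b=0.75):
    """BM25 score of one document (as a list of terms) for a query.
    - IDF rewards rare terms and penalizes common ones.
    - The k1 saturation term caps the benefit of repeating a keyword.
    - The b term normalizes for document length vs. the corpus average."""
    score = 0.0
    dl = len(doc_terms)
    for term in query_terms:
        tf = doc_terms.count(term)
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)  # number of documents containing the term
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avg_len))
    return score
```

Note how repeating a keyword ten times more yields only a slightly higher score, which is exactly the anti-spamming saturation behaviour described above.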

19

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

In a typical Hybrid Search pipeline, how are the two algorithms executed?

Medium

1

D

Vector search completes first, then BM25 is run on the results.

BM25 runs entirely locally before running Vector remotely.

Only one is executed depending on a query classifier.

They are executed in parallel simultaneously.

The system sends the query simultaneously to both search engines (Parallel Execution).

20

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

Why can’t we simply add the BM25 score and the Vector Search score together?

Hard

1

B

Vector search scores are negative integers.

The scoring scales are fundamentally different (Vector uses [0, 1] cosine similarity; BM25 is arbitrary positive numbers).

They are processed on different neural network architectures.

BM25 produces alphabetical grading ranges.

The scoring scales of the two algorithms are completely different, so their raw scores cannot be combined directly.

21

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What algorithm solves the score compatibility issue in Hybrid Search?

Medium

1

C

GraphRAG Convolution

Maximal Marginal Relevance

Reciprocal Rank Fusion (RRF)

TF-IDF Smoothing

Reciprocal Rank Fusion (RRF) merges these two lists effectively.

22

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

Upon what theoretical basis does Reciprocal Rank Fusion (RRF) operate?

Hard

1

A

Instead of scores, it assumes that if a document appears at a high rank in both lists, it is certainly important.

It averages the raw text chunks of both documents.

It only accounts for the longest document.

It uses an LLM to assign arbitrary ranks.

RRF cares about rank rather than score; a high consensus of rank across disparate algorithms signifies an important document.

23

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What is the purpose of the smoothing constant k within the RRF formula?

Hard

1

D

It identifies the number of total documents in the database.

It sets the maximum allowed token count.

It determines the strictness of exact keyword matching.

It helps reduce score disparity between very high ranks, ensuring fairness.

The constant k (usually 60) reduces massive score disparities between adjacent high ranks (such as Top 1 vs. Top 2), ensuring a smoother gradient of rank scoring.
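The RRF scoring described in questions 21-23 is short enough to implement directly: each document earns 1 / (k + rank) from every list it appears in, and the fused ranking sorts by that sum.

```python
def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists by rank position rather than raw score:
    RRF(d) = sum over lists of 1 / (k + rank(d)).
    k (commonly 60) smooths the gap between adjacent top ranks."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked highly by both BM25 and vector search rises to the top even though the two engines' raw scores were never comparable.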

24

Unit 1: RAG and Optimization

Lec 2

Hybrid Search

What does Hybrid Search primarily sacrifice to gain balanced Context and Keyword accuracy?

Easy

1

B

Security and Privacy

System resources, as it is complex to deploy and consumes resources running 2 parallel streams.

API documentation clarity

Multi-lingual support

Hybrid Search is more complex to deploy and consumes more resources due to running parallel streams simultaneously.

25

Unit 1: RAG and Optimization

Lec 3

Query Transformation

Why do raw user questions often yield poor Vector Search results natively?

Easy

1

C

LLMs cannot read unformatted text.

Vector databases reject single words.

Questions are short/interrogative, lacking context compared to long descriptive documents.

Search algorithms intentionally delay short queries.

Vector Search faces semantic asymmetry; questions are short and interrogative while documents are long and descriptive.

26

Unit 1: RAG and Optimization

Lec 3

Query Transformation

What is the core idea of Query Transformation?

Easy

1

A

Using an LLM to rewrite, expand, or break down the user’s question into better versions before searching.

Encrypting user queries before transmission.

Replacing semantic searches with strict SQL SELECT queries.

Running the user’s prompt through a grammar checker.

It uses an LLM to intelligently edit, expand, or rewrite poor raw queries before sending them to the retrieval component.

27

Unit 1: RAG and Optimization

Lec 3

Query Transformation

What does HyDE stand for in Query Transformation?

Medium

1

B

Heavy Yield Database Execution

Hypothetical Document Embeddings

Hybrid Y-axis Dense Encapsulation

Hex-layered Data Encryption

HyDE stands for Hypothetical Document Embeddings.

28

Unit 1: RAG and Optimization

Lec 3

Query Transformation

What happens during the “Generate” phase of a HyDE strategy?

Medium

1

D

It generates Python scripts.

It generates a dense vector representing the question.

It generates an index mapping inside the SQL table.

The system asks the LLM to write a hypothetical answer paragraph for the user’s question.

The LLM is forced to draft a fake, hypothetical answer for the question so it matches the expected document vocabulary.

29

Unit 1: RAG and Optimization

Lec 3

Query Transformation

Does the hypothetical “fake answer” drafted in HyDE need to be factually correct?

Hard

1

B

Yes, exact factual accuracy guarantees precise matches.

No, but the writing style and technical vocabulary should resemble the actual document.

Yes, the model refuses to output hallucinated responses.

No, it just generates a sequence of random numbers.

The information in the paragraph might be factually incorrect, but its style and technical vocabulary mimic real documents to enable better semantic matching.

30

Unit 1: RAG and Optimization

Lec 3

Query Transformation

Why is the vector generated from the “fake answer” in HyDE more useful than the user’s question vector?

Medium

1

A

The fake answer vector is semantically closer to the real document vector than the short interrogative question vector.

It consumes 0 RAM.

It maps perfectly to sparse BM25 arrays.

The user’s query vector is permanently deleted.

The drafted answer contains similar sentence structures/buzzwords to real documents, closing the asymmetric semantic gap.
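The three HyDE phases covered in questions 27-30 (generate a fake answer, embed it, search with its vector) fit in a few lines. This is a sketch only: `generate`, `embed`, and `vector_db` are hypothetical stand-ins for an LLM call, an embedding model, and a vector store.

```python
def hyde_search(question: str, generate, embed, vector_db) -> list[str]:
    """HyDE sketch: search with the vector of a hypothetical answer
    instead of the vector of the short interrogative question."""
    # 1. Generate: draft a fake answer. It may be factually wrong, but its
    #    style and vocabulary mimic real documents.
    fake_answer = generate(f"Write a short passage answering: {question}")
    # 2. Embed: vectorize the draft, not the original question.
    query_vector = embed(fake_answer)
    # 3. Search: the draft's vector lands closer to real document vectors,
    #    closing the asymmetric semantic gap.
    return vector_db.search(query_vector)
```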

31

Unit 1: RAG and Optimization

Lec 3

Query Transformation

When is the Query Decomposition strategy particularly useful?

Medium

1

C

When querying single words.

When parsing simple FAQ menus.

When a question requires comparing or aggregating information from multiple independent scattered sources.

When reading codebases in completely unknown programming languages.

It handles complex multi-intent questions comparing or gathering data from multiple sources where a single text snippet fails to contain the whole answer.

32

Unit 1: RAG and Optimization

Lec 3

Query Transformation

What happens during the first phase (Breakdown) of Query Decomposition?

Medium

1

A

The LLM analyzes the original question and splits it into a sequence of separate independent sub-questions.

The system shreds the database documents into chunks.

The LLM provides the final answer immediately without searching.

The database is partitioned across multiple distinct servers.

The system identifies multi-intent questions and logically breaks them into single-intent targeted sub-questions.

33

Unit 1: RAG and Optimization

Lec 3

Query Transformation

How does Query Decomposition run searches for multiple sub-questions?

Medium

1

B

It merges all sub-questions back into one query.

It performs standard document searches individually for each separate sub-question.

It relies exclusively on cached external queries.

It skips queries containing conjunctions.

It executes distinct targeted retrieval queries for every identified independent sub-question.

34

Unit 1: RAG and Optimization

Lec 3

Query Transformation

Which phase of Query Decomposition requires the LLM to process text found from all separate sub-searches?

Easy

1

C

Breakdown

Encapsulation

Synthesis

Verification

In Synthesis, text segments found from all previous distinct steps are aggregated and fed into the LLM to form a complete final answer.

35

Unit 1: RAG and Optimization

Lec 3

Query Transformation

In summary, what role does Query Transformation act as?

Easy

1

D

An internet firewall proxy.

A database administrator deleting old records.

A compiler translating queries to binary.

An intelligent editor reorienting questions to ensure the system correctly understands true intent.

It performs intelligent preprocessing (via drafting or splitting) so concise or poor user queries execute properly against the technical index.
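The three-phase Query Decomposition pipeline covered in the questions above can be sketched as follows. All three inner functions are hypothetical stubs (a real system would use an LLM for `breakdown` and `synthesize`, and a retriever for `retrieve`); only the Breakdown → Search → Synthesis control flow is the point.

```python
# Query Decomposition sketch: Breakdown -> per-sub-question Search -> Synthesis.

def breakdown(question: str) -> list[str]:
    """Stub for the LLM Breakdown phase: split a multi-intent question
    into independent single-intent sub-questions (naive split on ' and ')."""
    return [part.strip() + "?" for part in question.rstrip("?").split(" and ")]

def retrieve(sub_question: str) -> str:
    """Stub retriever: one standard document search per sub-question."""
    return f"[doc snippet for: {sub_question}]"

def synthesize(question: str, snippets: list[str]) -> str:
    """Stub for the LLM Synthesis phase: aggregate all retrieved text."""
    return f"Answer to '{question}' based on {len(snippets)} snippets."

def decompose_and_answer(question: str) -> str:
    subs = breakdown(question)              # Phase 1: Breakdown
    snippets = [retrieve(s) for s in subs]  # Phase 2: independent searches
    return synthesize(question, snippets)   # Phase 3: Synthesis
```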

36

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

Why is the Top-K list returned directly from standard retrievers often suboptimal for an LLM?

Medium

1

A

Standard embedding models trade deep semantic accuracy for retrieval speed, and may return contextually incorrect “noisy” keyword matches.

The returned list is usually empty.

The standard top-K size is too large for modern hardware.

The returned documents are always translated to a random language.

Embedding models heavily prioritize index speed over complex relationship comprehension, often returning documents with matching keywords but wrong contextual intents.

37

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What represents the main goal of Re-ranking in a RAG pipeline?

Easy

1

C

To randomly shuffle the document list.

To format the output HTML for the frontend.

To act as a final filter processing a small pool of candidates to pick the absolutely best ones.

To permanently alter the dataset ordering.

Re-ranking takes a small pool (like 50) and spends extra computational time reading them carefully to pick the top 5 highest-quality documents.

38

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What architectural method do standard Embedding Models use during the Retrieval step?

Medium

1

B

Graph-Encoder

Bi-Encoder

Cross-Encoder

Recursive-Encoder

Retrieval embeddings process questions and documents separately via Bi-Encoders.

39

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What are the major pros and cons of the Bi-Encoder architecture?

Hard

1

A

Fast speed (via pre-computation), but loses detailed nuanced interaction information between question and document words.

Extreme accuracy, but consumes too much API quota.

Perfectly handles complex negations, but fails at simple keywords.

It guarantees data privacy, but prevents external web searches.

Because the vectors are calculated independently ahead of time, it runs fast but misses deeper interrelated context (like negations vs subjects).

40

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

How does a Cross-Encoder fundamentally differ from a Bi-Encoder?

Hard

1

D

It translates everything into Spanish.

It maps vectors onto a graph database exclusively.

It bypasses the attention mechanism entirely.

The question and document are concatenated into a single text sequence, processed simultaneously via a full Self-Attention mechanism.

Instead of separated outputs, Cross-Encoders read both strings concurrently to understand complex logic, negation, and interactions between all words simultaneously.

41

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

If Cross-Encoders are incredibly accurate, why don’t we use them to search the entire database?

Medium

1

C

They cannot run on GPUs.

They only output integers.

They are very slow and resource-consuming to run across millions of documents.

They are blocked by vector database protocols.

Processing millions of documents concurrently through strict Self-Attention is too computationally slow.

42

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What describes the Funnel Strategy in Post-Retrieval?

Medium

1

B

Running Bi-Encoder and Cross-Encoder on separate clusters entirely.

Using Bi-Encoder to fast-retrieve a Top 50, then using Cross-Encoder to slowly re-score those 50 into a Top 5.

Splitting documents into smaller funnels based on character limits.

Re-ranking the vector database before queries arrive.

The funnel strategy combines the speed of Bi-Encoders (finding 50 candidates) with the precision of Cross-Encoders (filtering down to 5).
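The two-stage funnel can be sketched with stub scorers. Both scoring functions here are hypothetical stand-ins: a real pipeline would use a Bi-Encoder embedding similarity for stage 1 and a Cross-Encoder model (reading query and document as one sequence) for stage 2.

```python
# Funnel Strategy sketch: cheap stage-1 scoring narrows the corpus to a pool,
# then an expensive stage-2 scorer re-ranks only that pool.

def bi_score(query: str, doc: str) -> float:
    """Stub bi-encoder: fast, independent scoring (word overlap here)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def cross_score(query: str, doc: str) -> float:
    """Stub cross-encoder: 'reads' query and doc jointly; here we simply
    reward documents containing the query as a phrase."""
    return 1.0 if query.lower() in doc.lower() else bi_score(query, doc)

def funnel(query: str, docs: list[str], pool: int = 50, top: int = 5) -> list[str]:
    # Stage 1: fast retrieval of a candidate pool (e.g. Top 50).
    candidates = sorted(docs, key=lambda d: bi_score(query, d), reverse=True)[:pool]
    # Stage 2: slow, precise re-scoring of only that small pool (e.g. Top 5).
    return sorted(candidates, key=lambda d: cross_score(query, d), reverse=True)[:top]
```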

43

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

In scenarios dealing with biological negation (e.g., “What does Python NOT eat”), why does a Cross-Encoder succeed where a Bi-Encoder fails?

Hard

1

A

The Cross-Encoder recognizes the negation structure and biological context perfectly since it reads the query and document concurrently.

The Cross-Encoder has a specialized biology database pre-installed.

The Bi-Encoder deletes the word “NOT”.

The Cross-Encoder ignores keywords entirely.

Bi-Encoders mistakenly link the keywords “Python” and “eat”, while Cross-Encoders accurately recognize the negation modifier mapping to the biological logic.

44

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What does MMR stand for in the context of Post-Retrieval processing?

Medium

1

D

Minimum Marginal Rating

Multi-Model Retrieval

Memory Mapping Resolution

Maximal Marginal Relevance

MMR stands for Maximal Marginal Relevance, an algorithm used to diversify query results.

45

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

What twofold problem does MMR aim to solve when selecting final documents?

Medium

1

B

Size vs Compression

Relevance to the query vs Diversity to prevent identical redundant documents.

API Latency vs Local Storage

Token allowance vs Security constraints

When similarity search returns 5 nearly identical paragraphs of text, MMR resolves the redundancy by ensuring selected documents are relevant but distinctly diverse.

46

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

In the MMR algorithm, what occurs after picking the most similar document (Step 1)?

Hard

1

C

The system clears the cache.

The system returns immediately.

It finds the next document similar to the query but least similar to previously selected documents.

It picks the document that is completely irrelevant to the query.

Step 2 balances relevance by filtering for the next document containing the query’s answer but differing heavily from the document already selected.

47

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

In the MMR optimization formula, what does lowering lambda (\(\lambda\)) do?

Hard

1

A

Prioritizes diversity by increasing the penalty for selecting text similar to existing selected documents.

Causes the system to crash.

Forces exact keyword matching.

Elevates relevance entirely over diversity.

Decreasing lambda gives more mathematical weight to the diversity penalty term of the MMR formula, forcing varied information.
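The MMR selection loop from questions 46–47 can be written out directly. The similarity function below is a hypothetical word-overlap stub (a real system uses cosine similarity over embeddings); the selection logic follows the standard MMR objective, with `lambda_` trading relevance against the diversity penalty.

```python
# MMR sketch. Greedy selection per:
#   MMR = argmax_d [ lambda * sim(d, query) - (1 - lambda) * max_s sim(d, s) ]
# where s ranges over already-selected documents.

def sim(a: str, b: str) -> float:
    """Stub similarity: word-overlap ratio (real systems use cosine similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(len(sa | sb), 1)

def mmr(query: str, docs: list[str], k: int = 2, lambda_: float = 0.5) -> list[str]:
    selected: list[str] = []
    remaining = list(docs)
    while remaining and len(selected) < k:
        def score(d: str) -> float:
            # Penalty: similarity to the closest already-selected document.
            redundancy = max((sim(d, s) for s in selected), default=0.0)
            return lambda_ * sim(d, query) - (1 - lambda_) * redundancy
        best = max(remaining, key=score)  # Step 1 picks pure relevance;
        selected.append(best)             # later steps penalize redundancy.
        remaining.remove(best)
    return selected
```

With a lower `lambda_`, the redundancy penalty dominates and the second pick diverges further from the first — exactly the behavior described above.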

48

Unit 1: RAG and Optimization

Lec 4

Post-Retrieval

If a user asks a broad question (“Features of VF8 Car”) and wants comprehensive overall coverage, which Re-ranker is optimal?

Medium

1

C

Flat Indexing

Recursive Chunking

Maximal Marginal Relevance (MMR)

Simple Bi-Encoder similarity

MMR guarantees diverse, non-redundant documents giving the LLM text detailing multiple broad vehicle features, not just repeated text about its engine.

49

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What does GraphRAG combine to create a comprehensive knowledge representation system?

Easy

1

B

Cloud storage and Edge devices

Structured graph databases with vector-based retrieval

Dense and Sparse chunking limits

Hybrid APIs and NoSQL mappings

GraphRAG merges structured graph DBs (like Neo4j) and vector retrieval.

50

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What popular graph database is used for storing GraphRAG entities in the implementation example?

Easy

1

A

Neo4j

PostgreSQL

ElasticSearch

MongoDB

Neo4j is utilized to construct and store the nodes and relationship graphs.

51

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What is the purpose of Pydantic models in the implementation pipeline?

Medium

1

D

To render the Neo4j visualization frontend.

To manage API timeout failures.

To download PDF files correctly.

To enforce validation schemas for structured entity/relationship output from the LLM.

Pydantic classes like PolicyClauseExtraction compel the LLM to output consistent, strictly validated object types representing entities.

52

Unit 1: RAG and Optimization

Lec 5

GraphRAG

According to the implementation extraction rules, what constitutes a “commitment”?

Medium

1

C

Simple definitions and jargon.

Any sentence ending in a period.

A clear promise, obligation, or prohibition found in the text.

A numeric calculation executed by the CPU.

The LLM is instructed to identify clear promises, obligations, or prohibitions as Commitments.

53

Unit 1: RAG and Optimization

Lec 5

GraphRAG

How are measurable numeric limits inside obligations handled during extraction?

Hard

1

D

They are discarded mathematically.

They are summed together.

They are sent to a calculator API.

They are explicitly extracted as Constraint unit parameters.

If a commitment contains numeric limits, the agent extracts them strictly as linked Constraints.

54

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What does the .with_structured_output(PolicyClauseExtraction) method achieve in LangChain?

Medium

1

A

Forces the LLM to reply via JSON adhering precisely to the Pydantic schema class.

Translates the output into Neo4j graph visualizations natively.

Prevents the model from reading files.

Outputs Python code running in a sandbox.

It guarantees the unstructured text processed by the ChatGPT API is accurately deserialized back into structured PolicyClauseExtraction objects.
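The schema-enforcement idea behind `.with_structured_output(PolicyClauseExtraction)` can be illustrated without the LangChain dependency. The class below is a simplified dataclass stand-in for the Pydantic model — the field names and validation rules are illustrative, not the course's actual schema.

```python
# Simplified stand-in for a Pydantic extraction schema: a class that rejects
# malformed entity output. The real pipeline uses a Pydantic model passed to
# .with_structured_output(), which forces the LLM's JSON to match the schema.

from dataclasses import dataclass, field

@dataclass
class PolicyClauseExtraction:
    clause_text: str
    commitments: list[str] = field(default_factory=list)  # promises/obligations/prohibitions

    def __post_init__(self):
        # Validation mimicking what Pydantic enforces automatically.
        if not self.clause_text.strip():
            raise ValueError("clause_text must be non-empty")
        if not all(isinstance(c, str) for c in self.commitments):
            raise ValueError("commitments must be strings")
```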

55

Unit 1: RAG and Optimization

Lec 5

GraphRAG

In the designed graph schema, what do PolicyClause nodes specifically track?

Easy

1

C

The user identities processing the data.

The hardware metrics.

The overarching policy topics/units from chunked texts.

The exact numeric values from commitments.

PolicyClause nodes store the actual chunked policy texts/topics serving as central nodes linking other entities.

56

Unit 1: RAG and Optimization

Lec 5

GraphRAG

In Cypher (Neo4j), which operation ensures duplicate nodes are not created during ingestion?

Medium

1

B

INSERT IGNORE

MERGE

UPSERT

ADD DISTINCT

The MERGE clause checks for an existing match before creating a node, preventing duplicated nodes.
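A minimal sketch of idempotent ingestion with MERGE: the function builds a parameterized Cypher query (the node label and property names are illustrative). Executing it would require a live Neo4j driver session, shown only as a comment.

```python
# Sketch: build a parameterized Cypher MERGE query for idempotent ingestion.
# MERGE creates the PolicyClause node only if one with the same id is absent.

def merge_clause_query(clause_id: str, topic: str) -> tuple[str, dict]:
    query = (
        "MERGE (c:PolicyClause {id: $clause_id}) "
        "SET c.topic = $topic"
    )
    return query, {"clause_id": clause_id, "topic": topic}

# With the official neo4j driver this would run as (not executed here):
#   with driver.session() as session:
#       query, params = merge_clause_query("c1", "data retention")
#       session.run(query, **params)
```

Running the same ingestion twice leaves a single node per id, which is why MERGE (rather than CREATE) is the answer above.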

57

Unit 1: RAG and Optimization

Lec 5

GraphRAG

How are Stakeholder nodes structurally linked in the Neo4j graph?

Hard

1

A

Via the AFFECTS relationship incoming from the PolicyClause node.

Via a standalone IS_A class instance mapping.

Via CONTAINS relationships stemming from Regulation nodes.

They are completely unlinked.

Stakeholder nodes reflect affected parties, mapped using [:AFFECTS] from the PolicyClause.

58

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What represents a distinct advantage of GraphRAG over standard vector similarity search?

Medium

1

B

It consumes zero system memory.

Relationships explicitly define how entities connect, solving queries needing context-aware traversal mapping.

It requires no chunking.

It automatically resolves grammatical mistakes.

Graph traversal natively exposes how discrete entities explicitly connect, answering intricate logical queries that vector distances alone cannot deduce.

59

Unit 1: RAG and Optimization

Lec 5

GraphRAG

Which LangChain module converts natural language into Cypher queries for the LLM?

Medium

1

A

GraphCypherQAChain

VectorDBQAChain

PydanticOutputParser

DocumentConverter

GraphCypherQAChain converts English questions into Cypher code capable of traversing the graph structure.

60

Unit 1: RAG and Optimization

Lec 5

GraphRAG

What is noted as a core limitation or consideration when implementing GraphRAG?

Medium

1

D

It deletes all prior indexes upon restart.

It requires user authentication before every search.

The LLM must be hosted locally.

It relies heavily on specific types of structured data linking to form an effective knowledge base.

GraphRAG’s power originates strictly from highly structured data mappings; mapping unstructured erratic data yields poor relationships.


LangGraph and Agentic AI Theory#

Final Exam#

No.

Training Unit

Lecture

Training content

Question

Level

Mark

Answer

Answer Option A

Answer Option B

Answer Option C

Answer Option D

Explanation

1

LangGraph & Agentic AI

Lec1

State Management

What is the core field used for ALL input/output from nodes in a LangGraph State?

Easy

1

C

context

history

messages

state_vars

The messages field is the core channel for all conversational I/O between nodes in LangGraph.

2

LangGraph & Agentic AI

Lec1

State Management

Which concept allows LangGraph to support complex workflows compared to standard LangChain chains?

Easy

1

B

Linear flows only

Cyclic flows and conditional routing

Stateless operations

Basic sequential pipelines

Extends basic chains with cyclic flows and conditional routing for loops / complex logic.

3

LangGraph & Agentic AI

Lec1

State Management

What is the role of add_messages reducer in a TypedDict State?

Easy

1

A

Appending new messages and handling deduplication

Deleting old messages automatically

Summarizing long conversations

Replacing the current message list with a new one

add_messages automatically appends new messages and handles deduplication via message IDs.

4

LangGraph & Agentic AI

Lec1

State Management

Which of the following is NOT a standard LangChain message type used in LangGraph?

Easy

1

D

AIMessage

HumanMessage

ToolMessage

DataMessage

Standard types are AIMessage, HumanMessage, SystemMessage, ToolMessage. DataMessage is not standard.

5

LangGraph & Agentic AI

Lec1

State Management

In LangGraph’s State structure, what should non-conversational context like user_id or max_iterations be used for?

Easy

1

B

Sent directly to the LLM response

Storing configuration and metadata

Replacing the standard message history

Caching LLM tokens

Context fields are meant for metadata and configuration, not standard I/O messages.

6

LangGraph & Agentic AI

Lec1

State Management

Which object serves as the core director engine orchestrating LLM workflows in LangGraph?

Easy

1

D

MessageGraph

GraphPipeline

WorkflowGraph

StateGraph

StateGraph is the core class orchestrating directed graph workflows based on state.

7

LangGraph & Agentic AI

Lec1

State Management

How does LangGraph handle context injection before starting the graph execution?

Medium

1

C

By loading it from an external JSON file automatically.

By sending a special SystemMessage at the end of the conversation.

By initializing the state with context variables when calling app.invoke(initial_state).

Context cannot be injected; the LLM must generate it.

Context is provided to app.invoke() alongside initial messages.

8

LangGraph & Agentic AI

Lec1

State Management

When building a multi-agent system, how do different agents (nodes) share findings with one another in a messages-centric pattern?

Medium

1

A

By appending AIMessage tagged with their name to the group’s messages list.

By modifying the global context object directly.

By resetting the messages list every time an agent switches.

By sending direct peer-to-peer API calls bypassing the state.

Agents append named AIMessages to the shared state’s messages list.

9

LangGraph & Agentic AI

Lec1

State Management

What is the primary purpose of adding nodes and edges to a StateGraph object?

Medium

1

D

To train a new deep learning model.

To clean the data before input into a LangChain chain.

To replace the standard LLM reasoning layers.

To map out functions as nodes and execution paths as edges.

Nodes represent functions/agents; edges dictate the workflow paths and conditionals.

10

LangGraph & Agentic AI

Lec1

State Management

If an LLM node returns {"messages": [AIMessage("Hello")]} without the add_messages reducer setup, what happens to the state?

Medium

1

B

It merges the new message safely.

It overwrites the existing message list.

It throws a syntax error.

It drops the message entirely.

Without a reducer like add_messages, standard dictionary update behavior would overwrite the list rather than append.
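The overwrite-vs-append distinction can be shown with a simplified stand-in for the reducer. This is not LangGraph's actual `add_messages` implementation — just a dependency-free model of its append-and-dedupe-by-id behavior, contrasted with plain dict update.

```python
# Simplified stand-in for LangGraph's add_messages reducer: append new
# messages, replacing any existing message that shares an id.

def add_messages(existing: list[dict], new: list[dict]) -> list[dict]:
    by_id = {m["id"]: m for m in existing}
    for m in new:
        by_id[m["id"]] = m  # same id -> replace (dedupe); new id -> append
    return list(by_id.values())

state = {"messages": [{"id": "1", "content": "Hi"}]}
update = {"messages": [{"id": "2", "content": "Hello"}]}

# Without a reducer: standard dict update overwrites the whole list.
overwritten = {**state, **update}
# With the reducer: the new message is appended to the history.
merged = {"messages": add_messages(state["messages"], update["messages"])}
```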

11

LangGraph & Agentic AI

Lec1

State Management

According to LangGraph Best Practices, why should conversational data (I/O) be kept strictly in messages while keeping context fields separate?

Hard

1

B

Because LangChain parsers crash if state contains integers.

It enables robust State Persistence (Checkpointers) which rely on deterministic, append-only message histories.

It saves tokens directly since context fields are automatically hidden from the LLM.

Context fields are only valid in the END node.

Checkpointers reconstruct and replay the state efficiently when conversational history relies on the standardized, append-only messages slice.

12

LangGraph & Agentic AI

Lec1

State Management

How can conditional routing leverage the State to decide whether to call a tool or end the workflow?

Hard

1

A

By inspecting state["messages"][-1] to check for tool_calls attributes.

By manually polling an external database at every node.

By counting the number of characters in the previous AIMessage.

By throwing an exception when the state is exhausted.

The conditional edge function looks at the last message to see if the LLM populated tool_calls.
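A conditional-edge router of this shape can be sketched as below. The message objects are plain stand-ins for LangChain's `AIMessage`, and the `END` string stands in for LangGraph's `END` sentinel.

```python
# Sketch of a conditional edge: inspect the last message for tool_calls to
# decide between routing to the tool node or ending the workflow.

from types import SimpleNamespace

END = "__end__"  # stand-in for LangGraph's END sentinel

def route(state: dict) -> str:
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"   # the LLM requested a tool -> execute it
    return END           # no tool calls -> finish

# Hypothetical last messages for illustration:
with_call = SimpleNamespace(tool_calls=[{"name": "search"}])
plain = SimpleNamespace(tool_calls=[])
```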

13

LangGraph & Agentic AI

Lec2

Agentic Patterns

What does the ReAct pattern stand for in agentic workflows?

Easy

1

B

Refresh and Activate

Reason and Act

Respond and Acknowledge

Request and Action

ReAct combines explicit reasoning (Think) before acting (Tool Use) in a loop.

14

LangGraph & Agentic AI

Lec2

Agentic Patterns

Why is a Multi-Expert pattern generally preferred over a single generic web search tool for complex research?

Easy

1

A

It provides specialized domain knowledge and structured reasoning.

It uses fewer tokens.

It operates completely offline.

It requires zero prompt engineering.

Specialized LLMs acting as tools provide better domain insights and consistent reasoning.

15

LangGraph & Agentic AI

Lec2

Agentic Patterns

What is the purpose of the ToolNode in LangGraph?

Easy

1

D

To prompt the LLM to generate code.

To browse the internet using a headless browser.

To compress message history.

To automatically handle the parsing and execution of multiple tools.

ToolNode automatically executes the tools called by the LLM and formats them as ToolMessages.

16

LangGraph & Agentic AI

Lec2

Agentic Patterns

In a ReAct loop, what is the sequence of steps the coordinator LLM usually follows?

Easy

1

C

Act \(\to\) Think \(\to\) Stop

Observe \(\to\) Act \(\to\) Think

Think \(\to\) Act \(\to\) Observe

Stop \(\to\) Observe \(\to\) Think

The standard ReAct loop is: Think (Reason), Act (Call Tool), Observe (Tool Result), and Repeat.

17

LangGraph & Agentic AI

Lec2

Agentic Patterns

What is a common way to prevent an agent from getting trapped in an infinite ReAct loop?

Easy

1

B

Disabling all tools permanently.

Adding an iteration_count field in State and routing to END when a limit is reached.

Forcing the LLM to answer in 10 words or less.

Unplugging the server.

Checking an iteration limit in the conditional edge is best practice to stop runaway loops.
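The loop-guard pattern can be sketched as a pair of plain functions; node and field names here are illustrative, and the `END` string stands in for LangGraph's sentinel.

```python
# Sketch of a ReAct loop guard: a conditional edge routes to END once
# iteration_count in the State reaches a hard limit.

END = "__end__"  # stand-in for LangGraph's END sentinel
MAX_ITERATIONS = 5

def guard(state: dict) -> str:
    if state.get("iteration_count", 0) >= MAX_ITERATIONS:
        return END    # hard stop: prevents a runaway loop
    return "agent"    # otherwise continue the Think/Act/Observe cycle

def agent_node(state: dict) -> dict:
    # Each pass through the agent increments the counter in its state update.
    return {"iteration_count": state.get("iteration_count", 0) + 1}

# Simulated loop: runs the agent until the guard routes to END.
state = {"iteration_count": 0}
steps = 0
while guard(state) == "agent":
    state.update(agent_node(state))
    steps += 1
```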

18

LangGraph & Agentic AI

Lec2

Agentic Patterns

How do Multi-Expert Tools differ technically from standard external API tools (like web search) inside a LangGraph setup?

Easy

1

C

They don’t use the @tool decorator.

They execute JavaScript code.

They are themselves LLM invocations with specialized system prompts.

They bypass the messages state entirely.

Expert tools invoke another instance of an LLM primed with a specific expert persona.

19

LangGraph & Agentic AI

Lec2

Agentic Patterns

If an agent is deciding which expert to call during the “Act” phase, what enables the LLM to provide structured function calls automatically?

Medium

1

B

Regular Expressions parsing.

Using llm.bind_tools([expert1, expert2]).

Writing manual JSON format instructions in the prompt.

Training a custom fine-tuned router model.

bind_tools() maps the tool schema natively to the LLM’s function-calling capabilities.

20

LangGraph & Agentic AI

Lec2

Agentic Patterns

What is the main architectural upgrade introduced when adding a Planning Agent to a simple ReAct flow?

Medium

1

A

The Coordinator is relieved of analyzing the user’s initial message; a separate Planner handles decomposition first.

Tools are executed synchronously without LLM intervention.

The agent switches to using a completely different model provider.

State management is no longer required.

A Planner separates the complex task of understanding and task decomposition from the execution/coordinator task.

21

LangGraph & Agentic AI

Lec2

Agentic Patterns

During the “Observe” phase of standard ReAct with the LangGraph ToolNode, what specific message object is appended to the state?

Medium

1

D

SystemMessage

AIMessage

FunctionMessage

ToolMessage

After executing a tool, ToolMessages containing the tool output are returned to the state.

22

LangGraph & Agentic AI

Lec2

Agentic Patterns

What happens if multiple expert tools are called simultaneously by the Coordinator LLM?

Medium

1

B

They are ignored and skipped.

The ToolNode executes them in parallel and returns all their ToolMessages.

The graph crashes due to a concurrency error.

Only the first tool is executed.

Modern models can return multiple tool calls at once, which ToolNode handles naturally by executing them and appending all results.

23

LangGraph & Agentic AI

Lec2

Agentic Patterns

In a robust production-ready Multi-Expert Research agent, how should tool execution failures be handled?

Hard

1

D

By shutting down the LangGraph server.

By letting the unhandled exception crash the application so developers can debug.

By automatically switching model providers mid-workflow.

By catching the exception inside the tool or custom node and returning a ToolMessage stating the error, so the LLM can try a fallback.

Returning the error as a string message allows the Coordinator LLM to “Reason” about the failure and take alternative action.

24

LangGraph & Agentic AI

Lec2

Agentic Patterns

Why does a Multi-Expert ReAct pattern consume significantly more tokens than a simple linear agent?

Hard

1

C

Because it stores all memory in a vector database.

Because LangGraph adds a large metadata overhead to every variable.

The complete conversation history (messages list) including all intermediate reasoning and tool outputs must be sent back to the LLM upon every iteration.

Because expert LLMs generate longer responses to simple questions.

In ReAct loops, the context window grows each cycle, as new AIMessage and ToolMessage entries are appended and fed back in full on the next iteration.

25

LangGraph & Agentic AI

Lec3

Tool Calling

What is the main difference between traditional LLM prompts and Tool Calling capabilities?

Easy

1

D

Prompts use more tokens.

Tool Calling avoids external APIs.

Tool Calling is only available in open-source models.

Tool Calling enables the model to issue structured JSON parameters to invoke external code automatically.

Structured return formats from the LLM via defined JSON schemas are the core innovation in Tool Calling.

26

LangGraph & Agentic AI

Lec3

Tool Calling

Which terminology specifically refers to OpenAI’s native API parameter for passing a JSON schema?

Easy

1

A

Function Calling

Agentic Use

Execution Action

Tool Prompting

OpenAI specifically categorizes the schema object passing under “Function Calling.”

27

LangGraph & Agentic AI

Lec3

Tool Calling

Which python decorator is used in LangChain to easily convert a standard Python function into a Tool?

Easy

1

C

@langchain_tool

@chain

@tool

@func

The @tool decorator automatically infers schema from the python function and its docstring.
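What the decorator does — harvest the function name, docstring, and type annotations into a schema the LLM can read — can be shown with a simplified stand-in. The real decorator lives in `langchain_core.tools`; this sketch only mimics the schema-inference idea.

```python
# Simplified stand-in for LangChain's @tool decorator: attach a schema built
# from the function's name, docstring, and parameter annotations.

import inspect

def tool(fn):
    fn.tool_schema = {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            name: str(param.annotation)
            for name, param in inspect.signature(fn).parameters.items()
        },
    }
    return fn

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city. Use when the user asks
    about weather conditions."""
    return f"Sunny in {city}"  # stub body; a real tool would call an API
```

The docstring becomes the tool description the LLM reasons over — which is why the best practice above says descriptions should be detailed.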

28

LangGraph & Agentic AI

Lec3

Tool Calling

What makes Tavily Search specifically optimized for AI applications compared to standard generic web search APIs?

Easy

1

B

It is slower but cheaper.

It pre-formats results for LLMs, filters noise, and provides context for RAG.

It only searches Wikipedia.

It bypasses the internet using a local database.

Tavily removes clutter (HTML/Ads) and extracts clean content structured for immediate LLM context window ingestion.

29

LangGraph & Agentic AI

Lec3

Tool Calling

What is a common best practice regarding Tool Descriptions in the code?

Easy

1

A

They should be highly detailed so the LLM knows exactly when and how to call the tool.

They are ignored by the LLM, so they can be left blank.

They must be written in JSON.

They should be under 5 words to save tokens.

High-quality descriptions help the model “Reason” appropriately about when the tool is useful.

30

LangGraph & Agentic AI

Lec3

Tool Calling

What is “Tool Chaining”?

Easy

1

D

Storing tool outputs in a blockchain.

Running the same tool 100 times to check consistency.

Restricting tool execution to an administrator.

Using the output of one tool as the direct input argument for another tool recursively.

A common pattern is having one tool’s result guide the parameter execution of the next tool (like extracting a company name, then passing a stock ticker to a finance tool).

31

LangGraph & Agentic AI

Lec3

Tool Calling

How should developers securely manage API keys (like TAVILY_API_KEY) when building tool-calling applications?

Medium

1

B

Hardcoding them at the top of the python script.

Using Environment Variables or a Secret Management service (like Azure KeyVault).

Passing them directly inside the user prompt.

Storing them inside the StateGraph object.

Best practices strongly dictate loading secrets via ENV variables (e.g. dotenv) or cloud secret managers.

32

LangGraph & Agentic AI

Lec3

Tool Calling

When handling tool execution errors (such as network timeouts or API failures), what is the recommended fallback strategy?

Medium

1

C

Raising a fatal exception to stop the script immediately.

Silently ignoring the error and proceeding with an empty string.

Catching the exception and returning a ToolMessage containing the error text for the LLM.

Switching to an older language model automatically.

Returning the exception as a string in ToolMessage gives the LLM context to either reason about the failure, apologize to the user, or try another tool.
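The catch-and-report pattern can be sketched as a small wrapper; `flaky_search` is a hypothetical tool that simulates a network failure.

```python
# Sketch of the recommended fallback: catch tool exceptions and return the
# error as text so the LLM can reason about it instead of the app crashing.

def safe_tool_call(tool_fn, *args, **kwargs) -> str:
    try:
        return str(tool_fn(*args, **kwargs))
    except Exception as exc:
        # In LangGraph this string would be wrapped in a ToolMessage and
        # appended to the state, giving the LLM context to try a fallback.
        return f"Tool error: {type(exc).__name__}: {exc}. Try another approach."

def flaky_search(query: str) -> str:
    raise TimeoutError("search API timed out")  # simulated network failure
```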

33

LangGraph & Agentic AI

Lec3

Tool Calling

What optimization technique can significantly reduce duplicate external API calls from tools?

Medium

1

A

Implementing a caching layer (e.g. lru_cache or a dictionary buffer) keyed by the tool query.

Disabling the @tool decorator.

Limiting the LLM to 1 iteration entirely.

Removing the system prompt.

Caching recent tool queries locally drastically saves external latency and cost for repeated inquiries.
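A minimal version of the caching layer uses the standard-library `functools.lru_cache`; the search function body is a stub standing in for an expensive external API call.

```python
# Sketch of a tool-level cache: repeated identical queries hit the cache
# instead of the external API, cutting latency and cost.

from functools import lru_cache

CALLS = {"count": 0}  # instrumentation to show how often the "API" runs

@lru_cache(maxsize=128)
def cached_search(query: str) -> str:
    CALLS["count"] += 1                 # stands in for the real API call
    return f"results for {query}"

cached_search("rag")   # miss -> calls the "API"
cached_search("rag")   # hit  -> served from cache
cached_search("mmr")   # miss -> calls the "API" again
```

Note the cache is keyed by the exact query string; semantically similar but differently worded queries would still miss.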

34

LangGraph & Agentic AI

Lec3

Tool Calling

If you want to use a Custom Tool class in LangChain instead of a decorator, which base class must you inherit from?

Medium

1

D

ToolDecorator

GraphNode

LLMChain

BaseTool

Class-based tools need to inherit from BaseTool and override the _run and _arun methods.

35

LangGraph & Agentic AI

Lec3

Tool Calling

How does the Tavily API search_depth="advanced" configuration differ conceptually from standard execution?

Hard

1

C

It executes SQL queries on the backend instead.

It forces the agent to ask the user permission.

It performs a multi-step semantic search to extract comprehensive answers rather than returning simple link snippets.

It parses local PDF files instead of the web.

Advanced depth leverages an AI sub-agent during search to synthesize answers and return higher-quality textual analysis.

36

LangGraph & Agentic AI

Lec3

Tool Calling

When building an architecture where an Orchestrator routes tasks, why would you implement a specific “Web Search Agent” rather than just giving the generic tools directly to the primary assistant?

Hard

1

B

Because the primary assistant cannot accept tools format APIs.

To separate concerns: a specialized agent can execute multi-step tool queries recursively without overloading the main router’s prompt context.

Because Tavily Search restricts execution to sub-nodes by design.

Web Search agents use zero tokens.

Sub-agents handle the cognitive load of browsing, reading snippets, and re-searching autonomously, returning only polished synthesis to the main router.

37

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

What is the main structural advantage of a Hierarchical (Supervisor) multi-agent system?

Easy

1

A

A Primary Assistant coordinates user intent and cleanly routes requests to specialized sub-agents.

Every agent talks to every other agent at the same time.

It prevents the use of external APIs.

It runs on a single linear LangChain pipeline.

Supervisors manage the workflow orchestration cleanly while sub-agents handle specific deep domains.

38

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

Why would a system designer choose multi-agent architectures over a single sophisticated LLM?

Easy

1

C

Single LLMs cannot use Python code.

A single LLM always hallucinates.

It promotes specialization, modularity, parallel processing, and avoids prompt overloading.

Multi-agent systems guarantee faster latency in all scenarios.

Splitting into separate specialized models (e.g., Architect, Coder, Reviewer) improves accuracy and creates maintainable codebases.

39

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

What does a Network (Peer-to-Peer) coordination pattern imply?

Easy

1

C

Agents are executed manually by humans.

All agents must report back to a supervisor before interacting.

Agents can communicate with each other directly without central supervision.

It is a centralized routing protocol.

Unlike supervisors, peer-to-peer agents message each other directly to resolve tasks.

40

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

In a Hierarchical system, how does a Sub-Agent signal that its task is complete and it wishes to return control to the Primary Assistant?

Easy

1

D

By crashing the program.

By calling the end user via SMS.

By erasing the shared state’s message list.

By executing a “CompleteOrEscalate” tool call, signaling the workflow to pop the dialog stack.

The common pattern relies on returning a specific signal (like pop_dialog_state) transitioning back to the orchestrator.

41

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

In multi-agent LangGraph architectures, what prevents agents from losing the overarching conversation context?

Easy

1

B

They read the local filesystem.

They all read and append to a centralized shared messages list managed in the AgenticState.

The developer manually pastes the JSON transcript into each prompt.

They query a vector database at every step.

A shared TypedDict state whose messages field uses the add_messages reducer tracks history across all nodes, keeping every agent aligned.
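A minimal sketch of such a shared state, using a toy `add_messages` reducer in place of LangGraph's built-in one (field names are illustrative):

```python
from typing import Annotated, TypedDict

def add_messages(existing, new):
    # Toy append-only reducer standing in for langgraph's add_messages:
    # every node's output is appended, never overwritten.
    return existing + new

class AgenticState(TypedDict):
    messages: Annotated[list, add_messages]
    dialog_state: list
```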

42

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

What is the purpose of the dialog_state stack in a hierarchical multi-agent state?

Easy

1

A

To push and pop agent identifiers corresponding to the current active agent in the conversation tree.

To log errors to a debugging console.

To translate different languages.

To count the number of LLM tokens used.

The dialog stack (["primary", "ticket_agent"]) acts analogously to a programming call stack, remembering which agent is currently active.
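The stack semantics can be sketched as a plain reducer function (the name mirrors the question bank; the implementation is a simplified illustration):

```python
def update_dialog_stack(stack, action):
    # "pop" removes the most recently activated agent;
    # any other value pushes a new agent identifier.
    if action == "pop":
        return stack[:-1]
    return stack + [action]
```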

43

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

What is “Context Injection” referring to in multi-agent tool execution?

Medium

1

D

Injecting system prompts into the vector database.

Overriding the user’s internet connection.

Re-training the model mid-conversation.

Automatically supplying known session metadata (like user_id or email) into tool arguments without the LLM needing to derive them explicitly.

Context fields defined in the AgenticState are injected quietly into tool schemas by intermediate functions to provide precise references automatically.

44

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

How do routing functions (conditional edges) decide to shift execution from the Primary Assistant to a designated Sub-Agent?

Medium

1

C

The user types “Route” in the chat window.

A random hash evaluates to true.

By inspecting the tool_calls generated by the Primary Assistant and matching the tool_name to a subgraph node.

They execute raw SQL queries tracking agent status.

Standard routers look at the Assistant’s final AIMessage; if it includes tool_calls for a particular sub-agent, the edge routes to that corresponding node.
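A minimal routing-function sketch; the tool and node names are hypothetical, and the message is modeled as a plain dict rather than a real AIMessage:

```python
def route_primary_assistant(state):
    # Map each delegation tool name to the matching sub-agent node.
    routes = {"ToTicketAgent": "ticket_agent", "ToBillingAgent": "billing_agent"}
    last = state["messages"][-1]
    calls = last.get("tool_calls") or []
    if not calls:
        return "END"  # no tool call: the assistant answered the user directly
    return routes.get(calls[0]["name"], "primary_tools")
```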

45

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

Why might an agentic architecture include an “Entry Node” when transitioning to a child agent?

Medium

1

B

To charge the user additional credits.

To silently append a ToolMessage providing the child agent with instructions, task context, and a reminder to call a return tool when done.

To block external API requests permanently.

To delete previous session checkpoints.

Entry nodes serve as a trampoline, providing localized instructions to the incoming sub-agent without confusing the Primary Assistant’s prompt.

46

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

During multi-agent fallback, what happens when a tool execution fails inside an agent’s subgraph?

Medium

1

A

A custom create_tool_node_with_fallback catches the exception and returns the error within a standard ToolMessage for the corresponding agent to review.

The PrimaryAssistant automatically shuts down.

The system crashes.

It switches out the open-source LLM for an OpenAI model.

A structured fallback catcher prevents silent failures or crashes and turns exceptions into conversational events the agent can rectify.
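The fallback idea can be sketched without LangGraph; the function name and message shape below are illustrative:

```python
def call_tool_with_fallback(tool, args):
    # Exceptions become ordinary tool messages the agent can read and repair,
    # instead of silent failures or crashes.
    try:
        return {"role": "tool", "content": str(tool(**args))}
    except Exception as exc:
        return {"role": "tool",
                "content": f"Error: {exc!r}. Please fix your mistakes and retry."}
```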

47

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

In a highly complex Competitive multi-agent arrangement, how do agents ultimately converge on a single answer?

Hard

1

C

They execute a random dice roll.

The graph hangs infinitely until restarted.

A separate Evaluator/Synthesizer agent compares the outputs of all competing agents and selects or merges the best response into the final message.

Only the agent that responds first is recorded in state.

Competitive architectures require downstream synthesis nodes that “Observe” multiple paths and judge the optimal conclusion analytically.

48

LangGraph & Agentic AI

Lec4

Multi-Agent Collab

Consider the structure: state["dialog_state"] = update_dialog_stack(["primary", "ticket_agent"], "pop"). What state does the graph enter next based on hierarchical stack principles?

Hard

1

B

It adds a third string to the stack.

It returns the list to ["primary"].

It deletes the entire stack.

It loops infinitely within ticket_agent.

The custom reducer pops the last active element (ticket_agent), gracefully restoring control to the base primary_assistant.

49

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

Why is a “Human-in-the-Loop” (HITL) step strongly recommended for applications performing financial transactions?

Easy

1

A

They involve irreversible critical actions that require human oversight to prevent costly AI mistakes.

It accelerates the transaction speed natively.

Models cannot do math.

HITL is an obsolete pattern replaced by GPT-4.

Financial transactions are high-stakes operations requiring human intervention and compliance audit trails before final execution.

50

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

In LangGraph, what prevents all computation from being lost when an agent pauses to wait for human input?

Easy

1

C

Writing logs to a simple text file.

LangChain’s built-in ConversationBufferMemory.

LangGraph’s native Checkpointing mechanism (e.g., MemorySaver or SqliteSaver) tightly coupled with interrupt_before/interrupt_after.

Caching the prompt on the client side.

Checkpointers serialize the exact graph state, letting it rest safely in memory or a database until resumed.

51

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

How does passing interrupt_before=["approval_node"] change the execution behavior of the graph?

Easy

1

B

It forces the node to timeout after 3 seconds.

It suspends execution right before the specified node executes, returning control back to the application.

It skips the node altogether.

It triggers an infinite loop of human questions.

interrupt_before natively halts the graph, saves state, and acts as a boundary pause expecting the app to resume it later.

52

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

What is the main drawback of using MemorySaver as a checkpointer in LangGraph?

Easy

1

D

It requires setting up a massive cluster.

It runs too slowly for modern models.

It writes to a file that fills up the hard drive instantly.

Checkpoints disappear completely when the Python process dies or the server restarts.

MemorySaver keeps data purely in process RAM; process death equals checkpoint death.

53

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

Which checkpointer is recommended for a scalable, production-grade distributed LangGraph service?

Easy

1

C

MemorySaver

SqliteSaver

PostgresSaver

FileSaver

PostgresSaver leverages robust PostgreSQL servers built for concurrent, heavy-scale transactions needed in production.

54

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

How does LangGraph distinguish parallel user conversations hitting the same graph application simultaneously?

Easy

1

B

By creating separate python processes.

By assigning each conversation a unique thread_id in the RunnableConfig.

By deleting the older users’ conversations.

By using separate API keys.

thread_id segregates checkpoint namespaces so each conversation keeps its own independent state.
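A toy stand-in showing how a thread_id keys isolated state; real LangGraph checkpointers follow the same idea with a {"configurable": {"thread_id": ...}} config shape, but this class is purely illustrative:

```python
class ThreadScopedStore:
    """Toy checkpointer stand-in: one state slot per thread_id."""

    def __init__(self):
        self._states = {}

    def save(self, config, state):
        self._states[config["configurable"]["thread_id"]] = state

    def load(self, config):
        return self._states.get(config["configurable"]["thread_id"])
```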

55

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

What information does LangGraph’s app.get_state_history(config) feature provide?

Medium

1

A

A complete historical log of all checkpointed states, parent markers, and metadata modifications across a conversation.

Only the very first HumanMessage sent.

The system prompt token usage.

Live streaming characters from the LLM.

Pulling state history allows time-travel debugging and viewing the explicit step-by-step data modification over the thread’s lifespan.

56

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

Given a graph paused before a “Publishing” node, what code pattern can update the state manually, say, switching approved: False to approved: True?

Medium

1

C

app.publish(approved=True)

Modifying the global variables inside the python script.

Calling app.update_state(config, {"approved": True}) before invoking the graph again.

Redefining the TypedDict.

update_state lets developers patch the state tree with manual human reviews before releasing the lock on the paused graph.
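A toy analogue of the update_state patch pattern over a plain dict store, not the real LangGraph API, showing how a human-review patch merges into the paused thread's state:

```python
def update_state(store, config, patch):
    # Merge a manual patch (e.g., a human approval flag) into the
    # checkpointed state for this thread before resuming the graph.
    thread_id = config["configurable"]["thread_id"]
    state = dict(store.get(thread_id, {}))
    state.update(patch)
    store[thread_id] = state
    return state
```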

57

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

Why would a multi-agent framework require separate short-term Checkpointers vs explicit long-term external vector databases?

Medium

1

D

Because LangChain deprecates long-term storage natively.

Short-term databases always truncate after 1 megabyte.

To prevent open-source models from scraping data.

Checkpointers handle immediate conversational state securely per thread, while Vector stores aggregate historical knowledge and profiles persistently across unrelated sessions.

Checkpointers = Thread-scoped conversational state. VectorDB = Global user-scoped background context fetching.

58

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

How does the SqliteSaver schema manage nested state timelines within the same thread if the user “rewinds” to an earlier step and branches context?

Medium

1

B

It overwrites the database completely.

It creates a new checkpoint_id pointing back to the specific parent_checkpoint_id, preserving branching forks natively.

It throws a primary key error.

It switches back to MemorySaver.

The DB schema retains parent-child snapshot ID graphs, effectively allowing true non-destructive time travel.
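A toy model of the parent-pointer schema; the field names mirror the explanation, but the class itself is illustrative:

```python
import itertools

class CheckpointLog:
    # Every new checkpoint records which snapshot it descended from,
    # so rewinding and branching never overwrites history.
    def __init__(self):
        self._ids = itertools.count(1)
        self.checkpoints = {}

    def put(self, state, parent_id=None):
        cid = next(self._ids)
        self.checkpoints[cid] = {"state": state, "parent_checkpoint_id": parent_id}
        return cid
```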

59

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

If an agent architecture has a manual Node simulating an “As-Node” state update (app.update_state(config, {"fix": 1}, as_node="human_check")), what is the technical outcome in the graph context?

Hard

1

C

The app skips ahead 10 checkpoints automatically.

The update is discarded silently because the node was skipped.

It behaves as if the actual human_check node was evaluated, allowing the graph’s conditional edges mapped from human_check to traverse properly during resumption.

The agent loops forever.

as_node mocks the node's output, resolving the edge transitions that wait for that specific node's signature.

60

LangGraph & Agentic AI

Lec5

Human-in-the-Loop

In a scenario where an AI is suggesting Medical treatment protocols, how might interrupt_after be used successfully in a LangGraph structure?

Hard

1

A

Pausing after the Generate_Diagnosis node, sending the raw output downstream to a UI so a Senior Doctor can review and inject corrections before the Finalize_Report executes.

Halting the system if the internet disconnects.

Interrupting the LLM mid-token generation.

Making the LLM stream results to a text-to-speech engine.

This allows the state to fully materialize the AI’s proposal, giving the human doctor a complete object to assess before continuing.


LLMOps and Evaluation Theory#

LLMOps and Evaluation Question Bank#

No.

Training Unit

Lecture

Training content

Question

Level

Mark

Answer

Answer Option A

Answer Option B

Answer Option C

Answer Option D

Explanation

1

Unit 1: LLMOps

Lec2

RAGAS Metrics

What does the Faithfulness metric measure in RAGAS?

Easy

1

A

The truthfulness of the generated answer compared to the retrieved context

The relevance of the answer to the original question

The accuracy of the ranking of contexts

The coverage of the retrieval process

Faithfulness checks if all statements in the answer can be supported by the retrieved context, avoiding hallucinations.
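The scoring step reduces to a supported-statement ratio, sketched below; in real RAGAS the per-statement verdicts come from an LLM judge:

```python
def faithfulness_score(verdicts):
    # verdicts: 1 if the decomposed statement is supported by the
    # retrieved context, 0 otherwise. Score = supported ratio.
    return sum(verdicts) / len(verdicts) if verdicts else 0.0
```

For example, 2 supported statements out of 3 yields roughly 0.67, matching the worked case later in this bank.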

2

Unit 1: LLMOps

Lec2

RAGAS Metrics

Which LLM framework is RAGAS designed to evaluate?

Easy

1

B

Agents

RAG systems

Fine-tuned models

Traditional Search Engines

Ragas is an automated evaluation framework designed specifically for RAG systems.

3

Unit 1: LLMOps

Lec2

RAGAS Metrics

What manual data annotation is required when using RAGAS?

Easy

1

C

Large scale human annotations

Only expert domain knowledge

Nothing, it uses LLMs like GPT-4 to automate evaluation

Both standard Q&A pairs and ranking queries

Unlike traditional methods, Ragas uses LLMs to automate the evaluation process without needing heavy human annotations.

4

Unit 1: LLMOps

Lec2

RAGAS Metrics

Which dimension is measured by Context Precision?

Easy

1

C

Quality of generation

Semantic similarity to the user query

Accuracy of the retrieval process

Coverage of expected facts

Context Precision measures the accuracy of the retrieval process by assessing the ranking of contexts.

5

Unit 1: LLMOps

Lec2

RAGAS Metrics

What is the main purpose of Answer Relevancy?

Easy

1

D

Fact-checking the answer

Verifying truthfulness

Guaranteeing context coverage

Measuring relevance between answer and original question

It evaluates the relevance between the answer and question to confirm it addresses the problem asked.

6

Unit 1: LLMOps

Lec2

RAGAS Metrics

What value range do Ragas metrics return?

Easy

1

B

0 to 100

0 to 1

-1 to 1

1 to 5

Each metric gives a value from 0 to 1, with higher values indicating better quality.

7

Unit 1: LLMOps

Lec2

RAGAS Metrics

Which metric evaluates if relevant chunks are ranked high in retrieved contexts?

Easy

1

C

Faithfulness

Context Recall

Context Precision

Answer Relevancy

Context Precision checks if relevant chunks are ranked high in the list of retrieved contexts.

8

Unit 1: LLMOps

Lec2

RAGAS Metrics

How many main metrics are covered in the RAGAS documentation?

Easy

1

A

4

5

3

6

The four main metrics are faithfulness, answer relevancy, context precision, and context recall.

9

Unit 1: LLMOps

Lec2

RAGAS Metrics

If Context Recall is 0, what does that indicate?

Easy

1

A

Retriever failed to find necessary context

Rank 1 is an irrelevant context

LLM generated hallucination

The answer is irrelevant to the query

It indicates the retriever failed to find context containing necessary information to answer the question.

10

Unit 1: LLMOps

Lec2

RAGAS Metrics

Which two metrics evaluate the “retrieval” performance?

Easy

1

B

Faithfulness & Answer Relevancy

Context Precision & Context Recall

Answer Relevancy & Context Recall

Context Precision & Faithfulness

Context precision and context recall evaluate retrieval performance.

11

Unit 1: LLMOps

Lec2

RAGAS Metrics

Describe the calculation process for Faithfulness in Ragas.

Medium

2

A

Decompose answer to statements, verify against context, calculate ratio

Generate questions, embed them, calculate cosine similarity

Determine context relevance, calculate Precision@k, aggregate

Decompose reference answer, verify if inferences exist in retrieved context

The process is: Decomposition (claims), Verification (checked against context), and Scoring (ratio).

12

Unit 1: LLMOps

Lec2

RAGAS Metrics

How does Answer Relevancy determine its score technically?

Medium

2

C

By classifying the answer using a trained classifier

By matching keywords between answer and question

By reverse-engineering questions from answer and calculating embedding cosine similarity

By comparing the character count of answer vs question

LLM generates N questions from the given answer, converts them to embeddings, and compares cosine similarity with the original question.
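The final aggregation can be sketched with plain cosine similarity; in practice the vectors come from an embedding model, and the helper names here are illustrative:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def answer_relevancy(question_vec, generated_question_vecs):
    # Average cosine similarity between the original question embedding
    # and the N questions reverse-engineered from the answer.
    sims = [cosine(question_vec, g) for g in generated_question_vecs]
    return sum(sims) / len(sims)
```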

13

Unit 1: LLMOps

Lec2

RAGAS Metrics

A low Context Recall score means what in terms of information availability?

Medium

2

D

The information is hallucinated

The answer has redundant information

The retrieved information is scattered

The necessary facts from the reference answer are missing in the retrieved contexts

It means the necessary information from the reference answer was not found in the retrieved contexts.

14

Unit 1: LLMOps

Lec2

RAGAS Metrics

In Context Precision calculation, what is \(v_k\)?

Medium

2

C

Velocity of retrieval

Volume of chunks

Relevance indicator at position k

Value of cosine similarity

\(v_k \in \{0, 1\}\) is the relevance indicator at position k.
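The RAGAS-style formula, \(\text{Context Precision} = \sum_k (\text{Precision@}k \cdot v_k) / (\text{number of relevant chunks})\), sketched in code:

```python
def context_precision(v):
    # v: relevance indicators v_k in {0, 1}, one per retrieved rank k.
    relevant = sum(v)
    if relevant == 0:
        return 0.0
    hits, total = 0, 0.0
    for k, vk in enumerate(v, start=1):
        hits += vk
        total += (hits / k) * vk  # Precision@k counts only up to rank k
    return total / relevant
```

With [1, 0, 1] the score is (1 + 2/3) / 2 ≈ 0.83: relevant chunks ranked high score better than the same chunks ranked low.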

15

Unit 1: LLMOps

Lec2

RAGAS Metrics

Why might an answer score high in Faithfulness but low in Answer Relevancy?

Medium

2

B

The answer is hallucinated but relevant

The answer is entirely true based on context but fails to address the user’s specific question

The retriever brought back poor context

The context precision is very low

It can be completely faithful to retrieved context, but that context (and answer) might not be what the user asked for.

16

Unit 1: LLMOps

Lec2

RAGAS Metrics

Why is Faithfulness strictly compared to retrieved context and not world knowledge?

Medium

2

A

To prevent LLM hallucinations from being counted as correct if the retriever failed

Ragas has no access to world knowledge

The LLM doesn’t know facts

World knowledge costs more tokens

RAG’s core value is grounding generation on specific private/provided context, so it measures adherence to that context only to prevent unaccounted hallucinations.

17

Unit 1: LLMOps

Lec2

RAGAS Metrics

If LLM splits an answer into 3 statements, and only 2 are verified in context, Faithfulness is?

Medium

2

B

0.5

0.67

0.33

1.0

Faithfulness relies on the ratio of correct statements: 2 out of 3 makes it ~0.67.

18

Unit 1: LLMOps

Lec2

RAGAS Metrics

Given a scenario where a user asks about Einstein’s death, but the context only contains his birth, and the LLM answers “Einstein died in 1955” using its internal knowledge. What are the RAGAS metric implications?

Hard

3

B

High Faithfulness, Low Answer Relevancy

Low Faithfulness, High Answer Relevancy

Low Faithfulness, Low Context Recall

High Context Precision, High Context Recall

It answers the user (High Relevancy), but the claim isn’t in context, making Faithfulness low.

19

Unit 1: LLMOps

Lec2

RAGAS Metrics

To improve Context Precision in a RAG pipeline, what architecture modification would you introduce?

Hard

3

C

Increase LLM temperature

Swap FAISS for ChromaDB

Add a Cross-encoder reranking step

Generate multiple answers and average them

Reranking specifically improves the order/ranking of retrieved chunks, heavily impacting Context Precision metrics.

20

Unit 1: LLMOps

Lec2

RAGAS Metrics

Detail the mathematical rationale behind using N reverse-engineered questions for calculating Answer Relevancy.

Hard

3

A

Averages out the stochastic nature of LLMs generating questions to provide a stable semantic similarity

It is required to satisfy vector dimensions

One question uses up too few tokens

N acts as a padding token for embeddings

Generating N questions and averaging their cosine similarities mitigates the variance inherent in LLM generation, ensuring a robust relevancy score.

21

Unit 2: Observability

Lec6

Observability Concepts

What is Observability in the context of LLM applications?

Easy

1

A

The ability to track flows, errors and costs of LLM apps acting as black boxes

A library for generating UI code

A vector database

The algorithm used for chunking texts

It tracks probabilistic components acting as black boxes, aiding in tracing, tracking costs, and debugging.

22

Unit 2: Observability

Lec6

LangFuse Basics

Which of these tools is known for being Open Source?

Easy

1

B

LangChain

LangFuse

LangSmith

OpenAI

LangFuse is a popular open-source tool focusing on engineering observability.

23

Unit 2: Observability

Lec6

Observability Challenges

What makes LLM applications harder to debug than traditional software?

Easy

1

C

They use more memory

They require internet connections

They involve probabilistic, non-deterministic components

They use Python

Traditional software is deterministic: the same input gives the same output. LLMs act as probabilistic black boxes.

24

Unit 2: Observability

Lec6

LangSmith Basics

Who built LangSmith?

Easy

1

B

Google

The LangChain Team

OpenAI

Meta

LangSmith is built by the LangChain team for native integration.

25

Unit 2: Observability

Lec6

LangFuse Integration

In LangFuse, what is used to automatically instrument LangChain chains?

Easy

1

C

System.out.println

VectorEmbeddings

CallbackHandler

FAISS

LangFuse provides a CallbackHandler that automatically instruments chains.

26

Unit 2: Observability

Lec6

Prompt Management

Why should you manage prompts in a tool like LangFuse instead of hardcoding in Git?

Easy

1

A

To allow non-engineers to tweak them

Because Git is too slow

Because Git charges per token

To hide prompts from developers

It acts as a CMS for prompts so non-engineers can comfortably inspect and tweak them.

27

Unit 2: Observability

Lec6

Setup

How can you enable LangSmith auto-tracing in a LangChain project usually?

Easy

1

D

Rewrite all code to use LangSmith classes

Contact support to enable it

Import enable_smith module

Just set environment variables

LangSmith integrates natively with LangChain; you usually need no code changes, just environment variables.

28

Unit 2: Observability

Lec6

Production Best Practices

What is the recommended tracing sampling rate for Production environments?

Easy

1

C

100%

50%

1-5% of traffic

None

In production, tracing every request is noisy and expensive, so sampling 1-5% of traffic, plus high-importance traces, is recommended.
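A minimal sampling-decision sketch (the rate and flag names are illustrative):

```python
import random

def should_trace(sample_rate=0.05, important=False):
    # Always trace requests flagged as important; otherwise sample
    # a small fraction of traffic to control noise and cost.
    return important or random.random() < sample_rate
```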

29

Unit 2: Observability

Lec6

Privacy

How should you handle PII data privacy before logging to a cloud observability tool?

Easy

1

B

Do nothing

Run PII Masking/Redaction functions

Encrypt with simple base64

Delete all logs

Never log sensitive data; run PII Masking or use enterprise redacting features.
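A naive regex-based redaction sketch; production systems typically use NER-based PII detectors instead of hand-written patterns:

```python
import re

def mask_pii(text):
    # Replace email addresses and US-style phone numbers with placeholders
    # before the text leaves the secure perimeter.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b", "[PHONE]", text)
    return text
```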

30

Unit 2: Observability

Lec6

Alerts

What is an example of a good alert to set up in observability?

Easy

1

A

Error Rate Spike > 10% in 5 min

“Hello World” printed

CPU temperature

Single user logged out

You should alert on things like Error Rate > 10%, Latency Spikes, or Cost Anomalies.

31

Unit 2: Observability

Lec6

LangFuse vs LangSmith

If self-hosting data privacy is an absolute requirement and budget is zero, which tool is recommended?

Medium

2

C

Weights & Biases

LangSmith

LangFuse

CloudWatch

LangFuse is Open Source (MIT) and offers easy self-hosting (Docker Compose) for free.

32

Unit 2: Observability

Lec6

LangSmith Playground

What is the “Playground: Edit and Re-run” feature in LangSmith useful for?

Medium

2

A

You can take a failed production trace, change the prompt, and test a fix immediately

Training new models

Deploying code to AWS

Chatting with other developers

It allows you to take failed real-world traces and edit prompts/parameters to instantly see if the issue resolves.

33

Unit 2: Observability

Lec6

Latency Debugging

If a RAG request takes 10 seconds, how does tracing help?

Medium

2

B

It makes the query faster

It breaks down the latency per component (e.g., Vector DB vs API completion)

It charges the user for the wait time

It cancels requests longer than 5 seconds

Tracing visualizes the execution flow, pinpointing exactly which step (Vector Search vs Generate) is the bottleneck.

34

Unit 2: Observability

Lec6

Cost Tracking

Why is Cost Tracking a critical feature in LLM Observability compared to traditional app monitoring?

Medium

2

D

Because AWS charges are cheap

Because you don’t need servers

Because LLMs don’t cost real money

Because LLM API calls are charged per-token and single runaway loops can cost hundreds of dollars quickly

API calls are expensive, requiring real-time tracking to prevent unmanaged financial overruns.

35

Unit 2: Observability

Lec6

Langchain Integration

What environment variable activates LangSmith tracing?

Medium

2

B

LANGCHAIN_DEBUG=1

LANGCHAIN_TRACING_V2=true

LANGCHAIN_LOG=all

LANGSMITH_ACTIVE=1

export LANGCHAIN_TRACING_V2=true activates LangSmith native tracing.
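A typical environment setup might look like the following; the project name is illustrative, and the API key should stay out of source control:

```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
export LANGCHAIN_PROJECT="my-rag-app"
```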

36

Unit 2: Observability

Lec6

Prompt CMS

How do you fetch a production prompt dynamically using LangFuse SDK?

Medium

2

A

Using langfuse.get_prompt(name, version)

Reading from a local .json file

Executing a GraphQL query to Github

Using prompt = os.getenv('PROMPT')

Langfuse acts as a CMS and lets you retrieve prompts with get_prompt, pinning either a specific version number or a deployment label such as "production".

37

Unit 2: Observability

Lec6

Alerts & Best Practices

Why shouldn’t you just “stare at dashboards” for production LLM apps?

Medium

2

A

You need automated alerts (error spikes, costs) to respond fast to anomalies

Dashboards are always broken

It slows down the computer

Observability doesn’t provide dashboards

Dashboards are passive. Automated alerts are needed to actively manage sudden cost, latency, or error anomalies.

38

Unit 2: Observability

Lec6

Advanced LangChain Integration

You have a complex application utilizing standard Python code, LangChain agent loops, and custom API calls. Should you prefer LangSmith or LangFuse, and why?

Hard

3

B

LangSmith, because it supports Python natively better

LangFuse, because it is platform-agnostic and instruments cleanly across non-LangChain code too.

LangSmith, because LangChain is mandatory.

LangFuse, because it has an “Edit and Re-run” playground.

LangFuse is platform-agnostic for non-LangChain code, making it better for mixed-stack integrations, while LangSmith is highly specific and native to LangChain execution loops.

39

Unit 2: Observability

Lec6

Debugging Scenarios

In production, users report the chatbot occasionally ignores their negative feedback instructions. How would you leverage LangSmith to resolve this?

Hard

3

C

By deleting the user history and trying again

Check the VectorDB logs

Locate the failed traces in LangSmith, transition them to the Playground, adjust the system prompt, and replay to verify compliance

Re-index the FAISS database

LangSmith’s Playground allows you to take directly failed traces, manipulate the prompt, and replay the exact trace environment to find the fix.

40

Unit 2: Observability

Lec6

Data Security Architecture

Explain a robust architectural design for handling HIPAA/PII compliance while using a SaaS LLM Observability platform like LangSmith Enterprise.

Hard

3

A

Run an edge/middleware service that performs localized PII Entity masking/redaction before transmitting traces to the LangSmith API

Avoid observability tools completely

Share passwords directly via the agent

Mask PII inside the LangSmith GUI

PII must not leave the secure perimeter; redaction must happen at the application layer or middleware before data is shipped via logs/traces.