Final Project Exam: FPT Customer Chatbot - Multi-Agent AI System#
Overview#
| Field | Value |
|---|---|
| Course | LangGraph and Agentic AI |
| Project Name | fpt-customer-chatbot-ai |
| Duration | 360 minutes (6 hours) |
| Passing Score | 70% |
| Total Points | 100 |
| Framework | Python 3.10+, LangGraph, LangChain, Tavily API, FAISS, OpenAI |
Description#
You have been hired as an AI Engineer at FPT Software, tasked with building a Multi-Agent Customer Service Chatbot AI Core that demonstrates mastery of all concepts covered in the LangGraph and Agentic AI module.
This final project consolidates all five assignments into a single comprehensive multi-agent system:
Assignment 01: LangGraph Foundations & State Management
Assignment 02: Multi-Expert ReAct Research Agent
Assignment 03: Tool Calling & Tavily Search Integration
Assignment 04: FPT Customer Chatbot - Multi-Agent System
Assignment 05: Human-in-the-Loop & Persistence
You will build the AI Core for an FPT Customer Chatbot with hierarchical multi-agent architecture, real-time web search, human approval workflows, response caching, and persistent state management.
This exam focuses purely on the AI/LangGraph logic. For the engineering layer (FastAPI, database, REST APIs), please refer to the Building Monolith API with FastAPI module's final exam.
Objectives#
By completing this exam, you will demonstrate mastery of:
State Management: Implementing messages-centric patterns with TypedDict and add_messages reducer
ReAct Pattern: Building reasoning + acting loops with iteration control
Tool Calling: Integrating external APIs (Tavily) with parallel execution
Multi-Agent Architecture: Designing hierarchical systems with specialized agents
Human-in-the-Loop: Implementing interrupt patterns for user confirmation
Persistence: Configuring checkpointers for long-running conversations
Caching: Building vector store-based response caching with FAISS
Problem Description#
Build the AI Core for an FPT Customer Service Chatbot named fpt-customer-chatbot-ai that includes:
| Agent | Responsibilities |
|---|---|
| Primary Assistant | Routes user queries to appropriate specialized agents |
| FAQ Agent | Answers FPT policy questions using RAG with cached responses |
| Ticket Agent | Handles ticket-related conversations with HITL approval (mock tools) |
| Booking Agent | Handles booking conversations with HITL confirmation (mock tools) |
| IT Support Agent | Troubleshoots technical issues using Tavily search + caching |
The system must:
Maintain conversation context across multiple turns
Require human confirmation before sensitive operations
Cache responses for similar queries
Persist state across process restarts
Handle agent transitions gracefully with dialog stack
The Ticket and Booking agents will use mock tools that simulate database operations. The actual database integration is covered in the FastAPI module exam.
Prerequisites#
Completed all 5 module assignments (recommended)
OpenAI API key (OPENAI_API_KEY)
Tavily API key (TAVILY_API_KEY)
Python 3.10+ with virtual environment
Familiarity with Pydantic for schema validation
Technical Requirements#
Environment Setup#
Python 3.10 or higher
Required packages:
langgraph >= 0.2.0
langchain >= 0.1.0
langchain-openai >= 0.1.0
langchain-community >= 0.1.0
tavily-python >= 0.3.0
faiss-cpu >= 1.7.0
sentence-transformers >= 2.2.0
pydantic >= 2.0.0
Mock Data Models#
For testing purposes, define the following Pydantic models (actual database integration is in FastAPI module):
Ticket Model:
| Field | Type | Constraints |
|---|---|---|
| ticket_id | str | Auto-generated UUID |
| content | str | Required |
| description | str \| None | Optional |
| customer_name | str | Required |
| customer_phone | str | Required |
| customer_email | str \| None | Optional |
| status | TicketStatus | Pending/InProgress/Resolved/Canceled |
| created_at | datetime | Auto-set |
Booking Model:
| Field | Type | Constraints |
|---|---|---|
| booking_id | str | Auto-generated UUID |
| reason | str | Required |
| time | datetime | Required, must be in the future |
| customer_name | str | Required |
| customer_phone | str | Required |
| customer_email | str \| None | Optional |
| note | str \| None | Optional |
| status | BookingStatus | Scheduled/Finished/Canceled |
Tasks#
Task 1: State Management Foundation (15 points)#
Time Allocation: 60 minutes
Build the core state management infrastructure for the multi-agent system.
Requirements:#
Define AgenticState using TypedDict with:
messages: Using the Annotated[List[AnyMessage], add_messages] pattern
dialog_state: Stack for tracking the agent hierarchy
user_id, email (optional): Context injection fields
conversation_id: Session tracking
Implement dialog stack functions:
update_dialog_stack(left, right): Push/pop agent transitions
pop_dialog_state(state): Return to Primary Assistant
Create context injection that auto-populates user info into tool calls
Configure MemorySaver checkpointer for initial development
Deliverables:#
state/agent_state.py - State definition with all fields
state/dialog_stack.py - Stack management functions
state/context_injection.py - User context injection logic
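A minimal, dependency-free sketch of the two dialog-stack functions (in the real state definition, `dialog_state` would be declared as `Annotated[list[str], update_dialog_stack]` so LangGraph applies the reducer automatically):

```python
from typing import List, Optional


def update_dialog_stack(left: List[str], right: Optional[str]) -> List[str]:
    """Reducer for dialog_state: push a new agent name, pop with the
    sentinel "pop", or leave the stack unchanged when right is None."""
    if right is None:
        return left
    if right == "pop":
        return left[:-1]
    return left + [right]


def pop_dialog_state(state: dict) -> dict:
    """Node helper: signal a return to the Primary Assistant by popping
    the top agent off the stack via the reducer's "pop" sentinel."""
    return {"dialog_state": "pop"}
```

Keeping the reducer pure makes it trivial to unit-test agent transitions without running the graph.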
Task 2: Specialized Agents Implementation (25 points)#
Time Allocation: 120 minutes
Implement all four specialized agents with their tools and schemas.
Requirements:#
Ticket Support Agent (8 points):
Define Pydantic schemas: CreateTicket, TrackTicket, UpdateTicket, CancelTicket
Implement mock tools that simulate CRUD operations (return success messages, store in an in-memory dict)
Status transitions: Pending → InProgress → Resolved (or Canceled)
Add a CompleteOrEscalate tool for returning to the Primary Assistant
Tools should accept and validate all required fields
Booking Agent (7 points):
Define Pydantic schemas with time validation (must be future)
Implement mock tools: BookRoom, TrackBooking, UpdateBooking, CancelBooking
Status transitions: Scheduled → Finished (or Canceled)
Include the CompleteOrEscalate tool
IT Support Agent (5 points):
Integrate Tavily Search with max_results: 5, search_depth: "advanced"
Return practical troubleshooting guides from reliable sources
Include the CompleteOrEscalate tool
FAQ Agent (5 points):
Implement simple RAG for FPT policy questions
Return answers with source references
Include the CompleteOrEscalate tool
Mock tools should use an in-memory dictionary to store data for testing. This allows the AI system to function independently without database dependencies. The actual database integration will be handled in the FastAPI module exam.
Example mock implementation pattern:
```python
import uuid

from langchain_core.tools import tool

# In-memory storage for testing
_ticket_store: dict[str, dict] = {}

@tool
def create_ticket(content: str, customer_name: str, customer_phone: str, ...) -> str:
    """Create a new support ticket."""
    ticket_id = str(uuid.uuid4())
    _ticket_store[ticket_id] = {...}
    return f"Ticket created successfully with ID: {ticket_id}"
```
Deliverables:#
agents/ticket_agent.py - Ticket Support Agent with mock tools
agents/booking_agent.py - Booking Agent with mock tools
agents/it_support_agent.py - IT Support Agent with Tavily
agents/faq_agent.py - FAQ Agent with RAG
schemas/ directory with all Pydantic models
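The status transitions above can be enforced inside the mock update/cancel tools with a small lookup table; a sketch (the enum string values are assumptions — the spec only names the statuses):

```python
from enum import Enum


class TicketStatus(str, Enum):
    PENDING = "Pending"
    IN_PROGRESS = "InProgress"
    RESOLVED = "Resolved"
    CANCELED = "Canceled"


# Allowed transitions per the task description; terminal states allow none.
_ALLOWED = {
    TicketStatus.PENDING: {TicketStatus.IN_PROGRESS, TicketStatus.CANCELED},
    TicketStatus.IN_PROGRESS: {TicketStatus.RESOLVED, TicketStatus.CANCELED},
    TicketStatus.RESOLVED: set(),
    TicketStatus.CANCELED: set(),
}


def can_transition(current: TicketStatus, target: TicketStatus) -> bool:
    """True if the ticket may move from current to target status."""
    return target in _ALLOWED[current]
```

The mock `update_ticket` tool can call `can_transition` before mutating the in-memory store and return a friendly error message otherwise.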
Task 3: Primary Assistant & Graph Construction (20 points)#
Time Allocation: 90 minutes
Build the Primary Assistant and construct the complete multi-agent graph.
Requirements:#
Define routing tools for Primary Assistant:
ToTicketAssistant: Route ticket-related queries
ToBookingAssistant: Route booking-related queries
ToITAssistant: Route technical issues
ToFAQAssistant: Route policy questions
Include user context injection in all routing tools
Implement entry nodes for agent transitions:
Create a create_entry_node(assistant_name) factory function
Entry nodes push the new agent onto the dialog_state stack
Generate an appropriate welcome message
Build StateGraph with:
Primary Assistant as entry point
All specialized agent nodes
ToolNode for each agentβs tools
Conditional routing based on intent
Edge handling for CompleteOrEscalate
Create tool_node_with_fallback for graceful error handling
Deliverables:#
agents/primary_assistant.py - Primary Assistant with routing
graph/entry_nodes.py - Entry node factory function
graph/builder.py - Complete graph construction
graph/routing.py - Conditional routing logic
Graph visualization PNG using get_graph().draw_mermaid_png()
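One way the entry-node factory could look — plain dicts stand in for LangChain message objects here, and the hand-off text is an assumption:

```python
def create_entry_node(assistant_name: str, new_dialog_state: str):
    """Factory: build a graph node that pushes the target agent onto the
    dialog stack and adds a hand-off message for the new assistant."""
    def entry_node(state: dict) -> dict:
        return {
            "messages": [
                {
                    # Real code would construct a ToolMessage tied to the
                    # routing tool call; a dict keeps this sketch runnable.
                    "role": "tool",
                    "content": (
                        f"You are now the {assistant_name}. "
                        "Review the conversation above and assist the user."
                    ),
                }
            ],
            # Consumed by the dialog_state stack reducer (push semantics).
            "dialog_state": new_dialog_state,
        }
    return entry_node
```

The matching exit path is a node that returns `{"dialog_state": "pop"}`, which the stack reducer turns into a return to the Primary Assistant.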
Task 4: Human-in-the-Loop Confirmation (20 points)#
Time Allocation: 90 minutes
Implement interrupt patterns for sensitive operations.
Requirements:#
Configure interrupt_before for sensitive tools:
All ticket creation/update/cancel operations
All booking creation/update/cancel operations
NOT for read operations (track) or search operations
Implement confirmation flow:
Detect pending tool state via graph.get_state(config)
Generate a human-readable confirmation message
Parse the user response: "y" to continue, anything else to cancel
Create confirmation message generator:
Extract tool name and arguments from pending state
Format readable summary for user review
Include clear instructions for approval/rejection
Handle user responses:
"y" or "yes": Resume execution with app.invoke(None, config)
Other: Update state to cancel the operation and return a message
Log all confirmation decisions
Log all confirmation decisions
Deliverables:#
hitl/interrupt_config.py - List of sensitive tools
hitl/confirmation.py - Confirmation flow logic
hitl/message_generator.py - Human-readable message formatting
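The message generator and approval parsing can be plain functions; a dependency-free sketch (in the full flow, the tool name and arguments would be read from the pending tool call in `graph.get_state(config)`):

```python
def format_confirmation(tool_name: str, tool_args: dict) -> str:
    """Render a pending tool call as a human-readable approval prompt."""
    lines = [f"The assistant wants to run: {tool_name}"]
    for key, value in tool_args.items():
        lines.append(f"  - {key}: {value}")
    lines.append("Type 'y' to approve, or anything else to cancel.")
    return "\n".join(lines)


def is_approved(user_reply: str) -> bool:
    """'y'/'yes' approves (case-insensitive, surrounding whitespace ignored)."""
    return user_reply.strip().lower() in {"y", "yes"}
```

Keeping approval parsing in one place makes it easy to later extend (e.g., "edit" to modify arguments before resuming).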
Task 5: Response Caching with FAISS (10 points)#
Time Allocation: 60 minutes
Implement vector store-based caching for RAG and IT Support responses.
Requirements:#
Create cache_tool that:
Stores all RAG and IT Support responses in FAISS vectorstore
Indexes by query embedding using sentence-transformers
Stores metadata: timestamp, query_type, source_agent
Implement cache lookup in orchestrator:
Before calling RAG/IT tools, check cache for similar queries
Use similarity threshold (0.85) to determine cache hit
Return cached response if found, otherwise proceed to tool
Add cache management:
TTL-based invalidation (24 hours)
Manual cache clear capability
Cache statistics logging (hits, misses, hit rate)
Deliverables:#
cache/faiss_cache.py - FAISS caching implementation
cache/cache_manager.py - Cache management and TTL logic
cache/cache_stats.py - Statistics tracking
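The lookup logic can be prototyped without FAISS; below is a sketch using brute-force cosine similarity over a caller-supplied embedding function (class and attribute names are illustrative — the actual deliverable should back this with a FAISS index and sentence-transformers embeddings):

```python
import math
import time


class SimilarityCache:
    """Tiny stand-in for the FAISS cache: cosine similarity over vectors
    from a caller-supplied embed function, with TTL expiry and hit stats."""

    def __init__(self, embed, threshold: float = 0.85, ttl_seconds: float = 86400):
        self.embed = embed          # e.g. a sentence-transformers encode call
        self.threshold = threshold
        self.ttl = ttl_seconds
        self.entries = []           # list of (vector, response, stored_at)
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def put(self, query: str, response: str) -> None:
        self.entries.append((self.embed(query), response, time.time()))

    def get(self, query: str):
        """Return a cached response for a similar, unexpired query, else None."""
        now = time.time()
        vec = self.embed(query)
        for stored_vec, response, stored_at in self.entries:
            if now - stored_at > self.ttl:
                continue  # expired entry: treat as absent
            if self._cosine(vec, stored_vec) >= self.threshold:
                self.hits += 1
                return response
        self.misses += 1
        return None
```

Swapping `embed` for a real model and the linear scan for a FAISS index changes none of the TTL or statistics logic, which is why splitting cache management into its own module pays off.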
Task 6: Persistence & Production Readiness (10 points)#
Time Allocation: 60 minutes
Configure persistent state and production-ready error handling.
Requirements:#
Replace MemorySaver with SQLiteSaver:
Configure persistent storage in checkpoints.db
Test conversation resumption after process restart
Document the migration path to PostgresSaver
Implement thread management:
List active threads
View checkpoint history for a thread
Delete old threads (cleanup)
Add error handling and logging:
Structured logging with conversation context
Graceful error recovery for tool failures
User-friendly error messages
Deliverables:#
persistence/checkpointer.py - SQLiteSaver configuration
persistence/thread_manager.py - Thread management utilities
utils/logging.py - Structured logging setup
utils/error_handler.py - Error handling utilities
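For the structured-logging deliverable, a stdlib-only sketch that stamps every record with conversation context (the logger name and format string are assumptions):

```python
import logging


def get_conversation_logger(conversation_id: str, user_id: str) -> logging.LoggerAdapter:
    """Return a LoggerAdapter that injects conversation context into every
    record, so tool failures can be traced back to a specific thread."""
    logger = logging.getLogger("fpt_chatbot")
    if not logger.handlers:  # configure once
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s [conv=%(conversation_id)s "
            "user=%(user_id)s] %(message)s"
        ))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logging.LoggerAdapter(
        logger, {"conversation_id": conversation_id, "user_id": user_id}
    )
```

`LoggerAdapter` merges its `extra` dict into each record, so the formatter can reference `%(conversation_id)s` without every call site passing it explicitly.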
Test Scenarios#
Complete these test scenarios to demonstrate system functionality:
Scenario 1: Multi-Agent Conversation Flow#
User: "Hi, I need help with a few things"
→ Primary Assistant welcomes user
User: "My laptop won't connect to WiFi"
→ Routes to IT Support Agent
→ Tavily search for troubleshooting
→ Cache response
→ Return to Primary Assistant
User: "I need to book a meeting room for tomorrow 2pm"
→ Routes to Booking Agent
→ Shows confirmation prompt (HITL)
→ User confirms "y"
→ Booking created
→ Return to Primary Assistant
Scenario 2: HITL Rejection Flow#
User: "Create a support ticket for broken monitor"
→ Routes to Ticket Agent
→ Shows confirmation prompt
→ User rejects with "no, wait"
→ Operation cancelled
→ Agent asks for clarification
Scenario 3: Cache Hit Flow#
User: "How do I reset my password?" (first time)
→ FAQ Agent answers from RAG
→ Response cached
User: "Password reset instructions?" (similar query)
→ Cache hit detected (similarity > 0.85)
→ Return cached response
Scenario 4: Persistence Test#
1. Start conversation, create a ticket
2. Stop the process
3. Restart with same thread_id
4. Verify conversation history retained
5. Track the created ticket
Questions to Answer#
Include written responses to these questions in ANSWERS.md:
1. State Management: Explain why the add_messages reducer is essential for multi-turn conversations. What problems would occur without it?
2. Multi-Agent Architecture: Compare the dialog stack approach vs. flat routing. When would you choose one over the other?
3. Human-in-the-Loop Trade-offs: What are the UX implications of requiring confirmation for every sensitive action? How would you balance security vs. user experience?
4. Caching Strategy: How would you handle cache invalidation when the underlying FAQ documents are updated? Propose a solution.
5. Production Considerations: What additional features would you add before deploying this system to production? Consider: monitoring, scaling, security.
Submission Requirements#
Directory Structure#
fpt-customer-chatbot-ai/
├── agents/
│   ├── primary_assistant.py
│   ├── ticket_agent.py
│   ├── booking_agent.py
│   ├── it_support_agent.py
│   └── faq_agent.py
├── schemas/
│   ├── ticket_schemas.py
│   └── booking_schemas.py
├── state/
│   ├── agent_state.py
│   ├── dialog_stack.py
│   └── context_injection.py
├── tools/
│   ├── ticket_tools.py      # Mock tools for ticket operations
│   ├── booking_tools.py     # Mock tools for booking operations
│   └── mock_store.py        # In-memory storage for testing
├── graph/
│   ├── builder.py
│   ├── entry_nodes.py
│   └── routing.py
├── hitl/
│   ├── interrupt_config.py
│   ├── confirmation.py
│   └── message_generator.py
├── cache/
│   ├── faiss_cache.py
│   ├── cache_manager.py
│   └── cache_stats.py
├── persistence/
│   ├── checkpointer.py
│   └── thread_manager.py
├── utils/
│   ├── logging.py
│   └── error_handler.py
├── data/
│   └── fpt_policies.txt (or .json)
├── main.py
├── requirements.txt
├── README.md
├── ANSWERS.md
└── graph_visualization.png
This AI core is designed to be integrated with the FastAPI backend from the Building Monolith API with FastAPI module. The mock tools in tools/ directory can be replaced with actual database operations when integrating.
Required Deliverables#
Complete source code following directory structure
README.md with:
Setup instructions (environment, API keys, dependencies)
Usage examples and CLI commands
Architecture diagram or explanation
Notes on how to integrate with FastAPI backend
ANSWERS.md with written responses to all 5 questions
requirements.txt with all dependencies
graph_visualization.png - Multi-agent graph visualization
Demo video or screenshots showing:
All four agent flows working
HITL confirmation workflow
Cache hit scenario
Persistence across restart
Submission Checklist#
All code runs without errors
All four specialized agents functional with mock tools
Primary Assistant routes correctly
HITL confirmation works for sensitive operations
Cache stores and retrieves responses
SQLiteSaver enables conversation persistence
Dialog stack tracks agent hierarchy
Context injection auto-populates user info
All test scenarios pass
Documentation is complete
Evaluation Criteria#
| Criteria | Points | Excellent (100%) | Good (75%) | Needs Improvement (50%) |
|---|---|---|---|---|
| State Management (Task 1) | 15 | Perfect messages pattern, dialog stack, injection | Working but minor issues in context handling | Basic state only, missing stack or injection |
| Specialized Agents (Task 2) | 25 | All agents with complete tools and validation | Most agents working, some validation missing | Only 1-2 agents functional |
| Graph Construction (Task 3) | 20 | Complete graph with all routing and fallbacks | Graph works but missing error handling | Basic graph without proper routing |
| Human-in-the-Loop (Task 4) | 20 | Smooth confirmation UX with proper state handling | HITL works but UX needs improvement | Basic interrupt without proper messaging |
| Response Caching (Task 5) | 10 | Full caching with TTL and statistics | Caching works but missing TTL or stats | Basic storage without similarity search |
| Persistence & Production (Task 6) | 10 | SQLite with thread management and error handling | Persistence works but limited management | MemorySaver only, no persistence |
| **Total** | **100** | | | |
Hints#
Use state["messages"][-1] to access the most recent message
The add_messages reducer handles message deduplication automatically
Store dialog_state as a list for stack operations (append/pop)
Use ToolNode(tools).with_fallbacks([...]) for graceful error handling
The CompleteOrEscalate tool should return a flag that routing can detect
Entry nodes should push to the stack, exit nodes should pop
Access pending state with app.get_state(config).next to see which node is pending
Use app.update_state(config, values) to modify state before resuming
Consider timeout handling for user confirmation
Use sentence-transformers/all-MiniLM-L6-v2 for consistent embeddings
Store the original query and response as metadata, not just the embedding
Implement cache warmup for common queries
SQLiteSaver requires a context manager: with SqliteSaver.from_conn_string(...) as saver:
Thread IDs should be user-meaningful (e.g., user123-session1)
Consider implementing session timeout (24h default)