Practice#
Base URL#
https://yourapp.com/backend-api/
1. Redis Caching Strategy#
The goal of this practice is to integrate Redis caching into the Chatbot API to improve performance and reduce database load.
Target Endpoints to Cache#
| Endpoint | Cache Strategy | Cache Key | TTL |
|---|---|---|---|
| Get Conversation Detail | Cache-Aside | conv:{conversation_id}:history | 10 min |
| List Conversations | Cache-Aside | user:{user_id}:conversation_list | 10 min |
1.1 Cache Conversation History#
GET /backend-api/conversation/{conversation_id}

Retrieve message history. Try to get it from Redis first; if missing, fetch from the DB and write it back to Redis.
Implementation Logic#
1. Check Cache: GET conv:{conversation_id}:history
2. Cache Hit: Return the JSON immediately (latency < 5 ms).
3. Cache Miss: Query the database for messages.
4. Write to Cache: SETEX conv:{conversation_id}:history 600 <json_data>
   Note: We set a 600-second TTL. After 10 minutes this data is deleted to free up RAM. On cloud providers (e.g., AWS ElastiCache), saving RAM prevents the need to scale up to expensive larger nodes.
5. Return Data.
Cache Key Structure#
Key: conv:690d5b6c-02d8-8321-a91e-65ea55b781f7:history
Value: [JSON String of Messages]
TTL: 600 seconds
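The cache-aside read above can be sketched as follows. This is a minimal sketch: `get_conversation_history` and the injected `db_fetch` callable are illustrative names, not part of the project; the key format and 600-second TTL come from the section above.

```python
import json


def get_conversation_history(redis_client, db_fetch, conversation_id):
    """Cache-aside read: try Redis first, fall back to the database on a miss."""
    key = f"conv:{conversation_id}:history"
    cached = redis_client.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit: no DB round-trip
    messages = db_fetch(conversation_id)    # cache miss: query the database
    redis_client.setex(key, 600, json.dumps(messages))  # write back, 10-min TTL
    return messages
```

Passing the client and the DB fetcher as parameters keeps the helper easy to test and swap; in the real app both would come from FastAPI dependencies.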
1.2 Cache User Conversation List#
GET /backend-api/conversations?offset=0&limit=20

List all conversations. This query can be heavy on the DB if the user has many chats.
Implementation Logic#
1. Check Cache: GET user:{user_id}:conversation_list
2. Cache Hit: Return the cached list.
3. Cache Miss: Query the database (SELECT * FROM conversations WHERE user_id = …).
4. Write to Cache: SETEX user:{user_id}:conversation_list 600 <json_data>
5. Return Data.

Invalidation Strategy:
When a NEW conversation is created (POST /backend-api/f/conversation), you must DELETE this cache key so the list updates immediately.
Command: DEL user:{user_id}:conversation_list
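The invalidation step might look like this. A minimal sketch: `create_conversation` and the `db_insert` callable are hypothetical helpers for illustration; the key being deleted is the one defined above.

```python
def create_conversation(redis_client, db_insert, user_id, title):
    """Write the new conversation to the DB, then drop the stale list cache.

    The next GET /backend-api/conversations will miss the cache and
    repopulate it with the fresh list.
    """
    conversation = db_insert(user_id, title)                 # 1. persist first
    redis_client.delete(f"user:{user_id}:conversation_list")  # 2. then invalidate
    return conversation
```

Deleting (rather than updating) the key is the simplest correct choice here: the next read rebuilds the list from the source of truth.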
2. Advanced: Caching for LLM Context#
POST /backend-api/f/conversation

Chatting with the AI requires sending the previous context with every message.
Optimization Challenge#
Instead of fetching the full history from PostgreSQL for every message sent:
1. Store Context in a Redis List: Use RPUSH to append new user/assistant messages to chat:{id}:context.
2. Limit the Context Window: Use LTRIM to keep only the last 20 messages. This ensures we don't exceed the LLM's token limit and keeps Redis memory usage low.
```python
# Context caching with redis-py
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def add_message_to_context(conversation_id, message):
    key = f"chat:{conversation_id}:context"
    r.rpush(key, json.dumps(message))  # append the newest message
    r.ltrim(key, -20, -1)              # keep only the last 20 messages
    r.expire(key, 86400)               # expire the context after 24 h
```
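Before calling the LLM, the read side would pull the trimmed window back with LRANGE. A sketch under the same key layout; `get_context` is an illustrative name:

```python
import json


def get_context(redis_client, conversation_id):
    """Return the trimmed context window as a list of message dicts."""
    key = f"chat:{conversation_id}:context"
    # LRANGE 0 -1 returns the whole list, which LTRIM has capped at 20 entries
    raw = redis_client.lrange(key, 0, -1)
    return [json.loads(item) for item in raw]
```

If the key has expired (or never existed), this returns an empty list, and the caller can fall back to loading history from the database.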
3. Updated Folder Layout (FastAPI-Style)#
Add the redis_client.py and cache_service.py to your project structure.
```
app/
├── main.py
├── api/
│   ├── routes/
│   │   ├── conversation.py          # Modified to use cache_service
│   │   └── message.py
│   └── dependencies.py
├── core/
│   ├── config.py
│   ├── database.py
│   └── redis_client.py              # logic: redis.Redis(host=...)
├── models/
│   ├── user.py
│   ├── conversation.py
│   └── message.py
├── schemas/
│   ├── conversation.py
│   └── user.py
└── services/
    ├── ai_service.py
    ├── cache_service.py             # generic get/set/delete logic
    └── conversation_service.py      # Helper: get_conversation_with_cache()
```
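A minimal cache_service.py could wrap the generic get/set/delete logic like this. The exact class shape is an assumption; it accepts any client exposing get/setex/delete (e.g. the redis.Redis instance from redis_client.py), and handles JSON (de)serialization in one place:

```python
import json


class CacheService:
    """Thin JSON-aware wrapper over a Redis client (or any compatible stub)."""

    def __init__(self, client):
        # e.g. redis.Redis(host="localhost", port=6379, decode_responses=True)
        self.client = client

    def get(self, key):
        raw = self.client.get(key)
        return json.loads(raw) if raw is not None else None

    def set(self, key, value, ttl=600):
        # SETEX stores the value and the TTL atomically
        self.client.setex(key, ttl, json.dumps(value))

    def delete(self, key):
        self.client.delete(key)
```

Routes and services then depend only on this small interface, which keeps the Redis details in one module and makes the callers trivial to unit-test.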
4. Request Flow with Caching#
1. Frontend sends GET /conversation/{id}.
2. Backend calls CacheService.get(f"conv:{id}:history").
3. Found? Return immediately.
4. Not found? Fetch from the DB -> CacheService.set(..., ttl=600) -> Return.
5. Frontend sends POST /conversation (new message).
6. Backend saves the message to the DB.
7. Backend invalidates the cache: CacheService.delete(f"conv:{id}:history").
8. (Optional) Backend updates the Redis List context.
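The POST half of the flow (save, invalidate, update context) could be combined in one helper. A sketch only: `handle_new_message` and the `db_save` callable are illustrative names, and the key formats are the ones defined in sections 1.1 and 2:

```python
import json


def handle_new_message(redis_client, db_save, conversation_id, message):
    """POST flow: persist, invalidate stale history, refresh the context window."""
    db_save(conversation_id, message)                       # 1. write to the database
    redis_client.delete(f"conv:{conversation_id}:history")  # 2. invalidate stale history
    key = f"chat:{conversation_id}:context"
    redis_client.rpush(key, json.dumps(message))            # 3. append to context list
    redis_client.ltrim(key, -20, -1)                        #    keep the last 20 messages
    redis_client.expire(key, 86400)                         #    refresh the 24 h TTL
```

Ordering matters: the DB write happens first so that a cache miss triggered right after the DELETE always sees the new message.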