Practice#
Base URL#
https://yourapp.com/backend-api/
1. Redis Caching Strategy#
The goal of this practice is to integrate Redis caching into the Chatbot API to improve performance and reduce database load.
Target Endpoints to Cache#
| Endpoint | Cache Strategy | Cache Key | TTL |
|---|---|---|---|
| Get Conversation Detail | Cache-Aside | conv:{conversation_id}:history | 10 min |
| List Conversations | Cache-Aside | user:{user_id}:conversation_list | 10 min |
1.1 Cache Conversation History#
GET /backend-api/conversation/{conversation_id}

Retrieve message history. Try to get it from Redis first; if missing, fetch from the DB and write it back to Redis.
Implementation Logic#
1. Check Cache: GET conv:{conversation_id}:history
2. Cache Hit: Return the JSON immediately (latency < 5 ms).
3. Cache Miss: Query the database for messages.
4. Write to Cache: SETEX conv:{conversation_id}:history 600 <json_data>
   Note: We set a 600-second TTL. After 10 minutes this data is deleted to free up RAM. On cloud providers (e.g., AWS ElastiCache), saving RAM prevents the need to scale up to expensive larger nodes.
5. Return Data.
Cache Key Structure#
Key: conv:690d5b6c-02d8-8321-a91e-65ea55b781f7:history
Value: [JSON String of Messages]
TTL: 600 seconds
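The cache-aside read above can be sketched as follows. This is a minimal sketch: `get_conversation_history` and the injected `db_fetch` callable are illustrative names, not part of the project; the key format and 600-second TTL come from the section above.

```python
import json


def get_conversation_history(redis_client, db_fetch, conversation_id):
    """Cache-aside read: try Redis first, fall back to the database on a miss."""
    key = f"conv:{conversation_id}:history"
    cached = redis_client.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit: no DB round-trip
    messages = db_fetch(conversation_id)    # cache miss: query the database
    redis_client.setex(key, 600, json.dumps(messages))  # write back, 10-min TTL
    return messages
```

Passing the client and the DB fetcher as parameters keeps the helper easy to test and swap; in the real app both would come from FastAPI dependencies.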
1.2 Cache User Conversation List#
GET /backend-api/conversations?offset=0&limit=20

List all conversations. This query can be heavy on the DB if the user has many chats.
Implementation Logic#
1. Check Cache: GET user:{user_id}:conversation_list
2. Cache Hit: Return the cached list.
3. Cache Miss: Query the database (SELECT * FROM conversations WHERE user_id = …).
4. Write to Cache: SETEX user:{user_id}:conversation_list 600 <json_data>
5. Return Data.

Invalidation Strategy:
When a NEW conversation is created (POST /backend-api/f/conversation), you must DELETE this cache key so the list updates immediately.
Command: DEL user:{user_id}:conversation_list
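The invalidation step might look like this. A minimal sketch: `create_conversation` and the `db_insert` callable are hypothetical helpers for illustration; the key being deleted is the one defined above.

```python
def create_conversation(redis_client, db_insert, user_id, title):
    """Write the new conversation to the DB, then drop the stale list cache.

    The next GET /backend-api/conversations will miss the cache and
    repopulate it with the fresh list.
    """
    conversation = db_insert(user_id, title)                 # 1. persist first
    redis_client.delete(f"user:{user_id}:conversation_list")  # 2. then invalidate
    return conversation
```

Deleting (rather than updating) the key is the simplest correct choice here: the next read rebuilds the list from the source of truth.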
2. Advanced: Caching for LLM Context#
POST /backend-api/f/conversation

Chatting with the AI requires sending the previous context with every message.
Optimization Challenge#
Instead of fetching the full history from PostgreSQL for every message sent:
1. Store Context in a Redis List: Use RPUSH to append new user/assistant messages to chat:{id}:context.
2. Limit the Context Window: Use LTRIM to keep only the last 20 messages. This ensures we don't exceed the LLM's token limit and keeps Redis memory usage low.
```python
# Context caching with redis-py
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def add_message_to_context(conversation_id, message):
    key = f"chat:{conversation_id}:context"
    r.rpush(key, json.dumps(message))  # append the newest message
    r.ltrim(key, -20, -1)              # keep only the last 20 messages
    r.expire(key, 86400)               # expire the context after 24 h
```
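Before calling the LLM, the read side would pull the trimmed window back with LRANGE. A sketch under the same key layout; `get_context` is an illustrative name:

```python
import json


def get_context(redis_client, conversation_id):
    """Return the trimmed context window as a list of message dicts."""
    key = f"chat:{conversation_id}:context"
    # LRANGE 0 -1 returns the whole list, which LTRIM has capped at 20 entries
    raw = redis_client.lrange(key, 0, -1)
    return [json.loads(item) for item in raw]
```

If the key has expired (or never existed), this returns an empty list, and the caller can fall back to loading history from the database.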
3. Updated Folder Layout (FastAPI-Style)#
Add the redis_client.py and cache_service.py to your project structure.
```
app/
├── main.py
├── api/
│   ├── routes/
│   │   ├── conversation.py          # Modified to use cache_service
│   │   └── message.py
│   └── dependencies.py
├── core/
│   ├── config.py
│   ├── database.py
│   └── redis_client.py              # logic: redis.Redis(host=...)
├── models/
│   ├── user.py
│   ├── conversation.py
│   └── message.py
├── schemas/
│   ├── conversation.py
│   └── user.py
└── services/
    ├── ai_service.py
    ├── cache_service.py             # generic get/set/delete logic
    └── conversation_service.py      # Helper: get_conversation_with_cache()
```
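A minimal cache_service.py could wrap the generic get/set/delete logic like this. The exact class shape is an assumption; it accepts any client exposing get/setex/delete (e.g. the redis.Redis instance from redis_client.py), and handles JSON (de)serialization in one place:

```python
import json


class CacheService:
    """Thin JSON-aware wrapper over a Redis client (or any compatible stub)."""

    def __init__(self, client):
        # e.g. redis.Redis(host="localhost", port=6379, decode_responses=True)
        self.client = client

    def get(self, key):
        raw = self.client.get(key)
        return json.loads(raw) if raw is not None else None

    def set(self, key, value, ttl=600):
        # SETEX stores the value and the TTL atomically
        self.client.setex(key, ttl, json.dumps(value))

    def delete(self, key):
        self.client.delete(key)
```

Routes and services then depend only on this small interface, which keeps the Redis details in one module and makes the callers trivial to unit-test.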
4. Request Flow with Caching#
1. Frontend sends GET /conversation/{id}.
2. Backend calls CacheService.get(f"conv:{id}:history").
3. Found? Return immediately.
4. Not found? Fetch from the DB -> CacheService.set(..., ttl=600) -> Return.
5. Frontend sends POST /conversation (new message).
6. Backend saves the message to the DB.
7. Backend invalidates the cache: CacheService.delete(f"conv:{id}:history").
8. (Optional) Backend updates the Redis List context.
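The POST half of the flow (save, invalidate, update context) could be combined in one helper. A sketch only: `handle_new_message` and the `db_save` callable are illustrative names, and the key formats are the ones defined in sections 1.1 and 2:

```python
import json


def handle_new_message(redis_client, db_save, conversation_id, message):
    """POST flow: persist, invalidate stale history, refresh the context window."""
    db_save(conversation_id, message)                       # 1. write to the database
    redis_client.delete(f"conv:{conversation_id}:history")  # 2. invalidate stale history
    key = f"chat:{conversation_id}:context"
    redis_client.rpush(key, json.dumps(message))            # 3. append to context list
    redis_client.ltrim(key, -20, -1)                        #    keep the last 20 messages
    redis_client.expire(key, 86400)                         #    refresh the 24 h TTL
```

Ordering matters: the DB write happens first so that a cache miss triggered right after the DELETE always sees the new message.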