Continuing to build the AI part: implement the endpoints following ChatGPT's API style.
π Base URL#
https://yourapp.com/backend-api/
βοΈ 1. Core Endpoints#
1.1 Generate Message (Send a Message)#
POST
/backend-api/f/conversation

Used to send a message to the assistant and receive the generated response.
π§Ύ Request Body#
{
"user_id": "user_123",
"conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
"messages": [
{
"id": "msg_001", // If not specified, backend generates a new message ID
"role": "user",
"content": {
"content_type": "text",
"parts": ["Hello! Can you explain Redis caching?"]
}
}
],
"metadata": {
"temperature": 0.7
}
}
β Response (Server-Sent Events Stream)#
event: delta_encoding
data: "v1"
data: {"type": "resume_conversation_token", "token": "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9...", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
data: {"type": "input_message", "input_message": {"id": "msg_001", "author": {"role": "user"}, "create_time": 1762484546.479, "content": {"content_type": "text", "parts": ["Can you explain Redis caching?"]}, "status": "finished_successfully"}, "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
event: delta
data: {"o": "add", "v": {"message": {"id": "msg_002", "author": {"role": "assistant"}, "create_time": 1762484547.338, "update_time": 1762484547.817, "content": {"content_type": "text", "parts": [""]}, "status": "in_progress", "metadata": {"model_slug": "gpt-4-turbo-preview", "parent_id": "msg_001"}}, "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}}
data: {"type": "message_marker", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7", "message_id": "msg_002", "marker": "user_visible_token", "event": "first"}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": "Redis"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " caching"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " stores"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " frequently"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " used"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " data"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " in"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " memory"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " for"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " fast"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " access."}]}
event: delta
data: {"v": [{"p": "/message/status", "o": "replace", "v": "finished_successfully"}, {"p": "/message/end_turn", "o": "replace", "v": true}, {"p": "/message/metadata", "o": "append", "v": {"is_complete": true}}]}
data: {"type": "message_stream_complete", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
data: [DONE]
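On the client side, reassembling the reply only requires collecting the `append` operations that target the first content part and stopping at the `[DONE]` sentinel. A minimal sketch in Python, assuming the raw SSE lines are already available as an iterable (no HTTP client shown):

```python
import json

def assemble_reply(sse_lines):
    """Rebuild the assistant's text from raw SSE 'data:' lines.

    Collects every {"o": "append"} delta that targets the first
    content part and stops at the [DONE] sentinel.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip 'event:' lines and blank keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        try:
            data = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore anything that is not JSON
        # Delta frames look like {"v": [{"p": ..., "o": "append", "v": "Redis"}]}
        if isinstance(data, dict) and isinstance(data.get("v"), list):
            for op in data["v"]:
                if op.get("o") == "append" and op.get("p") == "/message/content/parts/0":
                    text.append(op["v"])
    return "".join(text)
```

Frames like `delta_encoding` (`"v1"` is a bare JSON string, not an object) and the metadata events fall through the `isinstance` checks, so only text chunks are accumulated.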
π SSE Stream Explanation#
| Event Type | Purpose | Why It Matters |
|---|---|---|
| `delta_encoding` | Declares the streaming protocol version (`v1`) | Allows frontend to handle different streaming formats |
| `resume_conversation_token` | JWT token to resume/reconnect to this conversation stream | Enables connection recovery if stream is interrupted |
| `input_message` | Echoes back the user's message that was just sent | Confirms message received and stored in DB |
| `delta` (`"o": "add"`) | Creates a new assistant message with empty content and `in_progress` status | Initializes the response container before streaming text |
| `message_marker` | Marks when the first visible token appears | Frontend can show a "typing" indicator until this arrives |
| `delta` (`"o": "append"`) | Streams individual text chunks ("Redis", " caching", " stores"...) | Creates the real-time typing effect by appending words progressively |
| `delta` (`"o": "replace"`) | Updates message status to `finished_successfully` | Signals completion so frontend stops showing the typing indicator |
| `message_stream_complete` | Confirms the entire message stream is done | Final acknowledgment that no more deltas are coming |
| `[DONE]` | Standard SSE termination signal | Closes the event stream connection |
π Key Concepts#
Delta Operations:
"o": "add"β Create new object"o": "append"β Add text to existing content"o": "replace"β Update a field value"p": "/message/content/parts/0"β JSON path to the field being updated Why Streaming?
User Experience: Shows progress immediately instead of waiting for full response
Perceived Performance: Feels faster even if total time is the same
Long Responses: User sees content as it generates (important for long answers)
Interruptible: User can stop generation early if they got their answer
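The three delta operations can be applied with a small JSON-pointer-style helper. A minimal sketch, assuming paths are simple `/a/b/0` pointers as in the stream above:

```python
def apply_delta(doc, path, op, value):
    """Apply one delta operation at a JSON-pointer-like path.

    Supports the three operations used by the stream:
    add (set a new value), append (extend a string or merge a dict),
    replace (overwrite the existing value).
    """
    parts = [p for p in path.split("/") if p]
    target = doc
    for key in parts[:-1]:
        # List indices arrive as strings like "0" in the pointer.
        target = target[int(key)] if isinstance(target, list) else target[key]
    leaf = parts[-1]
    if isinstance(target, list):
        leaf = int(leaf)
    if op == "add":
        target[leaf] = value
    elif op == "append":
        if isinstance(target[leaf], str):
            target[leaf] += value
        else:  # dicts are merged, matching the /message/metadata append
            target[leaf].update(value)
    elif op == "replace":
        target[leaf] = value
    return doc
```

Applying the stream's deltas in order to the empty assistant message from the `add` frame reproduces the finished message the frontend renders.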
π§ Notes#
If no `conversation_id` is provided → the backend creates a new one.
Each message (both user & assistant) is stored in the messages table.
The backend can stream responses for real-time UI.
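The server side of the stream can be sketched as a generator that emits the frames shown above. A minimal sketch in pure Python, where `token_stream` is an assumption standing in for the real LLM client:

```python
import json

def sse_frames(conversation_id, token_stream):
    """Yield SSE-formatted frames for one assistant reply.

    token_stream is any iterable of text chunks from the model;
    here it is a stand-in for the real LLM streaming client.
    """
    path = "/message/content/parts/0"
    for token in token_stream:
        delta = {"v": [{"p": path, "o": "append", "v": token}]}
        yield "event: delta\n"
        yield f"data: {json.dumps(delta)}\n\n"
    done = {"type": "message_stream_complete", "conversation_id": conversation_id}
    yield f"data: {json.dumps(done)}\n\n"
    yield "data: [DONE]\n\n"
```

In FastAPI this generator would typically be wrapped in `StreamingResponse(..., media_type="text/event-stream")` inside the route handler.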
1.2 List Conversations (Conversation History)#
GET
/backend-api/conversations?offset=0&limit=20&order=updated

Returns a paginated list of all conversations for a user.
β Response#
{
"items": [
{
"id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
"title": "Redis Caching Explained",
"last_message": "Redis caching stores frequently used data in memory...",
"updated_at": "2025-11-07T09:30:00Z"
},
{
"id": "732dd789-f512-7e01-b82e-13aa07b4e012",
"title": "Python Decorators",
"last_message": "A decorator wraps another function...",
"updated_at": "2025-11-06T21:00:00Z"
}
],
"limit": 20,
"offset": 0,
"total": 2
}
π§ Notes#
Query params (`offset`, `limit`, `order`, etc.) make it frontend-friendly.
Can be cached with Redis:
Key → `user:{user_id}:conversation_list`
TTL → 10 min
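The caching note above is the classic read-through pattern: check Redis first, fall back to the DB on a miss, then populate the cache. A minimal sketch with redis-py-style calls, using a dict-backed stub so it stays self-contained (real code would pass a `redis.Redis` instance):

```python
import json

CACHE_TTL = 600  # 10 minutes, per the note above

class FakeRedis:
    """In-memory stand-in for redis-py so the sketch is self-contained."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def setex(self, key, ttl, value):
        self.store[key] = value

def get_conversation_list(r, user_id, load_from_db):
    """Read-through cache: Redis first, DB on a miss."""
    key = f"user:{user_id}:conversation_list"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    items = load_from_db(user_id)               # fall back to the DB
    r.setex(key, CACHE_TTL, json.dumps(items))  # cache for the next request
    return items
```

`load_from_db` is a hypothetical callable standing in for the SQLAlchemy query in `conversation_service.py`.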
1.3 Get Conversation Detail#
GET
/backend-api/conversation/{conversation_id}

Returns all messages inside a specific conversation.
β Response#
{
"conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
"messages": [
{
"id": "msg_001",
"role": "user",
"content": {
"content_type": "text",
"parts": ["Hello! Can you explain Redis caching?"]
},
"created_at": "2025-11-07T09:20:00Z"
},
{
"id": "msg_002",
"role": "assistant",
"content": {
"content_type": "text",
"parts": [
"Redis caching stores frequently used data in memory for fast access."
]
},
"created_at": "2025-11-07T09:22:00Z"
}
]
}
π§ Notes#
Ideal to cache full history:
Key → `conv:{conversation_id}:history`
TTL → 10 minutes
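The cached history goes stale the moment a new message lands, so the write path should evict both keys it affects. A sketch of that invalidation, again with a dict-backed stub in place of redis-py's `delete`:

```python
class DictRedis:
    """Minimal stand-in for redis-py's set/delete."""
    def __init__(self):
        self.store = {}
    def set(self, key, value):
        self.store[key] = value
    def delete(self, *keys):
        for k in keys:
            self.store.pop(k, None)

def invalidate_after_message(r, user_id, conversation_id):
    """Drop cache entries that a new message makes stale.

    Both the per-conversation history and the user's conversation
    list (its last_message / updated_at change) must be evicted.
    """
    r.delete(f"conv:{conversation_id}:history")
    r.delete(f"user:{user_id}:conversation_list")
```

Deleting rather than updating keeps the cache logic simple; the next read repopulates both keys via the read-through path.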
π§± 2. Database Schema (SQLAlchemy Example Structure)#
| Table | Purpose | Key Fields |
|---|---|---|
| users | Basic user info | `user_id` |
| conversations | Chat sessions | `id`, `title`, `updated_at` |
| messages | Chat content | `id`, `conversation_id`, `role`, `content`, `created_at` |
β‘ 3. Redis Caching Strategy#
| Cache Key | Example | Description | TTL |
|---|---|---|---|
| `conv:{conversation_id}:history` | `conv:690d5b6c-02d8-8321-a91e-65ea55b781f7:history` | Cached message list | 10 min |
| `user:{user_id}:conversation_list` | `user:user_123:conversation_list` | Cached user conversation list | 10 min |
π§© 4. Folder Layout (FastAPI-Style)#
app/
βββ main.py
βββ api/
β βββ routes/
β β βββ conversation.py # /backend-api/f/conversation, /backend-api/conversations
β β βββ message.py # Optional
β βββ dependencies.py
βββ core/
β βββ config.py
β βββ database.py
β βββ redis_client.py
βββ models/
β βββ user.py
β βββ conversation.py
β βββ message.py
βββ schemas/
β βββ conversation.py
β βββ message.py
β βββ user.py
βββ services/
βββ ai_service.py # LLM generation logic
βββ cache_service.py # Redis operations
βββ conversation_service.py # CRUD logic
π§ 5. Request Flow Summary#
1. Frontend sends a message via `POST /backend-api/f/conversation` → backend validates input → fetches the conversation → calls the AI model → returns the response.
2. Frontend loads the conversation list via `GET /backend-api/conversations?...` → fetch from Redis cache → fall back to the DB.
3. Frontend loads conversation details via `GET /backend-api/conversation/{conversation_id}` → return the full history (cached or from the DB).
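Step 1 above can be sketched as one orchestration function. All collaborator names here (`db`, `ai_model`, `get_history`, `save_message`, `generate`) are hypothetical interfaces standing in for the `services/` layer; only the control flow mirrors the summary:

```python
import uuid

def handle_send_message(payload, db, ai_model):
    """Orchestrate step 1: validate -> fetch -> generate -> store."""
    if not payload.get("messages"):
        raise ValueError("messages is required")
    # A missing conversation_id means a brand-new conversation.
    conv_id = payload.get("conversation_id") or str(uuid.uuid4())
    history = db.get_history(conv_id)
    user_msg = payload["messages"][-1]
    db.save_message(conv_id, user_msg)        # persist the user turn
    reply_text = ai_model.generate(history + [user_msg])
    reply = {"role": "assistant",
             "content": {"content_type": "text", "parts": [reply_text]}}
    db.save_message(conv_id, reply)           # persist the assistant turn
    return {"conversation_id": conv_id, "message": reply}
```

Both turns are saved, matching the note that every user and assistant message lands in the messages table.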