Continuing the AI build: implement the endpoints in the style of ChatGPT.

🌐 Base URL#

https://yourapp.com/backend-api/

βš™οΈ 1. Core Endpoints#

1.1 Generate Message (Send a Message)#

POST /backend-api/f/conversation

Used to send a message to the assistant and receive the generated response.

🧾 Request Body#

{
  "user_id": "user_123",
  "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
  "messages": [
    {
      "id": "msg_001", // If not specified, backend generates a new message ID
      "role": "user",
      "content": {
        "content_type": "text",
        "parts": ["Hello! Can you explain Redis caching?"]
      }
    }
  ],
  "metadata": {
    "temperature": 0.7
  }
}
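
The request body above can be validated with Pydantic schemas, as is typical in a FastAPI app. A minimal sketch (model and field names mirror the JSON example; types are assumptions):

```python
from typing import Optional

from pydantic import BaseModel


class MessageContent(BaseModel):
    content_type: str = "text"
    parts: list[str]


class Message(BaseModel):
    id: Optional[str] = None  # backend generates a new message ID if omitted
    role: str                 # "user" or "assistant"
    content: MessageContent


class ConversationRequest(BaseModel):
    user_id: str
    conversation_id: Optional[str] = None  # backend creates a new one if omitted
    messages: list[Message]
    metadata: dict = {}                    # e.g. {"temperature": 0.7}
```

FastAPI would accept this directly as the endpoint's body parameter and return a 422 on invalid input.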

βœ… Response (Server-Sent Events Stream)#

event: delta_encoding
data: "v1"
data: {"type": "resume_conversation_token", "token": "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9...", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
data: {"type": "input_message", "input_message": {"id": "msg_001", "author": {"role": "user"}, "create_time": 1762484546.479, "content": {"content_type": "text", "parts": ["Can you explain Redis caching?"]}, "status": "finished_successfully"}, "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
event: delta
data: {"o": "add", "v": {"message": {"id": "msg_002", "author": {"role": "assistant"}, "create_time": 1762484547.338, "update_time": 1762484547.817, "content": {"content_type": "text", "parts": [""]}, "status": "in_progress", "metadata": {"model_slug": "gpt-4-turbo-preview", "parent_id": "msg_001"}}, "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}}
data: {"type": "message_marker", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7", "message_id": "msg_002", "marker": "user_visible_token", "event": "first"}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": "Redis"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " caching"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " stores"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " frequently"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " used"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " data"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " in"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " memory"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " for"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " fast"}]}
event: delta
data: {"v": [{"p": "/message/content/parts/0", "o": "append", "v": " access."}]}
event: delta
data: {"v": [{"p": "/message/status", "o": "replace", "v": "finished_successfully"}, {"p": "/message/end_turn", "o": "replace", "v": true}, {"p": "/message/metadata", "o": "append", "v": {"is_complete": true}}]}
data: {"type": "message_stream_complete", "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7"}
data: [DONE]

πŸ“– SSE Stream Explanation#

| Event Type | Purpose | Why It Matters |
| --- | --- | --- |
| `event: delta_encoding` | Declares the streaming protocol version (`v1`) | Allows the frontend to handle different streaming formats |
| `resume_conversation_token` | JWT token to resume/reconnect to this conversation stream | Enables connection recovery if the stream is interrupted |
| `input_message` | Echoes back the user's message that was just sent | Confirms the message was received and stored in the DB |
| `event: delta` (add) | Creates a new assistant message with empty content and `status: "in_progress"` | Initializes the response container before streaming text |
| `message_marker` | Marks when the first visible token appears | Frontend can show a "typing" indicator until this arrives |
| `event: delta` (append) | Streams individual text chunks ("Redis", " caching", " stores"…) | Creates the real-time typing effect by appending words progressively |
| `event: delta` (replace status) | Updates message status to `"finished_successfully"` and sets `end_turn: true` | Signals completion so the frontend stops showing the typing indicator |
| `message_stream_complete` | Confirms the entire message stream is done | Final acknowledgment that no more deltas are coming |
| `[DONE]` | Standard SSE termination signal | Closes the event stream connection |

πŸ”‘ Key Concepts#

Delta Operations:

  • "o": "add" β†’ Create new object

  • "o": "append" β†’ Add text to existing content

  • "o": "replace" β†’ Update a field value

  • "p": "/message/content/parts/0" → JSON path to the field being updated

Why Streaming?

  1. User Experience: Shows progress immediately instead of waiting for full response

  2. Perceived Performance: Feels faster even if total time is the same

  3. Long Responses: User sees content as it generates (important for long answers)

  4. Interruptible: User can stop generation early if they got their answer
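
The delta operations above can be applied on the client with a small reducer. A minimal sketch (pure Python; `apply_delta` is a hypothetical helper name, and the path handling assumes the `/message/...` paths shown in the stream example):

```python
import copy


def apply_delta(message: dict, op: dict) -> dict:
    """Apply one {"p": path, "o": operation, "v": value} patch to a message dict."""
    msg = copy.deepcopy(message)
    # Paths like "/message/content/parts/0" are relative to the envelope;
    # strip the leading "/message/" since we patch the message object itself.
    keys = op["p"].removeprefix("/message/").split("/")
    target = msg
    for key in keys[:-1]:
        target = target[int(key)] if key.isdigit() else target[key]
    last = int(keys[-1]) if keys[-1].isdigit() else keys[-1]
    if op["o"] == "append" and isinstance(target[last], str):
        target[last] += op["v"]       # streamed text chunk
    elif op["o"] == "append" and isinstance(target[last], dict):
        target[last].update(op["v"])  # metadata merge
    else:                             # "replace" (or "add" for new fields)
        target[last] = op["v"]
    return msg
```

Applying the example deltas in order rebuilds the full assistant message, which is exactly what the frontend renderer does token by token.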

🧠 Notes#

  • If no conversation_id is provided β†’ backend creates a new one.

  • Each message (both user & assistant) is stored in the messages table.

  • The backend can stream responses for real-time UI.
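
On the server, the stream shown in the response example can be produced by a plain generator that formats SSE frames; a FastAPI endpoint would wrap it in `StreamingResponse(..., media_type="text/event-stream")`. A minimal sketch (the `tokens` iterable is a stand-in for the real LLM call):

```python
import json
from typing import Iterable, Iterator


def sse_stream(conversation_id: str, tokens: Iterable[str]) -> Iterator[str]:
    """Yield SSE frames in the delta format from the response example above."""
    yield 'event: delta_encoding\ndata: "v1"\n\n'
    for token in tokens:
        patch = {"v": [{"p": "/message/content/parts/0", "o": "append", "v": token}]}
        yield f"event: delta\ndata: {json.dumps(patch)}\n\n"
    done = {"type": "message_stream_complete", "conversation_id": conversation_id}
    yield f"data: {json.dumps(done)}\n\n"
    yield "data: [DONE]\n\n"
```

Because it is a generator, each frame is flushed to the client as soon as the model emits a token, which is what produces the typing effect.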


1.2 List Conversations (Conversation History)#

GET /backend-api/conversations?offset=0&limit=20&order=updated

Returns a paginated list of all conversations for a user.

βœ… Response#

{
  "items": [
    {
      "id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
      "title": "Redis Caching Explained",
      "last_message": "Redis caching stores frequently used data in memory...",
      "updated_at": "2025-11-07T09:30:00Z"
    },
    {
      "id": "732dd789-f512-7e01-b82e-13aa07b4e012",
      "title": "Python Decorators",
      "last_message": "A decorator wraps another function...",
      "updated_at": "2025-11-06T21:00:00Z"
    }
  ],
  "limit": 20,
  "offset": 0,
  "total": 2
}

🧠 Notes#

  • Query params (offset, limit, order, etc.) make it frontend-friendly.

  • Can be cached with Redis:

    • Key β†’ user:{user_id}:conversation_list

    • TTL β†’ 10 min


1.3 Get Conversation Detail#

GET /backend-api/conversation/{conversation_id}

Returns all messages inside a specific conversation.

βœ… Response#

{
  "conversation_id": "690d5b6c-02d8-8321-a91e-65ea55b781f7",
  "messages": [
    {
      "id": "msg_001",
      "role": "user",
      "content": {
        "content_type": "text",
        "parts": ["Hello! Can you explain Redis caching?"]
      },
      "created_at": "2025-11-07T09:20:00Z"
    },
    {
      "id": "msg_002",
      "role": "assistant",
      "content": {
        "content_type": "text",
        "parts": [
          "Redis caching stores frequently used data in memory for fast access."
        ]
      },
      "created_at": "2025-11-07T09:22:00Z"
    }
  ]
}

🧠 Notes#

  • Ideal to cache full history:

    • Key β†’ conv:{conversation_id}:history

    • TTL β†’ 10 minutes
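
Both read endpoints (1.2 and 1.3) fit the same cache-aside pattern. A minimal sketch, parameterized over any client exposing `get`/`setex` (so it works with `redis-py` or a test double; key formats follow the notes above):

```python
import json
from typing import Any, Callable


def cached_json(client: Any, key: str, ttl_seconds: int, loader: Callable[[], Any]) -> Any:
    """Return the cached JSON value for `key`, loading and caching it on a miss."""
    raw = client.get(key)
    if raw is not None:
        return json.loads(raw)   # cache hit: skip the DB entirely
    value = loader()             # cache miss: e.g. a DB query for the history
    client.setex(key, ttl_seconds, json.dumps(value))
    return value
```

Usage would look like `cached_json(redis, f"conv:{conversation_id}:history", 600, load_history)`; on writes (new messages), the key is simply deleted so the next read repopulates it.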


🧱 2. Database Schema (SQLAlchemy Example Structure)#

| Table | Purpose | Key Fields |
| --- | --- | --- |
| `users` | Basic user info | `id`, `name`, `email`, `created_at`, `updated_at` |
| `conversations` | Chat sessions | `id`, `user_id`, `title`, `created_at`, `updated_at` |
| `messages` | Chat content | `id`, `conversation_id`, `role`, `content`, `created_at`, `updated_at` |
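
The three tables can be sketched as SQLAlchemy declarative models. Column names come from the schema above; the types (string UUIDs, JSON-serialized content as text) are assumptions, not a definitive design:

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(String, primary_key=True)
    name = Column(String, nullable=False)
    email = Column(String, unique=True, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)


class Conversation(Base):
    __tablename__ = "conversations"
    id = Column(String, primary_key=True)  # e.g. a UUID string
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    title = Column(String)
    created_at = Column(DateTime, default=datetime.utcnow)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)


class Message(Base):
    __tablename__ = "messages"
    id = Column(String, primary_key=True)
    conversation_id = Column(String, ForeignKey("conversations.id"), nullable=False)
    role = Column(String, nullable=False)   # "user" or "assistant"
    content = Column(Text, nullable=False)  # JSON-serialized content payload
    created_at = Column(DateTime, default=datetime.utcnow)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
```

An index on `messages.conversation_id` (and on `conversations.user_id`) would be worth adding for the list and detail endpoints.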


⚑ 3. Redis Caching Strategy#

| Cache Key | Example | Description | TTL |
| --- | --- | --- | --- |
| `conv:{id}:history` | `conv:690d5b6c-02d8-8321-a91e-65ea55b781f7:history` | Cached message list | 10 min |
| `user:{id}:conversation_list` | `user:user_123:conversation_list` | Cached user conversation list | 10 min |


🧩 4. Folder Layout (FastAPI-Style)#

app/
β”œβ”€β”€ main.py
β”œβ”€β”€ api/
β”‚   β”œβ”€β”€ routes/
β”‚   β”‚   β”œβ”€β”€ conversation.py    # /backend-api/f/conversation, /backend-api/conversations
β”‚   β”‚   └── message.py         # Optional
β”‚   └── dependencies.py
β”œβ”€β”€ core/
β”‚   β”œβ”€β”€ config.py
β”‚   β”œβ”€β”€ database.py
β”‚   └── redis_client.py
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ user.py
β”‚   β”œβ”€β”€ conversation.py
β”‚   └── message.py
β”œβ”€β”€ schemas/
β”‚   β”œβ”€β”€ conversation.py
β”‚   β”œβ”€β”€ message.py
β”‚   └── user.py
└── services/
    β”œβ”€β”€ ai_service.py          # LLM generation logic
    β”œβ”€β”€ cache_service.py       # Redis operations
    └── conversation_service.py # CRUD logic

🧠 5. Request Flow Summary#

  1. Frontend sends a message via POST /backend-api/f/conversation → Backend validates input → fetches conversation history → calls the AI model → streams the response back over SSE.

  2. Frontend loads conversation list via GET /backend-api/conversations?... β†’ Fetch from Redis cache β†’ fallback to DB.

  3. Frontend loads conversation details via GET /backend-api/conversation/{conversation_id} β†’ Return full history (cached or DB).
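
On the consuming side of step 1, the SSE stream can be decoded with a small line parser. A minimal sketch (pure Python over an iterator of lines, assuming blank-line-separated frames per the SSE spec; a browser frontend would use `EventSource` or `fetch` instead):

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one {"event": ..., "data": ...} dict per SSE frame."""
    event, data = "message", []   # "message" is the SSE default event type
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:  # a blank line terminates the frame
            yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
```

The caller would then `json.loads` each frame's data (except the literal `[DONE]` sentinel) and feed delta frames into the message reducer.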