# Caching multi-turn conversations

In applications with multi-turn conversations, the same user message can mean different things depending on context. For example, "Tell me more" in a conversation about Valkey means something different from "Tell me more" in a conversation about Python.

## The challenge

Single-prompt caching works for stateless queries. In a multi-turn conversation, the cache lookup must be based on the full conversation context, not just the last message:

```
# "Tell me more" means nothing without context
# Conversation A: "What is Valkey?" -> "Tell me more" (about Valkey)
# Conversation B: "What is Python?" -> "Tell me more" (about Python)
```

## Strategy: context-aware cache keys

Embed a summary of the full conversation context instead of only the last user message. That way, similar follow-up questions in similar conversation flows can reuse cached answers.

```
def build_context_string(messages: list) -> str:
    """Build a cacheable context string from conversation messages."""
    # Use last 3 turns (6 messages: user + assistant pairs)
    recent = messages[-6:]
    parts = []
    for msg in recent:
        role = msg["role"]
        content = msg["content"][:200]  # Truncate long messages
        parts.append(f"{role}: {content}")
    return " | ".join(parts)
```

## Per-user cache isolation with TAG filters

Use TAG fields to isolate cached conversations by user, session, or another dimension. This prevents a conversation cached for one user from being returned to another user:

```
# Create index with TAG field for per-user isolation
# (valkey_client is assumed to be an already connected Valkey client)
valkey_client.execute_command(
    "FT.CREATE", "conv_cache_idx",
    "SCHEMA",
    "context_summary", "TEXT",
    "response", "TEXT",
    "user_id", "TAG",
    "turn_count", "NUMERIC",
    "embedding", "VECTOR", "HNSW", "6",
    "TYPE", "FLOAT32",
    "DIM", "1024",
    "DISTANCE_METRIC", "COSINE",
)
```

Search with a hybrid filter (TAG + KNN):

```
def _text(value):
    """Decode bytes replies; clients without decode_responses return bytes."""
    return value.decode("utf-8", "replace") if isinstance(value, bytes) else value


def lookup_conversation_cache(messages: list, user_id: str, threshold: float = 0.12):
    """Search cache for similar conversation contexts, scoped to a user.

    Note: FT.SEARCH with COSINE distance returns a distance score where
    0 = identical and 2 = opposite. A lower score means higher similarity.
    The threshold here is a maximum distance: only return results closer
    than this value.
    """
    context = build_context_string(messages)
    # get_embedding is assumed to return the vector as FLOAT32 bytes (DIM 1024)
    query_vec = get_embedding(context)

    # Hybrid search: filter by user_id TAG + KNN on context embedding
    results = valkey_client.execute_command(
        "FT.SEARCH", "conv_cache_idx",
        f"@user_id:{{{user_id}}}=>[KNN 1 @embedding $query_vec]",
        "PARAMS", "2", "query_vec", query_vec,
        "DIALECT", "2",
    )

    # Reply layout: [total, key_1, [field, value, ...], ...]
    if results[0] > 0:
        fields = results[2]
        field_dict = {_text(fields[j]): fields[j + 1] for j in range(0, len(fields), 2)}
        distance = float(_text(field_dict.get("__embedding_score", "999")))
        if distance < threshold:  # Lower distance = more similar
            return {"hit": True, "response": _text(field_dict.get("response", "")), "distance": distance}

    return {"hit": False}
```

**Note** The TAG filter `@user_id:{user_123}` ensures that user A's cached conversations do not leak to user B. The hybrid query (TAG + KNN) runs as a single atomic operation: it pre-filters by user, then finds the nearest conversation context.

## Cache isolation strategies

| Strategy | TAG filter | Best for |
| --- | --- | --- |
| Per user | @user\_id:{user\_123} | Personalized assistants |
| Per session | @session\_id:{sess\_abc} | Short-lived chats |
| Global (shared) | No filter (\*) | FAQ bots, common queries |
| Per model | @model:{gpt-4} | Multi-model deployments |
| Per product | @product\_id:{prod\_456} | E-commerce assistants |