
[LangChain] Storing and Managing Conversation History

article 2025/3/6 21:40:12

0. Code demo

from langchain_community.chat_message_histories import SQLChatMessageHistory

def get_session_history(session_id):
    # Separate conversation histories by session_id and store them in a SQLite database
    return SQLChatMessageHistory(session_id, "sqlite:///memory.db")

from langchain_core.messages import HumanMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

runnable = model | StrOutputParser()

runnable_with_history = RunnableWithMessageHistory(
    runnable, # the chain to wrap
    get_session_history, # the custom history factory
)

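runnable_with_history.invoke(
    [HumanMessage(content="Hello, my name is Maiku")],
    config={"configurable": {"session_id": "wzr"}},
)

'Hello, Maiku! Nice to see you again. Is there anything you would like to talk about or need help with?'

runnable_with_history.invoke(
    [HumanMessage(content="Do you know my name?")],
    config={"configurable": {"session_id": "wzr"}},
)

'Yes, your name is Maiku. Is there anything I can help you with?'

runnable_with_history.invoke(
    [HumanMessage(content="Do you know my name?")],
    config={"configurable": {"session_id": "test"}},
)

'Sorry, I have no way of knowing your name. You can tell me your name, or if you have any other questions, I am happy to help!'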

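Code walkthrough

This code implements a conversational system with persistent history. It uses session_id to keep the conversation histories of different users apart and stores them in a SQLite database. The core modules are explained below.

1. Conversation history management module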

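from langchain_community.chat_message_histories import SQLChatMessageHistory

def get_session_history(session_id):
    # Each session_id maps to its own set of database rows
    return SQLChatMessageHistory(
        session_id=session_id,
        # Path to the SQLite database; newer langchain-community releases
        # prefer the parameter name `connection` for this argument
        connection_string="sqlite:///memory.db"
    )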

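Core role: creates an independent history store for each user or session.
Technical details:
- Conversation records are stored in a SQLite database (the file memory.db)
- session_id is an ordinary column used to filter rows, not the primary key; any stable identifier works (user ID, device ID, etc.)
- With the default message converter, the table (message_store) holds id (auto-increment primary key), session_id, and message (the message serialized as JSON); there is no timestamp column unless a custom converter adds one
Extensibility: other storage backends can be swapped in (e.g. PostgreSQL, Redis)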
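
To make the storage layout concrete, here is a minimal sketch that uses only the standard library. It is not how SQLChatMessageHistory is implemented; the helper names add_message and get_messages are hypothetical, and the table simply mirrors the id / session_id / message layout described above.

```python
import json
import sqlite3

# Hypothetical stand-in for the message table: id (primary key), session_id, message (JSON text).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_store ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, message TEXT)"
)

def add_message(session_id, role, content):
    """Append one message as a JSON row tagged with its session_id."""
    payload = json.dumps({"type": role, "content": content})
    conn.execute(
        "INSERT INTO message_store (session_id, message) VALUES (?, ?)",
        (session_id, payload),
    )

def get_messages(session_id):
    """Return the messages of one session in insertion order."""
    rows = conn.execute(
        "SELECT message FROM message_store WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [json.loads(row[0]) for row in rows]

add_message("wzr", "human", "Hello, my name is Maiku")
add_message("wzr", "ai", "Hello, Maiku!")
add_message("test", "human", "Do you know my name?")

print(len(get_messages("wzr")))   # 2
print(len(get_messages("test")))  # 1
```

Because every query filters on session_id, the "test" session never sees the rows written for "wzr", which is the isolation the demo relies on.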

2. Conversation chain construction module

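from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
runnable = model | StrOutputParser()  # base question-answering chain

runnable_with_history = RunnableWithMessageHistory(
    runnable=runnable,                        # the original chain
    get_session_history=get_session_history,  # history factory
    # input_messages_key and history_messages_key both default to None.
    # Leave them unset here: this chain takes a list of messages, not a dict.
)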

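Component wiring: model | StrOutputParser() is the base chain; RunnableWithMessageHistory wraps it and loads and saves history around each call.
Key parameters (both default to None and are only needed when the chain takes a dict as input):
- input_messages_key: the key in the input dict that holds the new message
- history_messages_key: the key under which the loaded history is injected, typically matching a MessagesPlaceholder in the prompt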

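3. Invocation example

response = runnable_with_history.invoke(
    [HumanMessage(content="Hello, my name is Maiku")], # current message
    config={"configurable": {"session_id": "wzr"}} # which session to use
)

Execution flow:
1. Load the stored messages for session_id="wzr" from the database
2. Prepend that history to the current message "Hello, my name is Maiku" (nothing is written to the database yet)
3. Send history + current input to the model (gpt-4o-mini)
4. Parse the model output and return the final response
5. Save the new pair (user input + model reply) to the database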
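
The load, call, save cycle above can be sketched without any LangChain dependency. This is an illustration of the control flow only: fake_model, store, and invoke_with_history are hypothetical names, and the in-memory dict stands in for the database.

```python
from collections import defaultdict

# In-memory stand-in for the database: session_id -> list of (role, content) tuples.
store = defaultdict(list)

def fake_model(messages):
    """Hypothetical model: reports how many messages it was given."""
    return f"I received {len(messages)} message(s)."

def invoke_with_history(user_input, session_id):
    history = list(store[session_id])                # 1. load history
    prompt = history + [("human", user_input)]       # 2. history + current input
    reply = fake_model(prompt)                       # 3-4. call the model, get text
    store[session_id].append(("human", user_input))  # 5. persist the new pair
    store[session_id].append(("ai", reply))
    return reply

print(invoke_with_history("Hello, my name is Maiku", "wzr"))  # I received 1 message(s).
print(invoke_with_history("Do you know my name?", "wzr"))     # I received 3 message(s).
print(invoke_with_history("Do you know my name?", "test"))    # I received 1 message(s).
```

The second call sees three messages because the first exchange was saved only after the first call returned; the "test" session starts from an empty history.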

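4. Database example

Suppose three consecutive calls are made:

Call	User input	Database contents (session_id="wzr")
First	"Hello, my name is Maiku"	[Human: Hello, my name is Maiku, AI: reply 1]
Second	"Do you remember my name?"	appends [Human: Do you remember my name?, AI: reply 2]
Third	"Who am I?"	appends [Human: Who am I?, AI: reply 3]

On the third call, the context the model receives includes the first two exchanges, so it can answer with the correct name.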

5. Key technical points

5.1 History injection

from langchain_core.messages import AIMessage, HumanMessage

# Illustration of what is actually sent to the model on the third call
full_context = [
    HumanMessage(content="Hello, my name is Maiku"),
    AIMessage(content="reply 1"),
    HumanMessage(content="Do you remember my name?"),
    AIMessage(content="reply 2"),
    HumanMessage(content="Who am I?"),
]
response = model.invoke(full_context)

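5.2 Automatic history management

Automatic append: every call adds the new messages to the history
Context truncation: once the history outgrows the model's context window it has to be trimmed (not shown in this example)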
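
One possible way to handle truncation is to drop the oldest messages until the remainder fits a budget. The sketch below is an assumption-laden illustration: trim_history is a hypothetical helper, and character count is used as a rough stand-in for tokens, which a real system would count with the model's tokenizer.

```python
def trim_history(messages, max_chars):
    """Drop the oldest messages until total content length fits within max_chars.

    Character count is a rough stand-in for tokens; the newest message is always kept.
    """
    kept = []
    total = 0
    # Walk from newest to oldest so the most recent turns survive.
    for role, content in reversed(messages):
        if kept and total + len(content) > max_chars:
            break
        kept.append((role, content))
        total += len(content)
    return list(reversed(kept))

history = [
    ("human", "a" * 50),
    ("ai", "b" * 50),
    ("human", "c" * 50),
    ("ai", "d" * 50),
    ("human", "e" * 10),
]
trimmed = trim_history(history, max_chars=120)
print([role for role, _ in trimmed])  # ['human', 'ai', 'human']
```

Trimming this way changes only what is sent to the model; whether the stored history is also pruned is a separate decision.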

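6. Optimization suggestions

6.1 Limiting history length

from langchain_core.runnables import RunnableLambda

# Keep only the last 3 rounds (3 human/AI pairs) plus the current input.
# This limits what the model sees; the database still stores everything.
def keep_last_rounds(messages, k=3):
    return messages[-(2 * k + 1):]

trimmed = RunnableLambda(keep_last_rounds) | model | StrOutputParser()
runnable_with_history = RunnableWithMessageHistory(trimmed, get_session_history)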

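6.2 Custom history format

def custom_history_formatter(history):
    return "\n".join(f"{msg.type}: {msg.content}" for msg in history)

# RunnableWithMessageHistory has no formatter hook, so apply the
# formatter directly to the stored messages of a session
print(custom_history_formatter(get_session_history("wzr").messages))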

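6.3 Multimodal history

from langchain_core.messages import HumanMessage

# langchain_core has no ImageMessage class; an image travels as a content block
history = get_session_history("wzr")
history.add_message(HumanMessage(content=[
    {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},  # placeholder URL
]))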

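7. Troubleshooting

Symptom	Likely cause	Fix
Nothing is written to the database	File permission problem	Check that `memory.db` is writable
History has no effect	Mismatched session_id	Make sure every call uses the same session_id
Responses keep getting slower	Too much untrimmed history	Trim the history to a recent window or summarize older turns
Chinese text looks garbled in the database	Messages are stored as JSON, which may escape non-ASCII characters as \uXXXX	Read messages back through the history object rather than the raw table; SQLite stores text as UTF-8 and has no `charset` URL parameter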
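
On the encoding row: one plausible explanation for Chinese text that looks garbled in the database file is JSON escaping rather than corruption. The sketch below assumes messages are serialized with json.dumps at its default settings; it shows that the escaped form decodes back to the original text.

```python
import json

message = {"type": "human", "content": "你好，我叫麦酷"}

escaped = json.dumps(message)                       # default ensure_ascii=True
readable = json.dumps(message, ensure_ascii=False)  # keeps the original characters

print(escaped)   # content appears as \uXXXX escape sequences
print(readable)  # content appears as the original characters
print(json.loads(escaped) == json.loads(readable))  # True
```

Both strings decode to the same message, so escaped rows in the table are readable once they pass through a JSON parser.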

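8. Typical use cases

Customer service

# Keep the conversation continuous per customer phone number
session_id = user.phone_number

Educational bot

# Keep a separate history for each student and course
session_id = f"{student_id}-{course_id}"

Multi-device sync

# Share one history across devices through the user account
session_id = user.account_id

This code shows how to quickly build a conversational system with persistent memory. A small interface takes care of the state management, which makes it a useful starting point for conversational applications.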

With LCEL you can also:

Configure runtime variables: https://python.langchain.com/v0.2/docs/how_to/configure/
Add fallbacks: https://python.langchain.com/v0.2/docs/how_to/fallbacks
Invoke in parallel: https://python.langchain.com/v0.2/docs/how_to/parallel/
Route between branches: https://python.langchain.com/v0.2/docs/how_to/routing/
Create chains dynamically: https://python.langchain.com/v0.2/docs/how_to/dynamic_chain/

More examples: https://python.langchain.com/v0.2/docs/how_to/lcel_cheatsheet/
