Study Notes on "LangChain for LLM Application Development" (Part 5)
Preface
These are my study notes for the video course "LangChain for LLM Application Development" by Harrison Chase (creator of LangChain) and Andrew Ng. Since the original course is an English-language video that is slow to access from China, I have reorganized and replaced parts of its content to make local study easier. Reading this article lets you work through the course material quickly.
Course Overview
This course introduces LangChain, a powerful and easily extensible open-source framework for building large language model (LLM) applications; it simplifies LLM application development with prompts, memory, chains, agents, and more. Because LangChain is still evolving quickly and parts of its API are unstable, some of the course's code is out of date. I use the latest v0.2 release throughout, and all code here runs under v0.2. Also, the course uses OpenAI, which is hard to access from China, so I have swapped in the domestic Kimi model and a self-hosted open-source Ollama deployment; this does not affect the learning experience.
See this article for how to obtain a Kimi API token.
See this article for how to deploy your own model with Ollama.
The course is divided into five parts:
- Part 1
- Part 2
- Part 3
- Part 4
- Part 5
Course link
Part 5
Agents
LLMs are usually thought of as being good at answering questions, but they are also good at serving as reasoning engines. You can give one a set of background information, and it can learn from it to answer a user's question or decide on the next action. LangChain's agent framework helps you do exactly this. An agent can be understood as an intelligent proxy: it takes the user's request (user input), analyzes the current situation (background data), chooses the most suitable tool from its toolbox to carry out an operation, and finally returns the result to the user.
Reference: ReAct: Synergizing Reasoning and Acting in Language Models
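The idea behind ReAct can be sketched as a simple loop: the model alternates between reasoning, acting (calling a tool), and observing the tool's result, until it can produce a final answer. Below is a minimal, illustrative sketch in plain Python, not LangChain's actual implementation; `fake_model`, `list_directory`, and `run_agent` are all made-up names, with a scripted function standing in for a real LLM.

```python
# Illustrative sketch of a ReAct-style loop with a scripted "model".

def list_directory():
    """Toy tool: return a fixed directory listing instead of reading disk."""
    return "l7.py\npyproject.toml\nl1.py\ndata.csv"

TOOLS = {"list_directory": list_directory}

def fake_model(history):
    """Scripted decision logic standing in for the LLM's reasoning step."""
    if not any(line.startswith("Observation:") for line in history):
        # No tool result yet -> act: request the directory listing.
        return {"action": "list_directory", "args": {}}
    # A tool result is available -> answer based on the observation.
    listing = history[-1][len("Observation: "):].split("\n")
    py_files = sorted(f for f in listing if f.endswith(".py"))
    return {"answer": "Python files: " + ", ".join(py_files)}

def run_agent(question, max_steps=5):
    history = ["Question: " + question]
    for _ in range(max_steps):
        step = fake_model(history)            # reason about the next move
        if "answer" in step:                  # the model decided it is done
            return step["answer"]
        result = TOOLS[step["action"]](**step["args"])  # act: run the tool
        history.append("Observation: " + result)        # record the result
    raise RuntimeError("agent did not finish in time")

print(run_agent("Find the Python files in the current folder"))
# → Python files: l1.py, l7.py
```

A real agent differs mainly in that the LLM itself decides which tool to call and when to stop; the loop structure is the same.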
Built-in tools
Next, let's build an agent example using LangChain's built-in tools.
First, import the dependencies.
Install langgraph first: pip install langgraph
from langchain_openai import ChatOpenAI
from langchain_community.tools.file_management import ListDirectoryTool
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent
Now we use the directory-listing tool to find specific files under the current directory.
# Use the Kimi model
base_url = 'https://api.moonshot.cn/v1'
api_key = 'sk-xxx' # replace with your own key
llm_model = 'moonshot-v1-8k'
llm = ChatOpenAI(temperature=0.3, model=llm_model, base_url=base_url, api_key=api_key)
# Create the tool
list_dir = ListDirectoryTool()
tools = [list_dir]
# Create the agent
agent_executor = create_react_agent(llm, tools)
# Invoke the agent
response = agent_executor.invoke({"messages": [HumanMessage(content="请找出当前文件夹下的 Python 文件")]})
# Print the final result
print(response['messages'][-1])
You may get output similar to this (the files in your directory will differ, so the result will too).
content='当前文件夹下包含以下 Python 文件:\n\n- l1.py\n- l4.py\n- l5.py\n- l7.py\n\n这些文件可以用于进一步的分析或操作。' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 39, 'prompt_tokens': 126, 'total_tokens': 165}, 'model_name': 'moonshot-v1-8k', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-466a9016-7055-4948-b286-8311c2667674-0' usage_metadata={'input_tokens': 126, 'output_tokens': 39, 'total_tokens': 165}
The content field above is the result we want. My current directory does indeed contain only these Python files.
The current folder contains the following Python files:
- l1.py
- l4.py
- l5.py
- l7.py
These files can be used for further analysis or operations.
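If you want just the text rather than the whole message object, read the final message's content attribute. The sketch below mimics the shape of `response` with a hypothetical stand-in class; the real objects in the messages list are LangChain message types (HumanMessage, AIMessage, etc.) that expose the same attribute.

```python
# Hypothetical stand-in for the value returned by agent_executor.invoke(...):
# a dict with a "messages" list whose items each carry a .content attribute.
class Msg:
    def __init__(self, content):
        self.content = content

response = {"messages": [Msg("user question"), Msg("tool output"),
                         Msg("final answer text")]}

# Take the last message and read only its text, skipping the metadata.
final_text = response["messages"][-1].content
print(final_text)  # → final answer text
```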
We can turn on debug mode to see exactly what the agent did.
import langchain
langchain.debug = True
The output looks like this.
[chain/start] [chain:LangGraph] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:__start__] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:__start__] [0ms] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > chain:StateModifier] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > chain:StateModifier] [1ms] Exiting Chain run with output:
[outputs]
[llm/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: 请找出当前文件夹下的 Python 文件"
]
}
[llm/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > llm:ChatOpenAI] [1.07s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "",
"generation_info": {
"finish_reason": "tool_calls",
"logprobs": null
},
"type": "ChatGeneration",
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "",
"additional_kwargs": {
"tool_calls": [
{
"id": "list_directory:0",
"function": {
"arguments": "{}",
"name": "list_directory"
},
"type": "function",
"index": 0
}
],
"refusal": null
},
"response_metadata": {
"token_usage": {
"completion_tokens": 11,
"prompt_tokens": 65,
"total_tokens": 76
},
"model_name": "moonshot-v1-8k",
"system_fingerprint": null,
"finish_reason": "tool_calls",
"logprobs": null
},
"type": "ai",
"id": "run-f5ea10c4-825c-4d35-b16a-8ad2d693f391-0",
"tool_calls": [
{
"name": "list_directory",
"args": {},
"id": "list_directory:0",
"type": "tool_call"
}
],
"usage_metadata": {
"input_tokens": 65,
"output_tokens": 11,
"total_tokens": 76
},
"invalid_tool_calls": []
}
}
}
]
],
"llm_output": {
"token_usage": {
"completion_tokens": 11,
"prompt_tokens": 65,
"total_tokens": 76
},
"model_name": "moonshot-v1-8k",
"system_fingerprint": null
},
"run": null
}
[chain/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence] [1.07s] Exiting Chain run with output:
[outputs]
[chain/end] [chain:LangGraph > chain:agent > chain:call_model] [1.08s] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent > chain:ChannelWrite<agent,messages>] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:ChannelWrite<agent,messages>] [0ms] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent > chain:should_continue] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:should_continue] [0ms] Exiting Chain run with output:
{
"output": "continue"
}
[chain/end] [chain:LangGraph > chain:agent] [1.08s] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:tools] Entering Chain run with input:
[inputs]
[tool/start] [chain:LangGraph > chain:tools > tool:list_directory] Entering Tool run with input:
"{}"
[tool/end] [chain:LangGraph > chain:tools > tool:list_directory] [1ms] Exiting Tool run with output:
"content='l7.py\npyproject.toml\ndata.csv\nl1.py\npoetry.lock\nl5.py\nl4.py\nproduct.csv\n.idea' name='list_directory' tool_call_id='list_directory:0'"
[chain/start] [chain:LangGraph > chain:tools > chain:ChannelWrite<tools,messages>] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:tools > chain:ChannelWrite<tools,messages>] [0ms] Exiting Chain run with output:
[outputs]
[chain/end] [chain:LangGraph > chain:tools] [2ms] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence] Entering Chain run with input:
[inputs]
[chain/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > chain:StateModifier] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > chain:StateModifier] [0ms] Exiting Chain run with output:
[outputs]
[llm/start] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: 请找出当前文件夹下的 Python 文件\nAI: \nTool: l7.py\npyproject.toml\ndata.csv\nl1.py\npoetry.lock\nl5.py\nl4.py\nproduct.csv\n.idea"
]
}
[llm/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence > llm:ChatOpenAI] [2.70s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "当前文件夹下包含以下 Python 文件:\n\n- l1.py\n- l4.py\n- l5.py\n- l7.py\n\n这些文件可以用于进一步的分析或操作。",
"generation_info": {
"finish_reason": "stop",
"logprobs": null
},
"type": "ChatGeneration",
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "当前文件夹下包含以下 Python 文件:\n\n- l1.py\n- l4.py\n- l5.py\n- l7.py\n\n这些文件可以用于进一步的分析或操作。",
"additional_kwargs": {
"refusal": null
},
"response_metadata": {
"token_usage": {
"completion_tokens": 39,
"prompt_tokens": 126,
"total_tokens": 165
},
"model_name": "moonshot-v1-8k",
"system_fingerprint": null,
"finish_reason": "stop",
"logprobs": null
},
"type": "ai",
"id": "run-466a9016-7055-4948-b286-8311c2667674-0",
"usage_metadata": {
"input_tokens": 126,
"output_tokens": 39,
"total_tokens": 165
},
"tool_calls": [],
"invalid_tool_calls": []
}
}
}
]
],
"llm_output": {
"token_usage": {
"completion_tokens": 39,
"prompt_tokens": 126,
"total_tokens": 165
},
"model_name": "moonshot-v1-8k",
"system_fingerprint": null
},
"run": null
}
[chain/end] [chain:LangGraph > chain:agent > chain:call_model > chain:RunnableSequence] [2.71s] Exiting Chain run with output:
[outputs]
[chain/end] [chain:LangGraph > chain:agent > chain:call_model] [2.71s] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent > chain:ChannelWrite<agent,messages>] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:ChannelWrite<agent,messages>] [0ms] Exiting Chain run with output:
[outputs]
[chain/start] [chain:LangGraph > chain:agent > chain:should_continue] Entering Chain run with input:
[inputs]
[chain/end] [chain:LangGraph > chain:agent > chain:should_continue] [0ms] Exiting Chain run with output:
{
"output": "end"
}
[chain/end] [chain:LangGraph > chain:agent] [2.71s] Exiting Chain run with output:
[outputs]
[chain/end] [chain:LangGraph] [3.79s] Exiting Chain run with output:
[outputs]
As the trace shows, LangChain built multiple chains: it used the tool to fetch all files in the current directory, handed that listing to the LLM as background information, and had it pick out the Python files.
The example above used the ListDirectoryTool; LangChain ships many other built-in tools, such as local filesystem operations, search engine integrations, and web service API integrations. You can browse them in the langchain_community.tools module. You can also write custom tools; see the official documentation.
Summary
In this short course we saw how LangChain can be used to build a variety of applications, such as Q&A over an internal knowledge base and agents that use external tools, and how it greatly speeds up their development. This is only the beginning: you can also use the power of LLMs to build SQL queries against databases, interact with external system APIs, and much more. I hope you will keep exploring LangChain; its community is growing fast and is well worth following. Thank you for taking this course.
The End