
ReAct Agent in Practice

news 2025/7/11 8:18:22

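1. Agent Overview

There is no industry-wide agreed definition of what an Agent is. OpenAI, for example, tends to treat an Agent as an extension of a large model's capabilities, while LangChain emphasizes the Agent as workflow orchestration. There is no need to get hung up on the definition. An Agent is best understood not as one specific technology but as a design philosophy: an Agent is an LLM-driven system that can autonomously perceive its environment, make decisions, and take actions to achieve a specific goal.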

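The overall architecture of a classic Agent system consists of the following core modules:

Planning: breaks a large goal down into smaller sub-goals, and can also reflect on past behavior and improve on it.
Memory: covers short-term and long-term memory. Short-term memory provides in-context learning, while long-term memory provides the ability to retain and recall information over long periods.
Tools: the extended capabilities available to the Agent, such as open APIs, functions inside the system, or third-party services.
Action: the concrete operations the Agent can perform. It interacts with the outside environment through Tools and collects the execution results.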

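An Agent looks like a very complex system, but there is in fact a quick way to implement one: an LLM plus the ReAct framework.

So what is ReAct?

2. ReAct Core Principles

ReAct is a blend of two words, Reasoning + Acting. It is a prompting framework that guides a large language model to reason and act, and it was first proposed in the paper "ReAct: Synergizing Reasoning and Acting in Language Models".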

Let's use the example from the paper to explain. Suppose we ask the model the following question:

Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?

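With the ReAct framework, the model can be guided to reason about this question step by step. To complete a complex task like this one, the model first splits it into subtasks, and each subtask goes through the following stages:

Thought: the model thinks and reasons about the task and draws up a plan.
Action: the model picks a suitable tool from the list of available tools and performs a concrete action.
Observation: once the action has finished, the model inspects the result and decides whether to continue with the next action or to stop and return the result.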

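The ReAct execution flow is therefore a loop: Thought, then Action, then Observation, repeated until the model can return a final result.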
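As a rough illustration of this flow, here is a minimal sketch of the loop with a scripted stand-in for the model. The names `run_react_loop`, `make_scripted_llm`, and the `lookup` tool are hypothetical and exist only for this example; no real LLM or search API is called.

```python
from typing import Callable


def run_react_loop(
    llm: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Drive a Thought -> Action -> Observation loop until an Answer appears."""
    prompt = question
    for _ in range(max_steps):
        reply = llm(prompt)
        lines = reply.splitlines()
        # An "Answer:" line ends the loop.
        for ln in lines:
            if ln.startswith("Answer:"):
                return ln[len("Answer:"):].strip()
        # Otherwise look for "Action: <tool>: <input>".
        action_line = next((ln for ln in lines if ln.startswith("Action:")), None)
        if action_line is None:
            return reply
        name, _, arg = action_line[len("Action:"):].partition(":")
        observation = tools[name.strip()](arg.strip())
        # The observation becomes the next input to the model.
        prompt = f"Observation: {observation}"
    return "step limit reached"


def make_scripted_llm(replies: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


scripted = make_scripted_llm([
    "Thought: I need to look this up.\nAction: lookup: capital of China",
    "Thought: I have the answer.\nAnswer: Beijing",
])
tools = {"lookup": lambda q: "The capital of China is Beijing."}
final = run_react_loop(scripted, tools, "What is the capital of China?")
print(final)
```

The step limit matters even in this toy version: without it, a model that never emits an Answer would keep the loop running indefinitely.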

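The concepts in the paper are easy to follow, but the paper only gives a theoretical framework. How do we actually put ReAct into practice? Let's work through a hands-on example: a ReAct Agent assistant that reasons about the user's question, calls external tools on its own, and returns the final result.

3. ReAct Agent in Practice

3.1 Defining the Tools

A tool is essentially an extended capability that we provide to the model. It can be an open API (such as Google Search or the Amap weather API), one of our own internal functions, or even a third-party service.

In this walkthrough we define two tools:

Serper Search: Serper is a high-performance, real-time search API built on Google that retrieves content relevant to the input keywords. Developers can apply for a Serper API key, and each account comes with a certain free quota; just follow the instructions on the official site.
Calculate: a simple calculator that uses Python's built-in eval() function to parse and evaluate a math expression and return the result.

The Serper Search code is as follows: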

import json
import os

import requests


def search(query: str) -> str:
    """
    Search the web through the Serper API.
    :param query: search keywords
    :return: snippet of the top search result
    """
    # Build the request
    url = "https://google.serper.dev/search"
    payload = json.dumps({"q": query})
    headers = {
        "X-API-KEY": os.getenv("SERPER_API_KEY"),  # apply for an api_key: https://serper.dev/api-key
        "Content-Type": "application/json",
    }
    # Send the request
    response = requests.request("POST", url, headers=headers, data=payload).json()
    # Parse the response; "organic" may be missing or empty
    organic = response.get("organic") or []
    if organic:
        return organic[0].get("snippet", "No relevant results found")
    return "No relevant results found"
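Returning only the first snippet can give the model very little to work with. One possible variation, sketched below, joins the top few results into a single observation. It is a pure function over an already-parsed response, so it runs without network access. The helper name `top_snippets` is hypothetical, and the `title` field is an assumption about the response shape; only `organic` and `snippet` appear in the code above.

```python
def top_snippets(response: dict, k: int = 3) -> str:
    """Join the snippets of the first k organic results into one observation."""
    organic = response.get("organic") or []
    lines = []
    for item in organic[:k]:
        snippet = item.get("snippet")
        if not snippet:
            continue  # skip results that carry no snippet text
        title = item.get("title", "")  # assumed field; omitted when absent
        lines.append(f"{title}: {snippet}" if title else snippet)
    return "\n".join(lines) if lines else "No relevant results found"


sample = {
    "organic": [
        {"title": "A", "snippet": "first result"},
        {"snippet": "second result"},
        {"title": "C"},
    ]
}
print(top_snippets(sample))
```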

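The Calculate tool is defined as follows:

def calculate(expression: str) -> float:
    """
    Evaluate a math expression.
    :param expression: the expression to evaluate
    :return: the result
    """
    # Use Python's built-in eval to evaluate the expression.
    # Note: eval executes arbitrary Python code, so only use it on trusted input.
    return eval(expression)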
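Because eval() will run any Python expression the model emits, a stricter calculator may be preferable outside of a demo. The sketch below is one possible replacement, not the article's implementation: it parses the expression with the standard `ast` module and evaluates only numeric literals and the four basic arithmetic operators. The name `safe_calculate` is hypothetical.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def _eval(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    tree = ast.parse(expression.strip(), mode="eval")
    return _eval(tree.body)


print(safe_calculate("4 * 7 / 3"))
```

Function calls, attribute access, and names all fall through to the ValueError branch, so an input such as `__import__('os')` is rejected instead of executed.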

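3.2 Building the Prompt

The Prompt is the key to ReAct. A well-designed Prompt guides the LLM through reasoning and acting effectively, and improves both the accuracy and the interpretability of what the LLM does.

There are many versions of the ReAct Prompt online. After looking at several implementations, I put together the following Prompt: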

# ReAct prompt definition
REACT_PROMPT = """
You run in a loop of Thought, Action, Observation, Answer.
At the end of the loop you output an Answer
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you.
Observation will be the result of running those actions.
Answer will be the result of analysing the Observation

Your available actions are:

calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary

search:
e.g. search: Django
Returns a real info from searching SerperAPI

Always look things up on search if you have the opportunity to do so.

Example session:

Question: What is the capital of China?
Thought: I should look up on SerperAPI
Action: search: What is the capital of China?
PAUSE

You will be called again with this:

Observation: China is a country. The capital is Beijing.
Thought: I think I have found the answer
Action: Beijing.
You should then call the appropriate action and determine the answer from the result

You then output:

Answer: The capital of China is Beijing

Example session

Question: What is the mass of Earth times 2?
Thought: I need to find the mass of Earth on search
Action: search: mass of earth
PAUSE

You will be called again with this:

Observation: mass of earth is 5.972×10^24 kg
Thought: I need to multiply this by 2
Action: calculate: 5.972e24 * 2
PAUSE

You will be called again with this:

Observation: 1.1944e+25

If you have the answer, output it as the Answer.

Answer: The mass of Earth times 2 is 1.1944×10^25 kg.

Now it's your turn:
""".strip()

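This Prompt contains three main elements:

Role setting: tells the LLM that it runs as a reasoning loop and guides it to proceed in the order Thought -> Action -> Observation -> Answer.
Tool definitions: defines the tools the LLM can call, with a description and a usage example for each.
Reference examples: provides two example sessions in a few-shot style so that the LLM understands the task better.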
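When the tool list grows, the tool-definition part of the Prompt can be generated from the tool functions instead of being typed by hand. The sketch below shows one way to do that, assuming each tool carries a docstring whose first line describes it. The helper `render_actions` and the stub functions are hypothetical and are not part of the article's code.

```python
import inspect
from typing import Callable


def render_actions(tools: dict[str, Callable], examples: dict[str, str]) -> str:
    """Build the 'available actions' section from tool docstrings."""
    blocks = []
    for name, fn in tools.items():
        doc = inspect.getdoc(fn) or "No description."
        summary = doc.splitlines()[0]  # first docstring line only
        blocks.append(f"{name}:\ne.g. {name}: {examples[name]}\n{summary}")
    return "Your available actions are:\n\n" + "\n\n".join(blocks)


def calculate(expression: str):
    """Runs a calculation and returns the number."""


def search(query: str):
    """Returns the top snippet from a web search."""


section = render_actions(
    {"calculate": calculate, "search": search},
    {"calculate": "4 * 7 / 3", "search": "Django"},
)
print(section)
```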

3.3 Implementing the ReAct Agent

Next, we wrap the Prompt above in a ReAct Agent.

import os

from openai import OpenAI

from prompt import REACT_PROMPT


class ReActAgent:
    """ReAct agent"""

    def __init__(self):
        """Constructor"""
        # Initialize the OpenAI client
        self.client = OpenAI(
            api_key=os.getenv("OPENAI_API_KEY"),
            base_url=os.getenv("OPENAI_API_BASE"),
        )
        # Initialize the message list
        # The system prompt is set to the ReAct prompt
        self.messages = [{"role": "system", "content": REACT_PROMPT}]

    def __call__(self, prompt: str) -> str:
        """
        Run one turn (__call__ is a Python magic method that lets an instance be called like a function)
        :param prompt: the latest prompt
        :return: the model's reply
        """
        # Append the user message to the history
        self.messages.append({"role": "user", "content": prompt})
        # Call the model to generate a reply
        resp = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=self.messages,
            temperature=0.1,
        )
        content = resp.choices[0].message.content
        # Save the reply
        self.messages.append({"role": "assistant", "content": content})
        # Return the reply
        return content

    def result(self) -> str:
        """
        Return the final result
        :return: the final result
        """
        return self.messages[-1]["content"]

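The logic here is fairly simple: the ReAct Prompt is set as the LLM's System Prompt, and the __call__ magic method wraps the LLM invocation. Every time the ReActAgent instance is called, it calls the LLM API and saves the exchange to the message history.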
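To make the history-keeping behavior concrete without calling a real API, the following sketch mirrors the same pattern with the model call injected as a plain function. `HistoryAgent` and `echo_turn_count` are hypothetical names used only for this illustration.

```python
from typing import Callable


class HistoryAgent:
    """Same message bookkeeping as above, with the model call injected."""

    def __init__(self, system_prompt: str, complete: Callable[[list[dict]], str]):
        self.complete = complete
        self.messages = [{"role": "system", "content": system_prompt}]

    def __call__(self, prompt: str) -> str:
        # Record the user turn, ask the model, then record its reply.
        self.messages.append({"role": "user", "content": prompt})
        content = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": content})
        return content


def echo_turn_count(messages: list[dict]) -> str:
    """Stand-in model: reports how many user turns it has seen so far."""
    user_turns = sum(1 for m in messages if m["role"] == "user")
    return f"reply {user_turns}"


agent = HistoryAgent("You are a test agent.", echo_turn_count)
first = agent("Question: 1 + 1?")
second = agent("Observation: 2")
roles = [m["role"] for m in agent.messages]
print(first, second, roles)
```

Because the full history is resent on every call, the model sees each earlier Thought, Action, and Observation when it produces its next step.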

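3.4 Implementing the Agent Executor

With the Agent in place, we still need an Agent Executor to drive the Agent's execution.

An Agent Executor is essentially a loop. Based on what the LLM generates, it parses the next Action to execute and calls the corresponding tool.

The code is as follows: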

import re
from typing import Callable

from react_agent import ReActAgent

# Regular expression that matches an Action line
ACTION_REGEX = re.compile(r'^Action: (\w+): (.*)$')


class AgentExecutor:
    """Agent executor"""

    def __init__(self, agent: ReActAgent, tools: dict[str, Callable], max_epochs: int = 5):
        """
        Constructor
        :param agent: the Agent
        :param tools: the available tools, keyed by name
        :param max_epochs: maximum number of rounds, to prevent an infinite loop
        """
        self.agent = agent
        self.tools = tools
        self.max_epochs = max_epochs

    def execute(self, goal: str) -> str:
        """
        Run the Agent
        :param goal: the goal to achieve
        :return: the execution result
        """
        i = 0
        prompt = goal
        print(f"[Goal] {goal}")
        # Run the Agent in a loop
        # ReAct is essentially a Thought->Action->Observation loop
        while i < self.max_epochs:
            # Update the round counter
            i += 1
            print(f"[Round {i}]")
            # Run the Agent
            result = self.agent(prompt=prompt)
            print(f"[Intermediate result] {result}")
            # Find the tool calls in the reply
            matches = (ACTION_REGEX.match(line) for line in result.split('\n'))
            actions = [m for m in matches if m]
            # If there is no Action to execute, return the final result
            if len(actions) == 0:
                result = self.agent.result()
                print(f"[Final result] {result}")
                return result
            # Parse the Action and its argument
            action, action_input = actions[0].groups()
            if action not in self.tools:
                raise Exception("Unknown action: {}: {}".format(action, action_input))
            print(f"[Execute Action]: {action} {action_input}")
            # Call the tool
            observation = self.tools[action](action_input)
            print(f"[Observation] {observation}")
            prompt = f"Observation: {observation}"
        return "Exceeded the maximum number of rounds, execution terminated"

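A few points are worth calling out:

As mentioned earlier, the AgentExecutor is essentially a Thought->Action->Observation loop, so it needs a max_epochs limit on the number of rounds to avoid looping forever.
A regular expression is used to parse the Action to execute and its argument.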
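The behavior of that regular expression is easy to check in isolation. The sketch below reuses the same pattern; the helper `parse_action` is a hypothetical name introduced for the example. Note that the pattern is strict about spacing: a line such as "Action: search : ..." with a space before the second colon does not match.

```python
import re

# Same pattern as in the executor: "Action: <tool>: <input>"
ACTION_REGEX = re.compile(r'^Action: (\w+): (.*)$')


def parse_action(reply: str):
    """Return (tool_name, tool_input) for the first Action line, or None."""
    for line in reply.split("\n"):
        match = ACTION_REGEX.match(line)
        if match:
            return match.groups()
    return None


reply = "Thought: I need to compute this\nAction: calculate: 4 * 7 / 3\nPAUSE"
print(parse_action(reply))
print(parse_action("Action: search : mass of earth"))
print(parse_action("Answer: The capital of China is Beijing"))
```

A reply with no matching Action line returns None, which is the case the executor treats as the final answer.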

3.5 Running the Agent

The core functionality is now complete. Let's run it end to end with the following entry script:

import dotenv

from agent_executor import AgentExecutor
from react_agent import ReActAgent
from tools import search, calculate

if __name__ == '__main__':
    # Load environment variables
    dotenv.load_dotenv()
    # Create the tool registry
    tools = {"search": search, "calculate": calculate}
    # Create the ReAct Agent
    agent = ReActAgent()
    # Create the AgentExecutor
    executor = AgentExecutor(agent=agent, tools=tools, max_epochs=10)
    # Run
    result = executor.execute(goal="How many meters taller is the highest mountain in the world than the second highest?")
    print(result)