
OpenAI agents, part 2: how deep research works


  • Technical principles
  • Similar open-source projects
    • OpenDeepResearcher
    • open-deep-research
    • ollama-deep-researcher
    • smolagents' open_deep_research
  • References

On February 2, 2025, OpenAI launched its second agent, deep research. Its functionality is similar to the Deep Research feature that Google Gemini released in December 2024.

Technical principles


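deep research is trained with end-to-end reinforcement learning on reasoning and complex browsing tasks across many domains. The core idea is that the model learns to plan and execute a multi-step process on its own to find the data it needs, including backtracking and adapting when real-time information changes the picture. This lets the model handle tasks such as browsing user-uploaded files, generating and refining charts, and citing web sources.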

Similar open-source projects

OpenDeepResearcher





  1. Generate several related search queries from the user's research topic:
async def generate_search_queries_async(session, user_query):
    """
    Ask the LLM to produce up to four precise search queries (in Python list format)
    based on the user's query.
    """
    prompt = (
        "You are an expert research assistant. Given the user's query, generate up to four distinct, "
        "precise search queries that would help gather comprehensive information on the topic. "
        "Return only a Python list of strings, for example: ['query1', 'query2', 'query3']."
    )
    messages = [
        {"role": "system", "content": "You are a helpful and precise research assistant."},
        {"role": "user", "content": f"User Query: {user_query}\n\n{prompt}"}
    ]
    # call_openrouter_async is defined elsewhere in the project.
    response = await call_openrouter_async(session, messages)
    if response:
        try:
            # Expect exactly a Python list (e.g., "['query1', 'query2']")
            # Caution: eval() executes arbitrary code; ast.literal_eval() is safer for model output.
            search_queries = eval(response)
            if isinstance(search_queries, list):
                return search_queries
            else:
                print("LLM did not return a list. Response:", response)
                return []
        except Exception as e:
            print("Error parsing search queries:", e, "\nResponse:", response)
            return []
    return []
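
The parsing step above depends on the model replying with a clean Python list literal. As a minimal sketch, assuming the reply may also arrive wrapped in a Markdown code fence, the helper below parses it with `ast.literal_eval`, which accepts only literals and never executes code. The names `parse_query_list` and `max_queries` are hypothetical and introduced only for illustration; they are not part of OpenDeepResearcher.

```python
import ast

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def parse_query_list(response, max_queries=4):
    """Parse an LLM reply that should be a Python list literal of strings.

    Returns a list of at most max_queries non-empty strings, or [] on any problem.
    """
    text = response.strip()
    # Models often wrap their answer in a Markdown fence; drop the fence lines if present.
    if text.startswith(FENCE):
        lines = [ln for ln in text.splitlines() if not ln.strip().startswith(FENCE)]
        text = "\n".join(lines).strip()
    try:
        # literal_eval only evaluates literals (lists, strings, numbers, ...).
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        return []
    if not isinstance(value, list):
        return []
    # Keep only non-empty strings and enforce the query budget.
    queries = [q.strip() for q in value if isinstance(q, str) and q.strip()]
    return queries[:max_queries]
```

A reply such as `__import__('os').getcwd()` is rejected here instead of being executed, which is the main reason to prefer this over `eval`.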

  2. For each query, call the search engine API asynchronously to fetch the URLs or text of relevant pages:

async def perform_search_async(session, query):
    """
    Asynchronously perform a Google search using SERPAPI for the given query.
    Returns a list of result URLs.
    """
    # SERPAPI_API_KEY and SERPAPI_URL are module-level settings defined elsewhere in the project.
    params = {
        "q": query,
        "api_key": SERPAPI_API_KEY,
        "engine": "google"
    }
    try:
        async with session.get(SERPAPI_URL, params=params) as resp:
            if resp.status == 200:
                results = await resp.json()
                if "organic_results" in results:
                    links = [item.get("link") for item in results["organic_results"] if "link" in item]
                    return links
                else:
                    print("No organic results in SERPAPI response.")
                    return []
            else:
                text = await resp.text()
                print(f"SERPAPI error: {resp.status} - {text}")
                return []
    except Exception as e:
        print("Error performing SERPAPI search:", e)
        return []
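
The function above handles a single query. One way to fan out over several queries at once is `asyncio.gather`, sketched below under stated assumptions: `search_all` and `fake_search` are hypothetical names, and `fake_search` is a network-free stand-in for the real search call so the example runs offline.

```python
import asyncio


async def search_all(queries, search_fn):
    """Run search_fn(query) for every query concurrently.

    Returns the unique links in first-seen order. A query that raises is
    skipped so that one failure does not discard the other results.
    """
    results = await asyncio.gather(
        *(search_fn(q) for q in queries), return_exceptions=True
    )
    seen = set()
    unique_links = []
    for links in results:
        if isinstance(links, BaseException):
            continue  # this query failed; keep the rest
        for link in links:
            if link not in seen:
                seen.add(link)
                unique_links.append(link)
    return unique_links


async def fake_search(query):
    """Hypothetical stand-in for a real search call; performs no network I/O."""
    await asyncio.sleep(0)
    if query == "bad":
        raise RuntimeError("simulated search failure")
    return [f"https://example.com/{query}", "https://example.com/shared"]


if __name__ == "__main__":
    print(asyncio.run(search_all(["a", "b", "bad"], fake_search)))
```

With the real function, `search_fn` could be something like `lambda q: perform_search_async(session, q)`, assuming a shared HTTP session is already open.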
  3. Process each page link:
async def process_link(session, link, user_query, search_query):
    """
    Process a single link: fetch its content, judge its usefulness, and if useful, extract the relevant context.
    """
    print(f"Fetching content from: {link}")
    page_text = await fetch_webpage_text_async(session, link)
    if not page_text:
        return None
    # Ask the LLM whether the page is relevant; it is expected to answer "Yes" or "No".
    usefulness = await is_page_useful_async(session, user_query, page_text)
    print(f"Page usefulness for {link}: {usefulness}")
    if usefulness == "Yes":
        context = await extract_relevant_context_async(session, user_query, search_query, page_text)
        if context:
            print(f"Extracted context from {link} (first 200 chars): {context[:200]}")
            return context
    return None
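
Each link costs a page fetch plus at least one LLM call, so processing many links at once can run into rate limits. A minimal sketch of bounded concurrency with `asyncio.Semaphore` follows. `process_links`, `max_concurrency`, and the `demo` stand-in are assumptions made for illustration and do not come from the original project.

```python
import asyncio


async def process_links(links, process_fn, max_concurrency=5):
    """Apply process_fn to every link with at most max_concurrency calls in flight.

    process_fn(link) should return extracted context or None; None results are dropped.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(link):
        # Wait for a free slot before starting the expensive work.
        async with semaphore:
            return await process_fn(link)

    results = await asyncio.gather(*(guarded(link) for link in links))
    return [context for context in results if context]


async def demo():
    """Run process_links against a fake processor and record peak concurrency."""
    stats = {"in_flight": 0, "peak": 0}

    async def fake_process(link):
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # simulate fetch + LLM latency
        stats["in_flight"] -= 1
        return None if "skip" in link else f"context from {link}"

    contexts = await process_links(["a", "skip-b", "c", "d"], fake_process, max_concurrency=2)
    return contexts, stats["peak"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```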
  4. Use the LLM as a judge: based on the content gathered so far, decide whether new queries are needed to fill gaps:

async def get_new_search_queries_async(session, user_query, previous_search_queries, all_contexts):
    """
    Based on the original query, the previously used search queries, and all the extracted contexts,
    ask the LLM whether additional search queries are needed. If yes, return a Python list of up to four queries;
    if the LLM thinks research is complete, it should return "<done>".
    """
    context_combined = "\n".join(all_contexts)
    prompt = (
        "You are an analytical research assistant. Based on the original query, the search queries performed so far, "
        "and the extracted contexts from webpages, determine if further research is needed. "
        "If further research is needed, provide up to four new search queries as a Python list (for example, "
        "['new query1', 'new query2']). If you believe no further research is needed, respond with exactly <done>."
        "\nOutput only a Python list or the token <done> without any additional text."
    )
    messages = [
        {"role": "system", "content": "You are a systematic research planner."},
        {"role": "user", "content": f"User Query: {user_query}\nPrevious Search Queries: {previous_search_queries}\n\nExtracted Relevant Contexts:\n{context_combined}\n\n{prompt}"}
    ]
    response = await call_openrouter_async(session, messages)
    if response:
        cleaned = response.strip()
        # The sentinel token means the model considers the research complete.
        if cleaned == "<done>":
            return "<done>"
        try:
            # Caution: eval() executes arbitrary code; ast.literal_eval() is safer for model output.
            new_queries = eval(cleaned)
            if isinstance(new_queries, list):
                return new_queries
            else:
                print("LLM did not return a list for new search queries. Response:", response)
                return []
        except Exception as e:
            print("Error parsing new search queries:", e, "\nResponse:", response)
            return []
    return []
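
Taken together, the steps so far form a loop: search, extract, ask the judge, and repeat until the judge is satisfied. The sketch below shows one possible shape for that loop, not the project's actual driver. It assumes the completion sentinel is the literal string `<done>`, caps the rounds with a hypothetical `max_iterations`, and takes three hypothetical callables (`plan_fn`, `gather_fn`, `next_fn`) that stand in for the query-generation, search-and-extract, and judge functions.

```python
import asyncio

DONE = "<done>"  # assumed sentinel meaning "no further research needed"


async def research_loop(user_query, plan_fn, gather_fn, next_fn, max_iterations=3):
    """Alternate between gathering contexts and asking for follow-up queries."""
    all_queries = list(await plan_fn(user_query))
    pending = list(all_queries)
    contexts = []
    for _ in range(max_iterations):
        if not pending:
            break
        contexts.extend(await gather_fn(pending))
        decision = await next_fn(user_query, all_queries, contexts)
        if decision == DONE or not decision:
            break  # the judge is satisfied, or its reply could not be parsed
        # Only search queries that have not been tried before (order preserved).
        pending = [q for q in dict.fromkeys(decision) if q not in all_queries]
        all_queries.extend(pending)
    return all_queries, contexts


async def demo():
    """Drive the loop with scripted stand-ins instead of real LLM and search calls."""
    scripted_replies = [["q3", "q1"], DONE]

    async def plan_fn(user_query):
        return ["q1", "q2"]

    async def gather_fn(queries):
        return [f"ctx:{q}" for q in queries]

    async def next_fn(user_query, queries, contexts):
        return scripted_replies.pop(0)

    return await research_loop("topic", plan_fn, gather_fn, next_fn)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The iteration cap matters because the judge is itself an LLM: without a hard limit, a model that never emits the sentinel would keep the loop running and spending API calls.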

  5. Have the LLM write the report from the collected material:
async def generate_final_report_async(session, user_query, all_contexts):
    """
    Generate the final comprehensive report using all gathered contexts.
    """
    context_combined = "\n".join(all_contexts)
    prompt = (
        "You are an expert researcher and report writer. Based on the gathered contexts below and the original query, "
        "write a comprehensive, well-structured, and detailed report that addresses the query thoroughly. "
        "Include all relevant insights and conclusions without extraneous commentary."
    )
    messages = [
        {"role": "system", "content": "You are a skilled report writer."},
        {"role": "user", "content": f"User Query: {user_query}\n\nGathered Relevant Contexts:\n{context_combined}\n\n{prompt}"}
    ]
    report = await call_openrouter_async(session, messages)
    return report
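
The report prompt joins every gathered context into one string, which can grow past what a model accepts after a few search rounds. As a rough sketch, assuming a simple character budget is an acceptable proxy for token count, the helper below removes duplicate contexts and stops before the budget is exceeded. `join_contexts` and `max_chars` are hypothetical names, and the default budget is an arbitrary placeholder.

```python
def join_contexts(contexts, max_chars=20000, separator="\n"):
    """Concatenate contexts in order, skipping duplicates and blanks.

    Stops at the first context that would push the result past max_chars,
    so earlier (first-gathered) material is preferred.
    """
    seen = set()
    kept = []
    used = 0
    for context in contexts:
        context = context.strip()
        if not context or context in seen:
            continue
        # Account for the separator that join() inserts between items.
        cost = len(context) + (len(separator) if kept else 0)
        if used + cost > max_chars:
            break
        seen.add(context)
        kept.append(context)
        used += cost
    return separator.join(kept)
```

Calling `join_contexts(all_contexts)` in place of the plain `"\n".join(all_contexts)` would be one way to apply it. A production system would more likely count tokens with the model's own tokenizer.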


open-deep-research

Source code: https://github.com/nickscamara/open-deep-research







smolagents' open_deep_research

A deep research agent built on Hugging Face's smolagents.




