crawl4ai: A Crawler Framework That Produces LLM-Friendly Input Formats

article 2025/3/4 1:37:00

Reference:
https://github.com/unclecode/crawl4ai

Under the hood, crawl4ai is built on Microsoft's Playwright browser automation framework.

1. Installation

# Install the package
pip install -U crawl4ai

# Run post-installation setup
crawl4ai-setup
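
After the two commands above, a quick sanity check can confirm that the packages are visible to the current Python environment. The following is only a minimal sketch using the standard library; `installed_version` is a hypothetical helper name introduced here, and the assumption that `playwright` is installed as a dependency of crawl4ai is based on the note about Playwright above rather than on anything this check itself guarantees.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Hypothetical helper: return the installed version string, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what the current environment can see (assumed package names).
print("crawl4ai:", installed_version("crawl4ai") or "not installed")
print("playwright:", installed_version("playwright") or "not installed")
```

This only checks that the Python packages are installed; it does not verify that the browser binaries prepared by `crawl4ai-setup` are present.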


2. Usage

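import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    # Open a crawler session; the browser is closed when the block exits
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://sj.qq.com/appdetail/com.xingin.xhs",
        )
        # Print the page content converted to Markdown
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())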
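
Since the point of the Markdown output is to feed it to a large model later, it is often handy to save it to disk instead of printing it. The sketch below extends the example above under a few assumptions: `filename_for` and `save_markdown` are hypothetical names introduced here, the `output` directory is arbitrary, and the `success` and `error_message` fields on the crawl result are used as I understand them from the crawl4ai documentation, so check them against the version you have installed.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def filename_for(url: str) -> str:
    """Hypothetical helper: derive a filesystem-safe .md file name from a URL."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Replace every run of characters outside [A-Za-z0-9._-] with one underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    return (safe or "page") + ".md"


async def save_markdown(urls, out_dir="output"):
    """Crawl each URL in turn and write its Markdown to out_dir."""
    # Imported lazily so filename_for can be used without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    async with AsyncWebCrawler() as crawler:
        for url in urls:
            result = await crawler.arun(url=url)
            # Assumed fields: skip pages that the crawler reports as failed.
            if not result.success:
                print(f"failed: {url}: {result.error_message}")
                continue
            path = out / filename_for(url)
            path.write_text(str(result.markdown), encoding="utf-8")
            print(f"saved: {path}")
```

To run it, call `asyncio.run(save_markdown(["https://sj.qq.com/appdetail/com.xingin.xhs"]))` after `import asyncio`. The URLs are fetched one at a time to keep the sketch simple.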


http://www.kler.cn/a/472050.html
