
Text2Video: Hugging Face Text-to-Video Pipeline and a Text-to-Video Paper API

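1. Background

Text-to-video generation is a hot topic in AI, with many strong products in China and abroad, such as Runway AI, Pika AI, Kling AI, Tongyi Qianwen (Qwen), and Zhipu's text-to-video models. Many large text-to-video models are built on Hugging Face's diffusers Python package. To make these models easier to call, I also tried the wrapper class provided by the text2video Python library on PyPI. This article covers two things: how to call the Hugging Face Text to Video Pipeline, and how to call the general-purpose text2video Python library.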


2. Hugging Face Text to Video Pipeline Code

Documentation: https://huggingface.co/docs/diffusers/api/pipelines/text_to_video

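    # Text-to-video generation with the Hugging Face diffusers pipeline

    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    # Load the pretrained text-to-video model in half precision (fp16)
    pipe = DiffusionPipeline.from_pretrained("damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16")
    pipe = pipe.to("cuda")  # requires a CUDA-capable GPU

    prompt = "Spiderman is surfing"
    # frames[0] holds the frames of the first generated video
    video_frames = pipe(prompt).frames[0]
    # Write the frames to a video file and get back its path
    video_path = export_to_video(video_frames)
    print(video_path)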


3. Downloading the Latest Text-to-Video Papers with the text2video Python Package

3.1 Install the text2video package with pip

    pip install text2video


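3.2 Programmatically download the latest text-to-video papers from arXiv with the existing interface

First define the input. The call queries the ArxivPaper API, so the api_name field must be passed. The query also accepts the following optional parameters.

Field | Default | Meaning

start | 0 | Index of the first entry to return
max_results | 10 | Maximum number of entries to return
sortBy | lastUpdatedDate | Date field to sort by
sortOrder | descending | Sort direction, ascending or descending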
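
The parameter names in this table match the query parameters of the public arXiv API. As a rough sketch of what such a request could look like, the snippet below builds an arXiv query URL with only the standard library. Whether the text2video package forwards its arguments in exactly this way is an assumption; `build_arxiv_query_url` and the `all:"..."` search expression are illustrative choices, not part of the package.

```python
from urllib.parse import urlencode

# Public arXiv API query endpoint
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(text, start=0, max_results=10,
                          sortBy="lastUpdatedDate", sortOrder="descending"):
    """Build an arXiv query URL (hypothetical helper, not part of text2video)."""
    params = {
        # Search all fields for the phrase; this query form is an assumption
        "search_query": f'all:"{text}"',
        "start": start,              # index of the first entry to return
        "max_results": max_results,  # maximum number of entries to return
        "sortBy": sortBy,            # relevance, lastUpdatedDate or submittedDate
        "sortOrder": sortOrder,      # ascending or descending
    }
    return ARXIV_API_URL + "?" + urlencode(params)


url = build_arxiv_query_url("Text to Video", start=0, max_results=3)
print(url)
```

Fetching this URL returns an Atom feed; the package appears to hide that step and hand back the parsed entries as JSON.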

Call the text2video Python package to download information about the latest papers published on arXiv:

    import json

    import text2video as t2v

    input_dict = {"text": "Text to Video"}

    # Query the arXiv paper API for the first 3 entries
    res = t2v.api(input_dict, model=None, api_name="ArxivPaperAPI", start=0, max_results=3)
    # res["text"] holds a JSON-encoded list of papers
    paper_list = json.loads(res["text"])
    print("###### Text to Video Recent Paper List:")
    for paper_json in paper_list:
        print("|" + paper_json["id"] + "|" + paper_json["title"].replace("\n", "") + "|" + paper_json["updated"])


Output:

    ###### Text to Video Recent Paper List:
    |http://arxiv.org/abs/2410.08211v1|LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts|2024-10-10T17:59:59Z
    |http://arxiv.org/abs/2410.08210v1|PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point  Supervised Oriented Object Detection|2024-10-10T17:59:56Z
    |http://arxiv.org/abs/2410.08209v1|Emerging Pixel Grounding in Large Multimodal Models Without Grounding  Supervision|2024-10-10T17:59:55Z


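3.3 Implementing a custom text2video API wrapper

Subclass BaseAPI.

Input parameters

Field | Type | Meaning

input_dict | dict | API input, with the fields text, image, audio, video
model | Hugging Face model (PyTorch) | Model passed to the API
kwargs | keyword arguments | Additional parameters

Output parameters

output_dict | dict | API result, with four keys: text, image, audio, video

Core logic

The model is derived from the Hugging Face text_to_video pipeline (https://huggingface.co/docs/diffusers/api/pipelines/text_to_video).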
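
The input and output contract described above can be sketched roughly as follows. The article does not show the real BaseAPI class, so the `BaseAPI` below is a hypothetical stand-in defined locally, and the `api` method name, `Text2VideoAPI`, and `fake_model` are all assumptions made for illustration. The model is treated as any callable that takes a prompt and returns a video path, so the sketch runs without a GPU.

```python
class BaseAPI:
    """Hypothetical stand-in for the package's BaseAPI; the real class may differ."""

    def api(self, input_dict, model=None, **kwargs):
        raise NotImplementedError


class Text2VideoAPI(BaseAPI):
    """Illustrative wrapper: reads input_dict["text"], fills output_dict["video"]."""

    def api(self, input_dict, model=None, **kwargs):
        # The output always carries the four keys described above
        output_dict = {"text": "", "image": "", "audio": "", "video": ""}
        prompt = input_dict.get("text", "")
        if model is None or not prompt:
            return output_dict
        # Assumption: model is a callable returning the path of the generated video.
        # With diffusers, this could be a small function that runs the pipeline
        # and passes the resulting frames to export_to_video.
        output_dict["video"] = model(prompt, **kwargs)
        return output_dict


def fake_model(prompt, **kwargs):
    """Placeholder model that only derives a file name from the prompt."""
    return "/tmp/" + prompt.replace(" ", "_") + ".mp4"


result = Text2VideoAPI().api({"text": "Spiderman is surfing"}, model=fake_model)
print(result["video"])
```

Keeping the model behind a plain callable means the same wrapper can be exercised with a placeholder during development and with a real pipeline in production.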


4. Related Repositories: GitHub/PyPI

https://github.com/rockingdingo/text2video
https://github.com/rockingdingo/text2audio
https://github.com/rockingdingo/image2video
https://github.com/rockingdingo/SuperAlignment
https://github.com/rockingdingo/SuperIntelligence
http://www.deepnlp.org/blog/introduction-to-text-to-video-generation-huggingface-pipeline-and-pypi-package-text2video
http://www.deepnlp.org/blog/introduction-to-multimodal-generative-models
https://huggingface.co/docs/diffusers/api/pipelines/text_to_video

5. AI Application User Reviews

OpenAI o1
http://www.deepnlp.org/store/pub/pub-openai-o1


ChatGPT User Reviews
http://www.deepnlp.org/store/pub/pub-chatgpt-openai

Gemini User Reviews
http://www.deepnlp.org/store/pub/pub-gemini-google

Perplexity User Reviews
http://www.deepnlp.org/store/pub/pub-perplexity

Claude User Reviews
http://www.deepnlp.org/store/pub/pub-claude-anthropic

Grok User Reviews
http://www.deepnlp.org/store/pub/pub-grok-xai

Midjourney User Reviews
http://www.deepnlp.org/store/pub/pub-midjourney

Stable Diffusion User Reviews
http://www.deepnlp.org/store/pub/pub-stable-diffusion


Canva User Reviews
http://www.deepnlp.org/store/pub/pub-canva

GPT-5 Forecast
http://www.deepnlp.org/store/pub/pub-gpt-5

SearchGPT Reviews
http://www.deepnlp.org/store/pub/pub-searchgpt


Kling AI Reviews
http://www.deepnlp.org/store/pub/pub-kling-kwai

Dreamina AI Reviews
http://www.deepnlp.org/store/pub/pub-dreamina-douyin

Luma AI
http://www.deepnlp.org/store/pub/pub-luma-ai

Pika AI Reviews
http://www.deepnlp.org/store/pub/pub-pika

Runway AI Reviews
http://www.deepnlp.org/store/pub/pub-runway

Flux AI Reviews
http://www.deepnlp.org/store/pub/pub-flux-1-black-forest-lab

Qwen AI Reviews
http://www.deepnlp.org/store/pub/pub-qwen-alibaba

Zhipu AI Reviews
http://www.deepnlp.org/store/pub/pub-zhipu-ai


Doubao Reviews
http://www.deepnlp.org/store/pub/pub-doubao-douyin

Kimi Chat Reviews
http://www.deepnlp.org/store/pub/pub-kimi-ai


Coursera Reviews
http://www.deepnlp.org/store/pub/pub-coursera

Udacity Reviews
http://www.deepnlp.org/store/pub/pub-udacity

Grammarly Reviews
http://www.deepnlp.org/store/pub/pub-grammarly


ChatGPT Strawberry
http://www.deepnlp.org/store/pub/pub-chatgpt-strawberry

Google AR VR Headsets
http://www.deepnlp.org/store/pub/pub-google-ar-vr-headset


DeepNLP AI Tools
http://www.deepnlp.org/store/pub/pub-deepnlp-ai


## Robotics

Tesla Cybercab Robotaxi
http://www.deepnlp.org/store/pub/pub-tesla-cybercab


Tesla Optimus
http://www.deepnlp.org/store/pub/pub-tesla-optimus

Figure AI
http://www.deepnlp.org/store/pub/pub-figure-ai


Unitree Robotics Reviews
http://www.deepnlp.org/store/pub/pub-unitree-robotics

Waymo User Reviews
http://www.deepnlp.org/store/pub/pub-waymo-google

ANYbotics Reviews
http://www.deepnlp.org/store/pub/pub-anybotics


Boston Dynamics
http://www.deepnlp.org/store/pub/pub-boston-dynamic


## AI Widgets
Apple Glasses
http://www.deepnlp.org/store/pub/pub-apple-glasses

Meta Glasses
http://www.deepnlp.org/store/pub/pub-meta-glasses

Apple AR VR Headset
http://www.deepnlp.org/store/pub/pub-apple-ar-vr-headset


Google Glass
http://www.deepnlp.org/store/pub/pub-google-glass

Meta VR Headset
http://www.deepnlp.org/store/pub/pub-meta-vr-headset


## Social

Character AI
http://www.deepnlp.org/store/pub/pub-character-ai

## Self-Driving

BYD Seal
http://www.deepnlp.org/store/pub/pub-byd-seal

Tesla Model 3
http://www.deepnlp.org/store/pub/pub-tesla-model-3


BMW i4
http://www.deepnlp.org/store/pub/pub-bmw-i4

Baidu Apollo Reviews
http://www.deepnlp.org/store/pub/pub-baidu-apollo

Hyundai IONIQ 6
http://www.deepnlp.org/store/pub/pub-hyundai-ioniq-6

