Fixing the "token count exceeds CLIP's maximum length" error when running Stable Diffusion models with diffusers

article 2025/3/22 13:22:28

1. StableDiffusion1.5

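When a diffusion model from Hugging Face is loaded with diffusers, a long prompt always triggers a message that the input exceeds CLIP's maximum sequence length (77 tokens), and the part of the prompt beyond that limit is truncated.
Solution: use the compel library to encode the prompt into embeddings, and pass those embeddings to the pipeline instead of the raw prompt string.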
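To confirm that a prompt really is over the limit, the tokens can be counted directly. The helper below is a minimal sketch, not part of the original post: the function names are hypothetical, and it assumes the tokenizer behaves like the Hugging Face CLIP tokenizer exposed as pipeline.tokenizer (callable on a string, returning an object with input_ids, and exposing model_max_length).

```python
def count_clip_tokens(tokenizer, prompt):
    """Return (token_count, limit) for a prompt.

    Assumption: `tokenizer` is a Hugging Face style tokenizer such as
    pipeline.tokenizer. The count includes the start/end tokens it adds.
    """
    input_ids = tokenizer(prompt).input_ids
    return len(input_ids), tokenizer.model_max_length


def exceeds_clip_limit(tokenizer, prompt):
    """True if the tokenized prompt is longer than the tokenizer's limit."""
    count, limit = count_clip_tokens(tokenizer, prompt)
    return count > limit
```

With a loaded pipeline, count_clip_tokens(pipeline.tokenizer, prompt) would report how far over the 77-token limit the prompt is. The compel-based script follows.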

from diffusers import AutoPipelineForText2Image
import torch
from compel import Compel

device = torch.device("cuda:3")
# Base model
model_path = "/data1/zhikun.zhao/huggingface_test/hubd/stable-diffusion-v1-5"
pipeline = AutoPipelineForText2Image.from_pretrained(
    model_path, torch_dtype=torch.float32
).to(device)

# Load the LoRA adapter
pipeline.load_lora_weights("/data1/zhikun.zhao/huggingface_test/hubd/adapter/c_adapt1", weight_name="zhenshi.safetensors", adapter_name="zhenshi")

# Fix the seed so that results are reproducible
generator = torch.Generator(device=device).manual_seed(31)

prompt = "score_7_up, realhuman, photo_\\(medium\\), (dreamy, haze:1.2), (shot on GoPro hero:1.3), instagram, ultra-realistic, high quality, high resolution, RAW photo, 8k, 4k, soft shadows, artistic, shy, bashful, innocent, interior, dramatic, dynamic composition, 18yo woman, medium shot, closeup, petite 18-year-old woman, (hazel eyes,lip piercing,long silver straight hairs,Layered Curls cut, effect ,Sad expression, Downturned mouth, drooping eyelids, furrowed brows:0.8), wearing a figure-hugging dress with a plunging neckline and lace details, paired with black opaque tights pantyhose and knee-high leather boots, The look is bold and daring, perfect for a night out, detailed interior space, "
negative_prompt = "score_1, skinny, slim, ribs, abs, 2girls, piercings, bimbo breasts, professional, bokeh, blurry, text"

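# truncate_long_prompts=False makes Compel split prompts longer than 77 tokens into chunks instead of cutting them off (the default is True)
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder, truncate_long_prompts=False)
conditioning = compel.build_conditioning_tensor(prompt)
negative_conditioning = compel.build_conditioning_tensor(negative_prompt)  # compel(text) and compel.build_conditioning_tensor(text) are interchangeable
# The two embeddings can differ in sequence length, so pad them to a common length
[conditioning, negative_conditioning] = compel.pad_conditioning_tensors_to_same_length([conditioning, negative_conditioning])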


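out = pipeline(prompt_embeds=conditioning, negative_prompt_embeds=negative_conditioning,
    num_images_per_prompt=1, generator=generator, num_inference_steps=50,  # 50 steps is usually enough
    height=1024, width=1024,  # note: SD 1.5 was trained at 512x512
    guidance_scale=7   # prompt adherence: higher values follow the text more closely, but very high values degrade the result
)
image = out.images[0]
image.save("img/test.png")  # the img/ directory must already exist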
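
The padding step matters because a long prompt and a short negative prompt end up with embeddings of different sequence lengths. The sketch below illustrates the general idea of splitting a token sequence into 77-token windows. It is not Compel's actual implementation: the function name, the choice of the end token as padding, and the placeholder token ids are all assumptions made for illustration.

```python
def chunk_token_ids(token_ids, bos_id, eos_id, max_length=77):
    """Split token ids into windows of exactly `max_length` tokens.

    Each window is wrapped in a start (BOS) and end (EOS) token and then
    padded with EOS. Illustrative only, not Compel's real chunking code.
    """
    body = max_length - 2  # room left after adding BOS and EOS
    chunks = []
    # max(..., 1) ensures an empty prompt still yields one padded window
    for start in range(0, max(len(token_ids), 1), body):
        window = token_ids[start:start + body]
        padding = [eos_id] * (body - len(window))
        chunks.append([bos_id] + window + [eos_id] + padding)
    return chunks


# A 100-token prompt needs two windows; a 10-token negative prompt needs one.
long_chunks = chunk_token_ids(list(range(100)), bos_id=-1, eos_id=-2)
short_chunks = chunk_token_ids(list(range(10)), bos_id=-1, eos_id=-2)
print(len(long_chunks), len(short_chunks))
```

Under this scheme the long prompt would be encoded as two windows and the negative prompt as one, which is the kind of mismatch that pad_conditioning_tensors_to_same_length resolves before both tensors are passed to the pipeline.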

2. StableDiffusionXL1.0

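With an SDXL 1.0 model, the solution above fails with a message saying that pooled_prompt_embeds must be provided whenever prompt_embeds is passed.
Change the pipeline call as follows. This assumes the Compel object was created for SDXL (both tokenizers and both text encoders, with requires_pooled=[False, True]), so that build_conditioning_tensor returns an (embeddings, pooled embeddings) pair. In that case the padding step has to be applied to conditioning[0] and negative_conditioning[0] rather than to the pairs themselves.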

out = pipeline(prompt_embeds=conditioning[0], pooled_prompt_embeds=conditioning[1],
    negative_prompt_embeds=negative_conditioning[0], negative_pooled_prompt_embeds=negative_conditioning[1],
    num_images_per_prompt=1, generator=generator, num_inference_steps=50,  # 50 steps is usually enough
    height=1024, width=768,
    guidance_scale=3   # prompt adherence: higher values follow the text more closely, but very high values degrade the result
)
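
The original post does not show how the embeddings for SDXL are built. The helper below is one possible sketch, based on the SDXL setup documented for compel. The function name build_sdxl_embeddings is hypothetical, and the sketch assumes a compel version that supports truncate_long_prompts=False together with SDXL.

```python
def build_sdxl_embeddings(pipeline, prompt, negative_prompt=""):
    """Encode prompt and negative prompt for an SDXL pipeline with Compel.

    Returns a dict of keyword arguments for the pipeline call.
    Assumption: `pipeline` is a loaded diffusers SDXL pipeline.
    """
    # Imported here so the helper can be defined before compel is installed.
    from compel import Compel, ReturnedEmbeddingsType

    compel = Compel(
        tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
        text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
        returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
        requires_pooled=[False, True],  # only the second encoder provides pooled output
        truncate_long_prompts=False,    # chunk long prompts instead of truncating
    )
    cond, pooled = compel.build_conditioning_tensor(prompt)
    neg_cond, neg_pooled = compel.build_conditioning_tensor(negative_prompt)
    # Only the per-token embeddings vary in length; the pooled ones do not.
    cond, neg_cond = compel.pad_conditioning_tensors_to_same_length([cond, neg_cond])
    return {
        "prompt_embeds": cond,
        "pooled_prompt_embeds": pooled,
        "negative_prompt_embeds": neg_cond,
        "negative_pooled_prompt_embeds": neg_pooled,
    }
```

The result could then be unpacked into the call, for example pipeline(**build_sdxl_embeddings(pipeline, prompt, negative_prompt), generator=generator, num_inference_steps=50), which is equivalent to passing the four embedding arguments by hand.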

Original article: https://blog.csdn.net/hututufandou/article/details/146314935