Diffusers Pipeline in Practice + Speed/VRAM Test Notes

Installation (build from source)

The official documentation recommends installing Transformers and Diffusers from source for GLM-Image. (GitHub)

pip install git+https://github.com/huggingface/transformers.git
pip install git+https://github.com/huggingface/diffusers.git
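
After installing, it can be useful to confirm that the installed Diffusers build actually ships the GLM-Image pipeline module before downloading any weights. The sketch below is one way to check that using only the standard library. The helper name has_module is made up for this example, and the module path is the one used by the import in the minimal example.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located.

    For dotted names, parent packages are imported as a side effect.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (e.g. `diffusers`) is missing or broken.
        return False


if __name__ == "__main__":
    for mod in ("transformers", "diffusers", "diffusers.pipelines.glm_image"):
        print(f"{mod}: {'found' if has_module(mod) else 'MISSING'}")
```

If the last line reports MISSING while diffusers itself is found, the installed release probably predates the GLM-Image pipeline and the source install did not take effect.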

Minimal text-to-image code

import torch
from diffusers.pipelines.glm_image import GlmImagePipeline

pipe = GlmImagePipeline.from_pretrained(
    "zai-org/GLM-Image",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

prompt = 'A modern poster with the title "Spring Sale" and the CTA "Shop Now".'
img = pipe(
    prompt=prompt,
    width=1024,
    height=1024,
    num_inference_steps=50,
    guidance_scale=1.5,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]

img.save("glm_image.png")

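The pipeline class and its call signature are documented in Diffusers. (Hugging Face)

Important defaults (don't "over-fix" the randomness)

The Diffusers documentation notes that the AR (autoregressive) stage uses sampling defaults (for example, do_sample=True plus a temperature) and advises against forcing deterministic decoding, because that can produce degenerate outputs. (Hugging Face)

Resolution rule

The target width and height must each be a multiple of 32; otherwise the pipeline raises an error. (GitHub)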
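
When the desired size comes from user input or an aspect-ratio calculation, it may not land on a multiple of 32. The following is a minimal sketch of snapping a dimension to the nearest valid value before calling the pipeline. The function name snap_to_multiple is hypothetical and not part of Diffusers.

```python
def snap_to_multiple(value: int, multiple: int = 32) -> int:
    """Round `value` to the nearest multiple of `multiple` (ties round up).

    The result is never smaller than one full multiple.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    # Integer arithmetic avoids the round-half-to-even behavior of round().
    snapped = (value + multiple // 2) // multiple * multiple
    return max(multiple, snapped)


if __name__ == "__main__":
    for requested in (1000, 1024, 1080):
        print(requested, "->", snap_to_multiple(requested))
```

Snapping both width and height independently changes the aspect ratio slightly, which is usually acceptable for poster-style outputs.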

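Speed and VRAM reality check (H100 reference)

The GLM-Image repository reports end-to-end time and peak VRAM measured on a single H100 using Diffusers. (GitHub)

Examples:

1024×1024, batch 1 (T2I): ~64 s, ~37.8 GB VRAM (GitHub)
512×512, batch 1 (T2I): ~27 s, ~34.3 GB VRAM (GitHub)

The authors also note that inference optimization is still limited and that real-world use may require a very large amount of VRAM. (GitHub)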
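
To compare your own hardware against these reference numbers, one option is to time a single call and read PyTorch's peak-allocation counter. This is a rough sketch, not the repository's measurement method: the helper name measure is made up here, and torch.cuda.max_memory_allocated only counts tensors allocated through PyTorch, so it will typically read lower than what nvidia-smi shows.

```python
import time


def measure(fn, *args, **kwargs):
    """Run `fn` once and return (result, seconds, peak_vram_gib_or_None)."""
    try:
        import torch
        use_cuda = torch.cuda.is_available()
    except ImportError:
        torch, use_cuda = None, False

    if use_cuda:
        torch.cuda.synchronize()              # finish any pending GPU work
        torch.cuda.reset_peak_memory_stats()  # start the peak counter from zero

    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if use_cuda:
        torch.cuda.synchronize()              # wait for the GPU before stopping the clock
    seconds = time.perf_counter() - start

    peak_gib = torch.cuda.max_memory_allocated() / 1024**3 if use_cuda else None
    return result, seconds, peak_gib
```

With the pipeline from the minimal example, a call might look like measure(lambda: pipe(prompt=prompt, width=512, height=512).images). Running it a second time and discarding the first result helps keep one-time warm-up cost out of the timing.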

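Practical settings for most people

Start at 512 or 768
Use a guidance_scale of roughly 1.5–4.0 (fal.ai suggests this range for a balanced result) (Fal.ai)
Keep poster text blocks short and structured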
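
One way to explore these settings is a small sweep over the suggested sizes and guidance scales. The sketch below only builds the keyword arguments. The function name build_sweep and the in-between value 2.5 are choices made for this example, and the keyword names mirror the call in the minimal example.

```python
from itertools import product

SIZES = (512, 768)                 # starting resolutions suggested above
GUIDANCE_SCALES = (1.5, 2.5, 4.0)  # both ends of the range plus one in-between value


def build_sweep(prompt, sizes=SIZES, scales=GUIDANCE_SCALES, steps=50):
    """Return one kwargs dict per (size, guidance_scale) combination."""
    configs = []
    for size, scale in product(sizes, scales):
        if size % 32 != 0:
            raise ValueError(f"size {size} is not a multiple of 32")
        configs.append({
            "prompt": prompt,
            "width": size,
            "height": size,
            "num_inference_steps": steps,
            "guidance_scale": scale,
        })
    return configs
```

Each dict can then be passed as pipe(**config), saving every image under a name that records its size and guidance scale so the results are easy to compare side by side.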

