A Comprehensive Survey of Text Encoders for Text-to-Image Diffusion Models

Fang, Shun (2024) A Comprehensive Survey of Text Encoders for Text-to-Image Diffusion Models. EAI Endorsed Transactions on AI and Robotics.

Abstract

This survey examines text encoders for text-to-image diffusion models, focusing on the principles, challenges, and opportunities associated with these encoders. We explore state-of-the-art models, including BERT, T5-XXL, and CLIP, that have revolutionized the

Item Type: Article
Date Deposited: 04 Mar 2026 18:09
Last Modified: 16 Apr 2026 22:26
URI: http://eprints.eai.eu/id/eprint/51387
