Fang, Shun (2024) A Comprehensive Survey of Text Encoders for Text-to-Image Diffusion Models. EAI Endorsed Transactions on AI and Robotics.
Abstract
This survey examines text encoders for text-to-image diffusion models, focusing on their principles, challenges, and opportunities. We explore state-of-the-art models, including BERT, T5-XXL, and CLIP, that have revolutionized the
| Item Type: | Article |
|---|---|
| Date Deposited: | 04 Mar 2026 18:09 |
| Last Modified: | 16 Apr 2026 22:26 |
| URI: | http://eprints.eai.eu/id/eprint/51387 |
