Warade, Purva Sandeep, Shinde, Swati, Virdee, Bal Singh and Khanna, Ashish (2025) Text to image synthesis using StackGAN. In: 2025 9th International Conference on Computing, Communication, Control and Automation (ICCCBEA), 22-23 August 2025, Pune, India.
Text-to-image synthesis is a difficult task: it translates descriptive text into visual content, which requires a mapping between language and image representations. Among several approaches, StackGAN is one of the most notable frameworks because of its pioneering two-stage architecture. The first stage generates low-resolution images that capture the global structure. The second stage takes the first-stage results and refines them into high-resolution, realistic images with improved details. This paper reports the performance of StackGAN on a subset of the Flickr dataset, where embeddings of the textual descriptions are obtained from the Universal Sentence Encoder (USE). Experimental results show that the model produces images that are both semantically and visually coherent. The study highlights the prospects of StackGAN in creative content generation and discusses challenges such as maintaining diversity and mitigating artifacts.
Available under License Creative Commons Attribution 4.0.