Image synthesis refers to the process of generating new images from an existing dataset, with the objective of creating images that closely resemble the target images, as learned from the source data distribution. This technique has a wide range of applications, including transforming captions into images, deblurring blurred images, and enhancing low-resolution images. In recent years, deep learning techniques, particularly Generative Adversarial Networks (GANs), have achieved significant success in this field. A GAN consists of a generator (G) and a discriminator (D) and employs adversarial learning to synthesize images. Researchers have developed various strategies to improve GAN performance, such as controlling the learning rates of the two models and modifying the loss functions. This thesis focuses on image synthesis from captions using GANs and aims to improve the quality of generated images. The study is divided into four main parts.

In the first part, we investigate an LSTM conditional GAN that generates images from captions. We use word2vec embeddings as caption features, combine their information with an LSTM, and generate images via a conditional GAN.

In the second part, to improve the quality of generated images, we address the issue of convergence speed and enhance GAN performance using an adaptive WGAN update strategy. The proposed strategy is based on comparing the loss change ratios of G and D, and we demonstrate that it is applicable to the Wasserstein GAN (WGAN) and to other GANs that use WGAN-related loss functions.

In the third part, to further enhance the quality of synthesized images, we investigate a transformer-based Uformer GAN for image restoration and propose a two-step refinement strategy.
Initially, we train a Uformer model until convergence; we then train a Uformer GAN on the restoration results obtained from the first step.

In the fourth part, to generate fine-grained images from captions, we investigate the Recurrent Affine Transformation (RAT) GAN for fine-grained text-to-image synthesis. By incorporating an auxiliary classifier into the discriminator and employing a contrastive learning method, we improve the accuracy and fine-grained detail of the synthesized images.

Throughout this thesis, we strive to enhance the capabilities of GANs in various image synthesis applications and contribute valuable insights to the field of deep learning and image processing.
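The adaptive update strategy described in the second part can be sketched in a few lines. This is a minimal illustration only: the function name `choose_update`, the exact ratio definition, and the decision rule are assumptions for exposition, not the precise algorithm from the thesis. The idea shown is that the network whose loss is changing more slowly receives the next update, so that neither G nor D runs too far ahead of the other.

```python
def choose_update(g_losses, d_losses, eps=1e-8):
    """Decide whether to update G or D next, based on each network's
    recent relative loss change ratio (hypothetical sketch, not the
    thesis's exact rule)."""
    # Relative change of the loss over the last two recorded steps.
    r_g = abs(g_losses[-1] - g_losses[-2]) / (abs(g_losses[-2]) + eps)
    r_d = abs(d_losses[-1] - d_losses[-2]) / (abs(d_losses[-2]) + eps)
    # Update the model whose loss is changing more slowly, keeping
    # the adversarial game roughly balanced.
    return "G" if r_g < r_d else "D"


# Example: D's loss has dropped sharply while G's has barely moved,
# so the next gradient step goes to G.
print(choose_update([1.0, 0.9], [1.0, 0.5]))
```

In a training loop, this decision would replace a fixed schedule such as "five D steps per G step", letting the loss dynamics themselves determine the update order.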