E4t-diffusion: Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

https://github.com/mkshing/e4t-diffusion

E4t-diffusion implements, using diffusers, an encoder-based approach to personalizing text-to-image models. Current personalization methods suffer from long training times, high storage requirements, and loss of identity. To address these issues, the method applies encoder-based domain tuning: it deliberately underfits on a large set of concepts from a given domain, which improves generalization and makes it quick to add novel concepts from the same domain. The method has two components: an encoder that maps an image of the target concept to a word embedding representing that concept, and regularized weight offsets for the text-to-image model that allow it to incorporate additional concepts. Together, these components let the model be personalized from a single image in as few as 5 training steps, significantly reducing personalization time while maintaining quality.
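The idea of regularized weight offsets can be illustrated with a toy example. The sketch below is not code from the repository: it assumes the regularizer is a plain L2 penalty on the offsets, replaces the text-to-image model with a single linear layer, and uses hypothetical names (`tune_offsets`, `lam`) and made-up numbers. It only shows the general pattern of freezing the base weights, learning a small offset on top of them, and running a handful of gradient steps.

```python
def predict(base, offsets, x):
    """Apply the effective weights (frozen base + learned offset) to input x."""
    return sum((w + d) * xi for w, d, xi in zip(base, offsets, x))


def loss_fn(base, offsets, x, target, lam):
    """Squared error plus an L2 penalty that keeps the offsets small."""
    err = predict(base, offsets, x) - target
    return err * err + lam * sum(d * d for d in offsets)


def tune_offsets(base, x, target, steps=5, lr=0.1, lam=0.01):
    """Run a few gradient steps on the offsets only; base weights stay frozen."""
    offsets = [0.0] * len(base)
    history = [loss_fn(base, offsets, x, target, lam)]
    for _ in range(steps):
        err = predict(base, offsets, x) - target
        # d(loss)/d(offset_i) = 2 * err * x_i + 2 * lam * offset_i
        grads = [2 * err * xi + 2 * lam * d for xi, d in zip(x, offsets)]
        offsets = [d - lr * g for d, g in zip(offsets, grads)]
        history.append(loss_fn(base, offsets, x, target, lam))
    return offsets, history


# Hypothetical values chosen only for illustration.
base = [0.5, -0.2, 0.1]   # stands in for pretrained weights (never updated)
x = [1.0, 0.5, -1.0]      # stands in for features of the new concept
offsets, history = tune_offsets(base, x, target=1.0)
```

Because only `offsets` is updated, the pretrained weights can be shared across concepts, and each new concept costs just a small offset. The penalty weight `lam` trades off fitting the new concept against staying close to the base model.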
