Description
Hi @xiaohu2015,
Could you please help me understand how the face editing task is set up in IP-Adapter?
When training IP-Adapter for image editing on faces, which of the two options below matches the actual training setup?
Option 1:
Image (which is used for denoising): Face image with sunglasses
Face ID (which is passed as input to the image encoder): Are these embeddings taken from a face image of the same person as in the denoising image, but without sunglasses?
Text (for input to text encoder): "a person wearing sunglasses"
Option 2:
Image (which is used for denoising): Face image with sunglasses
Face ID (which is passed as input to the image encoder): Are these embeddings taken from a face image of the same person as in the denoising image, also wearing sunglasses?
Text (for input to text encoder): "a person wearing sunglasses"
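To make the two options concrete, here is a minimal sketch of how I picture one training sample in each case. The names `TrainingSample` and `build_sample` and the file naming scheme are hypothetical and only for illustration; they are not taken from the IP-Adapter code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSample:
    target_image: str    # image that is noised and then denoised
    face_id_source: str  # image the Face ID embedding is extracted from
    prompt: str          # text passed to the text encoder


def build_sample(person_id: str, option: int) -> TrainingSample:
    """Build the (hypothetical) training triplet for Option 1 or Option 2."""
    with_sunglasses = f"{person_id}_sunglasses.png"
    without_sunglasses = f"{person_id}_plain.png"

    if option == 1:
        # Option 1: Face ID comes from the same person WITHOUT sunglasses.
        face_id_source = without_sunglasses
    elif option == 2:
        # Option 2: Face ID comes from the same person WITH sunglasses.
        face_id_source = with_sunglasses
    else:
        raise ValueError("option must be 1 or 2")

    return TrainingSample(
        target_image=with_sunglasses,
        face_id_source=face_id_source,
        prompt="a person wearing sunglasses",
    )
```

In both options the denoising target and the prompt are identical; the only difference is which image the Face ID embedding is computed from.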
Thanks,