

PhotoMaker
What is PhotoMaker?
PhotoMaker lets you customize realistic human photos via Stacked ID Embedding.
It’s the official implementation of PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding.
PhotoMaker Features
- Rapid customization within seconds, with no additional LoRA training.
- Ensures impressive ID fidelity, offering diversity, promising text controllability, and high-quality generation.
- Can serve as an adapter that works with other base models alongside LoRA modules from the community.
Examples
Realistic generation
Stylization generation
Note: for better stylization, only the base model is changed and LoRA modules are added.
🔧 Dependencies and Installation
- Python >= 3.8 (Anaconda or Miniconda recommended)
- PyTorch >= 2.0.0
conda create --name photomaker python=3.10
conda activate photomaker
pip install -U pip

# Install requirements
pip install -r requirements.txt

# Install photomaker
pip install git+https://github.com/TencentARC/PhotoMaker.git
Then you can import the pipeline as follows:
from photomaker import PhotoMakerStableDiffusionXLPipeline
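To confirm that an environment meets the version requirements listed above (Python >= 3.8, PyTorch >= 2.0.0), a small check along these lines may help. This is a minimal sketch, not part of PhotoMaker: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it ignores pre-release suffixes).

```python
import sys
from importlib import metadata  # importlib.metadata itself requires Python 3.8+


def parse_version(text):
    """Turn '2.1.0+cu118' into (2, 1, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is >= the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.0' and '2.0.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Report whether Python and PyTorch satisfy the minimums listed above."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": meets_minimum(python_version, "3.8")}
    try:
        report["torch"] = meets_minimum(metadata.version("torch"), "2.0.0")
    except metadata.PackageNotFoundError:
        report["torch"] = False  # PyTorch is not installed in this environment
    return report


if __name__ == "__main__":
    print(check_environment())
```

Running the script inside the `photomaker` conda environment prints a dictionary with one boolean per requirement.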
⏬ Download Models
The model will be automatically downloaded through the following two lines:
from huggingface_hub import hf_hub_download
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
You can also choose to download manually from this url.
💻 How to Test
Use it like diffusers
- Dependency
import torch
import os
from diffusers.utils import load_image
from diffusers import EulerDiscreteScheduler
from photomaker import PhotoMakerStableDiffusionXLPipeline

# Load base model
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16"
).to(device)

# Load PhotoMaker checkpoint
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img"  # define the trigger word
)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# Can also cooperate with other LoRA modules
# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")
# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
- Input ID Images
# Define the input ID images
input_folder_name = './examples/newton_man'
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
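The listing above passes every entry of the folder to `load_image`, so a stray non-image file or subdirectory in the folder would likely cause an error. A minimal sketch of a more defensive listing follows; `list_id_image_paths` and the `IMAGE_EXTENSIONS` set are hypothetical names chosen for illustration, not part of PhotoMaker, and the extension list is an assumption to adjust as needed.

```python
from pathlib import Path

# Assumed set of accepted extensions; extend it to match your own files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def list_id_image_paths(folder):
    """Return the sorted paths of image files located directly inside `folder`."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        # Skip subdirectories and files without a known image extension.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The result can stand in for `image_path_list` in the loop that calls `load_image`.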
- Generation
# Note that the trigger word `img` must follow the class word for personalization
prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale"
generator = torch.Generator(device=device).manual_seed(42)
image = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=num_steps,  # set num_steps to the desired number of sampling steps
    start_merge_step=10,
    generator=generator,
).images[0]  # .images is a list; take the first generated image
image.save('out_photomaker.png')
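Because the trigger word has to follow the class word, it can be useful to check a prompt before running the pipeline. The following is a minimal sketch of such a check, not PhotoMaker's own validation: `check_trigger_word` and the `CLASS_WORDS` set are hypothetical, the class-word list is only an example, and requiring exactly one occurrence of the trigger word is an assumption of this sketch.

```python
import re

# Assumed examples of class words; extend this set for your own prompts.
CLASS_WORDS = {"man", "woman", "boy", "girl", "person"}


def check_trigger_word(prompt, trigger_word="img", class_words=CLASS_WORDS):
    """Return (ok, message) describing whether the trigger word is well placed."""
    trigger = trigger_word.lower()
    # Split the prompt into lowercase word tokens, dropping punctuation.
    words = re.findall(r"[a-z0-9_']+", prompt.lower())
    positions = [i for i, word in enumerate(words) if word == trigger]
    if len(positions) != 1:
        return False, f"expected exactly one '{trigger_word}', found {len(positions)}"
    index = positions[0]
    if index == 0 or words[index - 1] not in class_words:
        return False, f"'{trigger_word}' should directly follow a class word"
    return True, "ok"
```

For instance, the prompt used above passes the check because `img` comes right after `man`, while a prompt with no `img` at all, or with `img` at the very start, is rejected.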
Start a local gradio demo
Run the following command:
python gradio_demo/app.py
You could customize this script in this file.
To run it on macOS, follow this instruction and then run app.py.
Usage Tips:
- Upload more photos of the person to be customized to improve ID fidelity. If the input faces are Asian, consider adding ‘Asian’ before the class word, e.g., ‘Asian woman img’.
- If the generated face looks too realistic when stylizing, set the Style strength to 30-50. The larger the number, the better the stylization but the lower the ID fidelity. You can also try other base models or LoRAs with good stylization effects.
- Reduce the number of generated images and sampling steps for faster speed. However, please keep in mind that reducing the sampling steps may compromise the ID fidelity.
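The tips above refer to the demo's Style strength setting, while the diffusers-style example passes `start_merge_step` directly. How the two relate is not stated here. One plausible mapping, assumed purely for this sketch, treats Style strength as the percentage of sampling steps that run before the ID embedding is merged, with an upper cap. The function name `style_strength_to_merge_step` and the default cap of 30 are hypothetical.

```python
def style_strength_to_merge_step(style_strength, num_steps, max_merge_step=30):
    """Map a style strength percentage (0-100) to a start_merge_step value.

    Assumption: a larger style strength delays the merge step, which trades
    ID fidelity for stronger stylization, as described in the tips above.
    """
    if not 0 <= style_strength <= 100:
        raise ValueError("style_strength must be between 0 and 100")
    merge_step = int(style_strength * num_steps / 100)
    # Cap the result so the ID embedding is still merged within the run.
    return min(merge_step, max_merge_step)
```

Under this assumption, a style strength of 20 with 50 sampling steps gives a `start_merge_step` of 10, and a style strength of 30-50 gives 15-25.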