Workflow

You can download the JSON file below and load it into ComfyUI.

A subgraph version has also been added to the JSON.

However, the subgraph may not work correctly depending on the environment, so further validation is needed.
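
Before loading the file, it can help to check which node types it references, so you know which custom node packs (such as Impact-Pack) need to be installed. The sketch below is only an illustration and rests on assumptions about the exported JSON layout: that top-level nodes sit in a `nodes` list with a `type` field, and that subgraph definitions sit under `definitions.subgraphs`. The function names `count_node_types` and `print_report` are hypothetical helpers, not part of ComfyUI.

```python
import json
from collections import Counter


def count_node_types(workflow: dict) -> Counter:
    """Count node types in a ComfyUI workflow dict (UI export layout assumed)."""
    counts = Counter()
    # Assumption: top-level nodes are a list of dicts, each with a "type" key.
    for node in workflow.get("nodes", []):
        counts[node.get("type", "<unknown>")] += 1
    # Assumption: subgraph definitions are stored under definitions.subgraphs,
    # each with its own "nodes" list.
    for sub in workflow.get("definitions", {}).get("subgraphs", []):
        for node in sub.get("nodes", []):
            counts[node.get("type", "<unknown>")] += 1
    return counts


def print_report(path: str) -> None:
    """Load a workflow JSON file and print its node types, most frequent first."""
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    for node_type, n in count_node_types(workflow).most_common():
        print(f"{n:3d}  {node_type}")
```

Calling `print_report` with the path to the downloaded file prints one line per node type. If a type is missing from your ComfyUI install, the corresponding node will show up as missing when the workflow is loaded.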

Main Explanation

The structure is basically similar to Impact-Pack’s FaceDetailer, so it uses many nodes from Impact-Pack.

Initially, I tried using KJNodes’ Crop-Uncrop, but I wasn’t able to handle multiple people well with it, so I switched approaches.

The workflow is mainly divided into input image preparation, masking, and detailing.

Input image preparation covers video upscaling and VFI (video frame interpolation), so I will omit it here and focus on masking and detailing.

For detection, the workflow mainly uses the Simple Detector for Video (SEGS) node.

Originally, I considered the SAM2 Video Detector node, but I abandoned that approach because of several limitations.

Most settings can stay at their defaults. How the detailed result is pasted back differs depending on whether you use only bbox or bbox together with masking_mode.
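
To make the difference in pasting concrete, here is a toy sketch. It is one possible reading of the distinction, not a description of how Impact-Pack implements it: with only a bounding box, the whole detailed rectangle overwrites the frame; with a mask, the detailed pixels are blended in only where the mask is nonzero. The code uses plain nested lists as a grayscale image to stay dependency-free, and `paste_bbox` and `paste_masked` are hypothetical names.

```python
from typing import List, Tuple

Image = List[List[float]]         # toy grayscale image, values in [0, 1]
BBox = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive


def paste_bbox(frame: Image, patch: Image, bbox: BBox) -> Image:
    """Overwrite the whole bounding box with the detailed patch."""
    x0, y0, x1, y1 = bbox
    out = [row[:] for row in frame]  # copy so the input frame is untouched
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = patch[y - y0][x - x0]
    return out


def paste_masked(frame: Image, patch: Image, bbox: BBox, mask: Image) -> Image:
    """Blend the patch into the frame, weighted per pixel by the mask.

    The mask has the same size as the patch; 1.0 takes the patch pixel,
    0.0 keeps the original frame pixel, values in between mix the two.
    """
    x0, y0, x1, y1 = bbox
    out = [row[:] for row in frame]
    for y in range(y0, y1):
        for x in range(x0, x1):
            m = mask[y - y0][x - x0]
            out[y][x] = m * patch[y - y0][x - x0] + (1.0 - m) * frame[y][x]
    return out
```

In this reading, the bbox-only paste also replaces the background around the subject inside the rectangle, while the masked paste leaves it alone, which matters when several people stand close together.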