
FaceDetailer

Documentation

  • Class name: FaceDetailer
  • Category: ImpactPack/Simple
  • Output node: False

The FaceDetailer node detects faces in an image and re-samples each detected region at higher resolution, refining facial features and improving overall image quality. It is intended for single images; when given a batch (e.g. video frames) it prints a warning and recommends the Detailer For AnimateDiff node instead. A minimal invocation sketch is given under Usage tips below.

Input types

Required

  • image
    • The input image to be detailed. This is the primary data upon which the FaceDetailer operates, aiming to enhance facial features within the image.
    • Comfy dtype: IMAGE
    • Python dtype: torch.Tensor
  • model
    • The model used for enhancing the face details. It plays a crucial role in the processing and improvement of the image's facial features.
    • Comfy dtype: MODEL
    • Python dtype: torch.nn.Module
  • clip
    • A component used alongside the model to guide the enhancement process, ensuring that the modifications align with certain aesthetic or qualitative criteria.
    • Comfy dtype: CLIP
    • Python dtype: torch.nn.Module
  • vae
    • A variational autoencoder used in the enhancement process to generate or modify facial features, contributing to the overall improvement of the image.
    • Comfy dtype: VAE
    • Python dtype: torch.nn.Module
  • guide_size
    • The target guide size for the detailing pass; detected regions smaller than this are upscaled before refinement (default 512).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • guide_size_for
    • Selects what guide_size is measured against: the detected bounding box when enabled ("bbox") or the enclosing crop region when disabled ("crop_region").
    • Comfy dtype: BOOLEAN
    • Python dtype: bool
  • max_size
    • The maximum size to which a region is scaled during detailing, capping the upscale implied by guide_size (default 1024).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • seed
    • A seed value for random number generation, ensuring reproducibility of the enhancement process.
    • Comfy dtype: INT
    • Python dtype: int
  • steps
    • The number of steps to perform in the enhancement process, affecting the intensity and detail of the enhancements.
    • Comfy dtype: INT
    • Python dtype: int
  • cfg
    • The classifier-free guidance (CFG) scale applied during the detailing pass (default 8.0).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • sampler_name
    • The sampler used for re-sampling each face region, chosen from ComfyUI's available KSampler samplers.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • scheduler
    • The noise scheduler used for the detailing pass, chosen from the Impact Pack scheduler list.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • positive
    • The positive conditioning that guides the detailing of each face region.
    • Comfy dtype: CONDITIONING
    • Python dtype: list
  • negative
    • The negative conditioning that steers the detailing away from undesired content.
    • Comfy dtype: CONDITIONING
    • Python dtype: list
  • denoise
    • The denoise strength used when re-sampling each face region (default 0.5).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • feather
    • The feathering radius, in pixels, used when compositing the refined region back into the image (default 5).
    • Comfy dtype: INT
    • Python dtype: int
  • noise_mask
    • Whether denoising is restricted to the masked face area via a noise mask (default enabled).
    • Comfy dtype: BOOLEAN
    • Python dtype: bool
  • force_inpaint
    • Whether detected regions are inpainted even when they would otherwise be skipped (default enabled).
    • Comfy dtype: BOOLEAN
    • Python dtype: bool
  • bbox_threshold
    • The confidence threshold for the bbox detector; detections scoring below it are discarded (default 0.5).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • bbox_dilation
    • The amount, in pixels, by which detected bounding-box masks are dilated; negative values erode (default 10).
    • Comfy dtype: INT
    • Python dtype: int
  • bbox_crop_factor
    • The factor by which the crop region is enlarged relative to the detected bounding box (default 3.0).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • sam_detection_hint
    • The strategy used to place SAM point hints within each detected region (e.g. center-1, horizontal-2, mask-area, none).
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • sam_dilation
    • The dilation, in pixels, applied to the SAM-generated mask (default 0).
    • Comfy dtype: INT
    • Python dtype: int
  • sam_threshold
    • The confidence threshold for SAM mask predictions (default 0.93).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • sam_bbox_expansion
    • The amount by which bounding boxes are expanded before being passed to SAM (default 0).
    • Comfy dtype: INT
    • Python dtype: int
  • sam_mask_hint_threshold
    • The threshold applied when interpreting mask hints for SAM (default 0.7).
    • Comfy dtype: FLOAT
    • Python dtype: float
  • sam_mask_hint_use_negative
    • Whether negative hint points are supplied to SAM (False, Small, or Outter).
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • drop_size
    • Detected segments smaller than this size, in pixels, are dropped (default 10).
    • Comfy dtype: INT
    • Python dtype: int
  • bbox_detector
    • The bounding-box detector used to locate faces in the input image.
    • Comfy dtype: BBOX_DETECTOR
    • Python dtype: unknown
  • wildcard
    • A wildcard/prompt text applied to each detected face; may be left empty.
    • Comfy dtype: STRING
    • Python dtype: str
  • cycle
    • The number of times the detailing pass is repeated for each region (default 1).
    • Comfy dtype: INT
    • Python dtype: int

Optional

  • sam_model_opt
    • An optional SAM model; when supplied, SAM masks are intersected with the bbox detections to produce tighter face masks.
    • Comfy dtype: SAM_MODEL
    • Python dtype: unknown
  • segm_detector_opt
    • An optional segmentation detector used (when no SAM model is given) to refine or override the bbox detections.
    • Comfy dtype: SEGM_DETECTOR
    • Python dtype: unknown
  • detailer_hook
    • An optional hook for customizing detection and detailing behavior.
    • Comfy dtype: DETAILER_HOOK
    • Python dtype: unknown
  • inpaint_model
    • Whether the supplied model is an inpainting model, enabling inpaint-specific handling (default disabled).
    • Comfy dtype: BOOLEAN
    • Python dtype: bool
  • noise_mask_feather
    • The feathering amount, in pixels, applied to the noise mask (default 20).
    • Comfy dtype: INT
    • Python dtype: int
  • scheduler_func_opt
    • An optional custom scheduler function passed through to the sampling process.
    • Comfy dtype: SCHEDULER_FUNC
    • Python dtype: unknown

Output types

  • image
    • Comfy dtype: IMAGE
    • The enhanced image with improved facial details. This is the primary output of the FaceDetailer node, showcasing the refined facial features.
    • Python dtype: torch.Tensor
  • cropped_refined
    • Comfy dtype: IMAGE
    • A list of cropped and refined portions of the image, focusing specifically on detailed facial areas.
    • Python dtype: List[torch.Tensor]
  • cropped_enhanced_alpha
    • Comfy dtype: IMAGE
    • A list of the refined face crops with an alpha (mask) channel applied, useful for compositing the detailed faces onto other images.
    • Python dtype: List[torch.Tensor]
  • mask
    • Comfy dtype: MASK
    • The combined mask of all detected face regions, usable for further masking or compositing.
    • Python dtype: torch.Tensor
  • detailer_pipe
    • Comfy dtype: DETAILER_PIPE
    • A tuple containing the configuration and models used in the enhancement process, facilitating reuse or analysis.
    • Python dtype: Tuple
  • cnet_images
    • Comfy dtype: IMAGE
    • A list of ControlNet-related images produced during the detailing pass, useful for inspecting the control hints that were applied.
    • Python dtype: List[PIL.Image]

Usage tips
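
The node is normally wired inside a ComfyUI workflow, but its doit method can also be called directly. The sketch below is a minimal, illustrative invocation: image, model, clip, vae, positive, negative, and bbox_detector are hypothetical placeholders standing in for the outputs of upstream loader and detector nodes, the numeric values mirror the defaults declared in INPUT_TYPES, and "euler"/"normal" are merely example sampler/scheduler choices.

face_detailer = FaceDetailer()

image_out, cropped_refined, cropped_enhanced_alpha, mask, detailer_pipe, cnet_images = face_detailer.doit(
    image=image, model=model, clip=clip, vae=vae,                 # placeholders from upstream loader nodes
    guide_size=512, guide_size_for=True, max_size=1024,
    seed=0, steps=20, cfg=8.0,
    sampler_name="euler", scheduler="normal",                     # example sampler/scheduler choices
    positive=positive, negative=negative,                         # CONDITIONING placeholders from text-encode nodes
    denoise=0.5, feather=5, noise_mask=True, force_inpaint=True,
    bbox_threshold=0.5, bbox_dilation=10, bbox_crop_factor=3.0,
    sam_detection_hint="center-1", sam_dilation=0, sam_threshold=0.93,
    sam_bbox_expansion=0, sam_mask_hint_threshold=0.7, sam_mask_hint_use_negative="False",
    drop_size=10,
    bbox_detector=bbox_detector,                                  # BBOX_DETECTOR placeholder from a detector-provider node
    wildcard="", cycle=1,
    sam_model_opt=None, segm_detector_opt=None,                   # SAM / segm refinement disabled in this sketch
)

Supplying sam_model_opt intersects the bbox detections with SAM masks for tighter face masks, as the source below shows.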

Source code

class FaceDetailer:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": {
                     "image": ("IMAGE", ),
                     "model": ("MODEL",),
                     "clip": ("CLIP",),
                     "vae": ("VAE",),
                     "guide_size": ("FLOAT", {"default": 512, "min": 64, "max": nodes.MAX_RESOLUTION, "step": 8}),
                     "guide_size_for": ("BOOLEAN", {"default": True, "label_on": "bbox", "label_off": "crop_region"}),
                     "max_size": ("FLOAT", {"default": 1024, "min": 64, "max": nodes.MAX_RESOLUTION, "step": 8}),
                     "seed": ("INT", {"default": 0, "min": 0, "max": 0xffffffffffffffff}),
                     "steps": ("INT", {"default": 20, "min": 1, "max": 10000}),
                     "cfg": ("FLOAT", {"default": 8.0, "min": 0.0, "max": 100.0}),
                     "sampler_name": (comfy.samplers.KSampler.SAMPLERS,),
                     "scheduler": (core.SCHEDULERS,),
                     "positive": ("CONDITIONING",),
                     "negative": ("CONDITIONING",),
                     "denoise": ("FLOAT", {"default": 0.5, "min": 0.0001, "max": 1.0, "step": 0.01}),
                     "feather": ("INT", {"default": 5, "min": 0, "max": 100, "step": 1}),
                     "noise_mask": ("BOOLEAN", {"default": True, "label_on": "enabled", "label_off": "disabled"}),
                     "force_inpaint": ("BOOLEAN", {"default": True, "label_on": "enabled", "label_off": "disabled"}),

                     "bbox_threshold": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0, "step": 0.01}),
                     "bbox_dilation": ("INT", {"default": 10, "min": -512, "max": 512, "step": 1}),
                     "bbox_crop_factor": ("FLOAT", {"default": 3.0, "min": 1.0, "max": 10, "step": 0.1}),

                     "sam_detection_hint": (["center-1", "horizontal-2", "vertical-2", "rect-4", "diamond-4", "mask-area", "mask-points", "mask-point-bbox", "none"],),
                     "sam_dilation": ("INT", {"default": 0, "min": -512, "max": 512, "step": 1}),
                     "sam_threshold": ("FLOAT", {"default": 0.93, "min": 0.0, "max": 1.0, "step": 0.01}),
                     "sam_bbox_expansion": ("INT", {"default": 0, "min": 0, "max": 1000, "step": 1}),
                     "sam_mask_hint_threshold": ("FLOAT", {"default": 0.7, "min": 0.0, "max": 1.0, "step": 0.01}),
                     "sam_mask_hint_use_negative": (["False", "Small", "Outter"],),

                     "drop_size": ("INT", {"min": 1, "max": MAX_RESOLUTION, "step": 1, "default": 10}),

                     "bbox_detector": ("BBOX_DETECTOR", ),
                     "wildcard": ("STRING", {"multiline": True, "dynamicPrompts": False}),

                     "cycle": ("INT", {"default": 1, "min": 1, "max": 10, "step": 1}),
                     },
                "optional": {
                    "sam_model_opt": ("SAM_MODEL", ),
                    "segm_detector_opt": ("SEGM_DETECTOR", ),
                    "detailer_hook": ("DETAILER_HOOK",),
                    "inpaint_model": ("BOOLEAN", {"default": False, "label_on": "enabled", "label_off": "disabled"}),
                    "noise_mask_feather": ("INT", {"default": 20, "min": 0, "max": 100, "step": 1}),
                    "scheduler_func_opt": ("SCHEDULER_FUNC",),
                }}

    RETURN_TYPES = ("IMAGE", "IMAGE", "IMAGE", "MASK", "DETAILER_PIPE", "IMAGE")
    RETURN_NAMES = ("image", "cropped_refined", "cropped_enhanced_alpha", "mask", "detailer_pipe", "cnet_images")
    OUTPUT_IS_LIST = (False, True, True, False, False, True)
    FUNCTION = "doit"

    CATEGORY = "ImpactPack/Simple"

    @staticmethod
    def enhance_face(image, model, clip, vae, guide_size, guide_size_for_bbox, max_size, seed, steps, cfg, sampler_name, scheduler,
                     positive, negative, denoise, feather, noise_mask, force_inpaint,
                     bbox_threshold, bbox_dilation, bbox_crop_factor,
                     sam_detection_hint, sam_dilation, sam_threshold, sam_bbox_expansion, sam_mask_hint_threshold,
                     sam_mask_hint_use_negative, drop_size,
                     bbox_detector, segm_detector=None, sam_model_opt=None, wildcard_opt=None, detailer_hook=None,
                     refiner_ratio=None, refiner_model=None, refiner_clip=None, refiner_positive=None, refiner_negative=None, cycle=1,
                     inpaint_model=False, noise_mask_feather=0, scheduler_func_opt=None):

        # make default prompt as 'face' if empty prompt for CLIPSeg
        bbox_detector.setAux('face')
        segs = bbox_detector.detect(image, bbox_threshold, bbox_dilation, bbox_crop_factor, drop_size, detailer_hook=detailer_hook)
        bbox_detector.setAux(None)

        # bbox + sam combination
        if sam_model_opt is not None:
            sam_mask = core.make_sam_mask(sam_model_opt, segs, image, sam_detection_hint, sam_dilation,
                                          sam_threshold, sam_bbox_expansion, sam_mask_hint_threshold,
                                          sam_mask_hint_use_negative, )
            segs = core.segs_bitwise_and_mask(segs, sam_mask)

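        # without a SAM model, an optional segmentation detector can refine or override the bbox detections instead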
        elif segm_detector is not None:
            segm_segs = segm_detector.detect(image, bbox_threshold, bbox_dilation, bbox_crop_factor, drop_size)

            if (hasattr(segm_detector, 'override_bbox_by_segm') and segm_detector.override_bbox_by_segm and
                    not (detailer_hook is not None and not hasattr(detailer_hook, 'override_bbox_by_segm'))):
                segs = segm_segs
            else:
                segm_mask = core.segs_to_combined_mask(segm_segs)
                segs = core.segs_bitwise_and_mask(segs, segm_mask)

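        # segs is a (shape, seg_list) pair; run the detailer only if at least one face segment was found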
        if len(segs[1]) > 0:
            enhanced_img, _, cropped_enhanced, cropped_enhanced_alpha, cnet_pil_list, new_segs = \
                DetailerForEach.do_detail(image, segs, model, clip, vae, guide_size, guide_size_for_bbox, max_size, seed, steps, cfg,
                                          sampler_name, scheduler, positive, negative, denoise, feather, noise_mask,
                                          force_inpaint, wildcard_opt, detailer_hook,
                                          refiner_ratio=refiner_ratio, refiner_model=refiner_model,
                                          refiner_clip=refiner_clip, refiner_positive=refiner_positive,
                                          refiner_negative=refiner_negative,
                                          cycle=cycle, inpaint_model=inpaint_model, noise_mask_feather=noise_mask_feather, scheduler_func_opt=scheduler_func_opt)
        else:
            enhanced_img = image
            cropped_enhanced = []
            cropped_enhanced_alpha = []
            cnet_pil_list = []

        # Mask Generator
        mask = core.segs_to_combined_mask(segs)

        if len(cropped_enhanced) == 0:
            cropped_enhanced = [empty_pil_tensor()]

        if len(cropped_enhanced_alpha) == 0:
            cropped_enhanced_alpha = [empty_pil_tensor()]

        if len(cnet_pil_list) == 0:
            cnet_pil_list = [empty_pil_tensor()]

        return enhanced_img, cropped_enhanced, cropped_enhanced_alpha, mask, cnet_pil_list

    def doit(self, image, model, clip, vae, guide_size, guide_size_for, max_size, seed, steps, cfg, sampler_name, scheduler,
             positive, negative, denoise, feather, noise_mask, force_inpaint,
             bbox_threshold, bbox_dilation, bbox_crop_factor,
             sam_detection_hint, sam_dilation, sam_threshold, sam_bbox_expansion, sam_mask_hint_threshold,
             sam_mask_hint_use_negative, drop_size, bbox_detector, wildcard, cycle=1,
             sam_model_opt=None, segm_detector_opt=None, detailer_hook=None, inpaint_model=False, noise_mask_feather=0, scheduler_func_opt=None):

        result_img = None
        result_mask = None
        result_cropped_enhanced = []
        result_cropped_enhanced_alpha = []
        result_cnet_images = []

        if len(image) > 1:
            print(f"[Impact Pack] WARN: FaceDetailer is not a node designed for video detailing. If you intend to perform video detailing, please use Detailer For AnimateDiff.")

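        # process the batch one image at a time, offsetting the seed per image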
        for i, single_image in enumerate(image):
            enhanced_img, cropped_enhanced, cropped_enhanced_alpha, mask, cnet_pil_list = FaceDetailer.enhance_face(
                single_image.unsqueeze(0), model, clip, vae, guide_size, guide_size_for, max_size, seed + i, steps, cfg, sampler_name, scheduler,
                positive, negative, denoise, feather, noise_mask, force_inpaint,
                bbox_threshold, bbox_dilation, bbox_crop_factor,
                sam_detection_hint, sam_dilation, sam_threshold, sam_bbox_expansion, sam_mask_hint_threshold,
                sam_mask_hint_use_negative, drop_size, bbox_detector, segm_detector_opt, sam_model_opt, wildcard, detailer_hook,
                cycle=cycle, inpaint_model=inpaint_model, noise_mask_feather=noise_mask_feather, scheduler_func_opt=scheduler_func_opt)

            result_img = torch.cat((result_img, enhanced_img), dim=0) if result_img is not None else enhanced_img
            result_mask = torch.cat((result_mask, mask), dim=0) if result_mask is not None else mask
            result_cropped_enhanced.extend(cropped_enhanced)
            result_cropped_enhanced_alpha.extend(cropped_enhanced_alpha)
            result_cnet_images.extend(cnet_pil_list)

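        # assemble the detailer_pipe output; the trailing None entries correspond to refiner-related slots that FaceDetailer does not populate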
        pipe = (model, clip, vae, positive, negative, wildcard, bbox_detector, segm_detector_opt, sam_model_opt, detailer_hook, None, None, None, None)
        return result_img, result_cropped_enhanced, result_cropped_enhanced_alpha, result_mask, pipe, result_cnet_images