LayerMask: Mediapipe Facial Segment¶
Documentation¶
- Class name:
LayerMask: MediapipeFacialSegment
- Category:
😺dzNodes/LayerMask
- Output node:
False
This node uses the MediaPipe Face Mesh model to perform facial feature segmentation on images. Based on the boolean inputs, it selects specific facial features, such as the eyes, eyebrows, lips, and teeth, and draws the selected regions into a single combined mask.
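To illustrate the underlying approach, the sketch below detects Face Mesh landmarks for a single image and fills one feature's landmark polygon into a mask. The input file name and the use of cv2.fillPoly are illustrative assumptions, not taken from the node's source; the node's own drawing helper is referenced in the Source code section below.

```python
import cv2
import mediapipe as mp
import numpy as np

# Left-eye landmark indices from the MediaPipe Face Mesh topology
# (the same list the node uses).
LEFT_EYE = [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246]

image_bgr = cv2.imread("face.jpg")                      # hypothetical input file
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)  # Face Mesh expects RGB
height, width = image_rgb.shape[:2]

with mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1) as face_mesh:
    results = face_mesh.process(image_rgb)

mask = np.zeros((height, width), dtype=np.uint8)
if results.multi_face_landmarks:
    landmarks = results.multi_face_landmarks[0].landmark
    # Landmark coordinates are normalized to [0, 1]; scale to pixel space.
    points = np.array(
        [[int(landmarks[i].x * width), int(landmarks[i].y * height)] for i in LEFT_EYE],
        dtype=np.int32,
    )
    cv2.fillPoly(mask, [points], 255)  # white = left-eye region
```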
Input types¶
Required¶
image
- The input image (or batch of images) on which facial feature segmentation is performed.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
left_eye
- Whether to include the left eye region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
left_eyebrow
- Whether to include the left eyebrow region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
right_eye
- Whether to include the right eye region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
right_eyebrow
- Whether to include the right eyebrow region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
lips
- Whether to include the lips region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
tooth
- Whether to include the teeth region in the output mask.
- Comfy dtype:
BOOLEAN
- Python dtype:
bool
Optional¶
(none)
Output types¶
image
- The original image converted to RGBA, with the combined feature mask applied as the alpha channel.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
mask
- A mask highlighting the segmented facial features. All features enabled via the input parameters are drawn into a single combined mask.
- Comfy dtype:
MASK
- Python dtype:
torch.Tensor
Usage tips¶
- Infra type:
GPU
- Common nodes: unknown
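For quick testing outside a full graph, the node's entry point can be called directly. A minimal sketch, assuming the FacialFeatureSegment class from the Source code section below is in scope, and that the input follows the ComfyUI IMAGE convention of a (batch, height, width, channels) float tensor in [0, 1]; the file name is hypothetical.

```python
import numpy as np
import torch
from PIL import Image

# Build a (1, H, W, C) float tensor in [0, 1] -- the ComfyUI IMAGE convention.
pil_image = Image.open("face.jpg").convert("RGB")  # hypothetical input file
image = torch.from_numpy(np.array(pil_image)).float().unsqueeze(0) / 255.0

node = FacialFeatureSegment()
rgba_images, masks = node.facial_feature_segment(
    image,
    left_eye=True, left_eyebrow=True,
    right_eye=True, right_eyebrow=True,
    lips=True, tooth=False,  # leave the teeth region out of the mask
)
```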
Source code¶
# Module-level context (from the node pack, not shown in this excerpt):
# torch, numpy as np, mediapipe as mp, PIL.Image, plus the helper functions
# tensor2pil, pil2tensor, pil2cv2, cv22pil, gaussian_blur, RGB2RGBA,
# image2mask, draw_feature, log and the NODE_NAME constant.

class FacialFeatureSegment:

    def __init__(self):
        pass

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "left_eye": ("BOOLEAN", {"default": True}),
                "left_eyebrow": ("BOOLEAN", {"default": True}),
                "right_eye": ("BOOLEAN", {"default": True}),
                "right_eyebrow": ("BOOLEAN", {"default": True}),
                "lips": ("BOOLEAN", {"default": True}),
                "tooth": ("BOOLEAN", {"default": True}),
            },
            "optional": {
            }
        }

    RETURN_TYPES = ("IMAGE", "MASK",)
    RETURN_NAMES = ("image", "mask",)
    FUNCTION = 'facial_feature_segment'
    CATEGORY = '😺dzNodes/LayerMask'

    def facial_feature_segment(self, image,
                               left_eye, left_eyebrow, right_eye, right_eyebrow, lips, tooth
                               ):
        # Define facial feature landmark indices (MediaPipe Face Mesh topology)
        left_eye_indices = [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246]
        right_eye_indices = [263, 249, 390, 373, 374, 380, 381, 382, 362, 398, 384, 385, 386, 387, 388, 466]
        left_eyebrow_indices = [70, 63, 105, 66, 107, 55, 65, 52, 53, 46]
        right_eyebrow_indices = [336, 296, 334, 293, 300, 276, 283, 282, 295, 285]
        # upper_lip_indices = [61, 146, 91, 181, 84, 17, 314, 405, 321, 375, 291, 308, 324, 318, 402, 317, 14, 87, 178, 88, 95, 78]
        # lower_lip_indices = [61, 185, 40, 39, 37, 0, 267, 269, 270, 409, 291, 308, 415, 310, 311, 312, 13, 82, 81, 80, 191, 78]
        tooth_indices = [78, 95, 88, 178, 87, 14, 317, 402, 318, 324, 308, 415, 310, 311, 312, 13, 82, 81, 80, 191, 78]
        lips_indices = [61, 76, 62, 78, 191, 80, 81, 82, 13, 312, 311, 310, 415, 308, 324, 318, 402, 317, 14, 87, 178,
                        88, 95, 185, 40, 39, 37, 0, 267, 269, 270, 409, 291, 375, 321, 405, 314, 17, 84, 181, 91, 146,
                        61]

        ret_images = []
        ret_masks = []
        scale_factor = 4  # draw the mask at 4x resolution, then downsample for smoother edges

        for i in image:
            face_image = tensor2pil(i.unsqueeze(0)).convert('RGB')
            width, height = face_image.size
            width *= scale_factor
            height *= scale_factor
            cv2_image = pil2cv2(face_image)
            mp_face_mesh = mp.solutions.face_mesh
            face_mesh = mp_face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1, min_detection_confidence=0.5)
            results = face_mesh.process(cv2_image)
            mask = np.zeros((height, width), dtype=np.uint8)
            if results.multi_face_landmarks:
                for face_landmarks in results.multi_face_landmarks:
                    # Draw each selected facial feature into the mask
                    if left_eye:
                        draw_feature(left_eye_indices, mask, face_landmarks, width, height)
                    if right_eye:
                        draw_feature(right_eye_indices, mask, face_landmarks, width, height)
                    if left_eyebrow:
                        draw_feature(left_eyebrow_indices, mask, face_landmarks, width, height)
                    if right_eyebrow:
                        draw_feature(right_eyebrow_indices, mask, face_landmarks, width, height)
                    if lips:
                        draw_feature(lips_indices, mask, face_landmarks, width, height)
                    if tooth:
                        draw_feature(tooth_indices, mask, face_landmarks, width, height)
            mask = cv22pil(mask).convert('L')
            mask = gaussian_blur(mask, 2)
            mask = mask.resize(face_image.size, Image.BILINEAR)
            ret_images.append(pil2tensor(RGB2RGBA(face_image, mask)))
            ret_masks.append(image2mask(mask))

        log(f"{NODE_NAME} Processed {len(ret_images)} image(s).", message_type='finish')
        return (torch.cat(ret_images, dim=0), torch.cat(ret_masks, dim=0),)
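The draw_feature helper is defined elsewhere in the node pack and is not included in this excerpt. Below is a minimal sketch of what such a helper could look like, assuming it scales normalized landmarks to the mask's pixel space and fills the resulting polygon with cv2.fillPoly; this is a hypothetical reconstruction for illustration, not the pack's actual implementation.

```python
import cv2
import numpy as np

# Hypothetical reconstruction of the draw_feature helper used above;
# the node pack's real implementation may differ.
def draw_feature(indices, mask, face_landmarks, width, height):
    # Collect the feature's landmarks and scale them from normalized
    # [0, 1] coordinates to the (possibly upscaled) mask's pixel space.
    points = np.array(
        [[int(face_landmarks.landmark[i].x * width),
          int(face_landmarks.landmark[i].y * height)] for i in indices],
        dtype=np.int32,
    )
    # Fill the closed polygon with white so it becomes part of the mask.
    cv2.fillPoly(mask, [points], 255)
```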