🔧 Mask Bounding Box¶
Documentation¶
- Class name:
MaskBoundingBox+
- Category:
essentials/mask
- Output node:
False
This node is designed for processing masks to identify and extract bounding box coordinates, facilitating tasks that require spatial localization within images. It can be used for cropping, object detection, and area-specific processing, enhancing the utility of masks by providing precise location data.
Input types¶
Required¶
mask
- The input mask for which the bounding box is to be computed, representing a specific area within an image. The node calculates the bounding box that tightly encloses this area, providing spatial localization.
- Comfy dtype:
MASK
- Python dtype:
torch.Tensor
padding
- An optional parameter that specifies the amount of padding to add to the bounding box, allowing for adjustments to the size of the extracted area.
- Comfy dtype:
INT
- Python dtype:
int
blur
- An optional parameter that applies a blur to the mask before computing the bounding box, potentially smoothing the edges and affecting the final bounding box coordinates.
- Comfy dtype:
INT
- Python dtype:
int
Optional¶
image_optional
- An optional image input that, if provided, can be used in conjunction with the mask to refine the bounding box calculation based on the image content.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
Output types¶
MASK
- Comfy dtype:
MASK
- The mask adjusted to the bounding box, potentially including padding and modifications from the blur parameter.
- Python dtype:
torch.Tensor
- Comfy dtype:
IMAGE
- Comfy dtype:
IMAGE
- An optional output image cropped to the bounding box of the mask, provided if an image input was given.
- Python dtype:
torch.Tensor
- Comfy dtype:
x
- Comfy dtype:
INT
- The x-coordinate of the top-left corner of the bounding box.
- Python dtype:
int
- Comfy dtype:
y
- Comfy dtype:
INT
- The y-coordinate of the top-left corner of the bounding box.
- Python dtype:
int
- Comfy dtype:
width
- Comfy dtype:
INT
- The width of the bounding box.
- Python dtype:
int
- Comfy dtype:
height
- Comfy dtype:
INT
- The height of the bounding box.
- Python dtype:
int
- Comfy dtype:
Usage tips¶
- Infra type:
GPU
- Common nodes: unknown
Source code¶
class MaskBoundingBox:
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"mask": ("MASK",),
"padding": ("INT", { "default": 0, "min": 0, "max": 4096, "step": 1, }),
"blur": ("INT", { "default": 0, "min": 0, "max": 256, "step": 1, }),
},
"optional": {
"image_optional": ("IMAGE",),
}
}
RETURN_TYPES = ("MASK", "IMAGE", "INT", "INT", "INT", "INT")
RETURN_NAMES = ("MASK", "IMAGE", "x", "y", "width", "height")
FUNCTION = "execute"
CATEGORY = "essentials/mask"
def execute(self, mask, padding, blur, image_optional=None):
if mask.dim() == 2:
mask = mask.unsqueeze(0)
if image_optional is None:
image_optional = mask.unsqueeze(3).repeat(1, 1, 1, 3)
# resize the image if it's not the same size as the mask
if image_optional.shape[1:] != mask.shape[1:]:
image_optional = comfy.utils.common_upscale(image_optional.permute([0,3,1,2]), mask.shape[2], mask.shape[1], upscale_method='bicubic', crop='center').permute([0,2,3,1])
# match batch size
if image_optional.shape[0] < mask.shape[0]:
image_optional = torch.cat((image_optional, image_optional[-1].unsqueeze(0).repeat(mask.shape[0]-image_optional.shape[0], 1, 1, 1)), dim=0)
elif image_optional.shape[0] > mask.shape[0]:
image_optional = image_optional[:mask.shape[0]]
# blur the mask
if blur > 0:
if blur % 2 == 0:
blur += 1
mask = T.functional.gaussian_blur(mask.unsqueeze(1), blur).squeeze(1)
_, y, x = torch.where(mask)
x1 = max(0, x.min().item() - padding)
x2 = min(mask.shape[2], x.max().item() + 1 + padding)
y1 = max(0, y.min().item() - padding)
y2 = min(mask.shape[1], y.max().item() + 1 + padding)
# crop the mask
mask = mask[:, y1:y2, x1:x2]
image_optional = image_optional[:, y1:y2, x1:x2, :]
return (mask, image_optional, x1, y1, x2 - x1, y2 - y1)