BBOX Detector (combined)¶
Documentation¶
- Class name:
BboxDetectorCombined_v2
- Category:
ImpactPack/Detector
- Output node:
False
This node combines bounding box detection with segmentation mask creation and optional dilation. It uses a bounding box model to detect objects in the input image, merges the resulting detections into a single segmentation mask, and can then dilate that mask to adjust how tightly it covers the detected objects.
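Conceptually, the combined mask is a union of the per-detection regions. A minimal sketch of that idea follows; the helper and box coordinates are illustrative only, not the Impact Pack implementation.

import torch

def combine_bbox_masks(boxes, height, width):
    """Union all detected boxes into a single (H, W) mask."""
    mask = torch.zeros((height, width), dtype=torch.float32)
    for x1, y1, x2, y2 in boxes:  # pixel coordinates of one detection
        mask[y1:y2, x1:x2] = 1.0
    return mask

# Two detections merged into one mask layer.
combined = combine_bbox_masks([(10, 20, 50, 60), (40, 40, 90, 120)], height=128, width=128)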
Input types¶
Required¶
bbox_detector
- Specifies the bounding box model to be used for object detection. It plays a crucial role in identifying objects within the image and generating initial bounding boxes.
- Comfy dtype:
BBOX_DETECTOR
- Python dtype:
object (a detector providing a detect_combined method, as used in the source code below)
image
- The input image on which object detection and segmentation are to be performed. It serves as the primary data source for the detection process.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
threshold
- A confidence threshold for filtering detections; objects scoring below this value are discarded.
- Comfy dtype:
FLOAT
- Python dtype:
float
dilation
- Determines how far the generated segmentation mask is expanded around the detected objects; negative values are accepted as well (see the sketch after this list).
- Comfy dtype:
INT
- Python dtype:
int
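The signed range of this parameter suggests that positive values grow the mask while negative values shrink it. A minimal sketch of that behavior, using OpenCV purely as an illustration (not the Impact Pack's own dilation helper, which is not shown here):

import numpy as np
import cv2

def dilate_mask(mask: np.ndarray, dilation: int) -> np.ndarray:
    """Grow the mask for positive values, shrink it for negative ones."""
    if dilation == 0:
        return mask
    kernel = np.ones((abs(dilation), abs(dilation)), np.uint8)
    if dilation > 0:
        return cv2.dilate(mask, kernel, iterations=1)
    return cv2.erode(mask, kernel, iterations=1)

mask = np.zeros((128, 128), dtype=np.float32)
mask[40:80, 40:80] = 1.0
grown = dilate_mask(mask, 4)     # coverage expands by a few pixels
shrunk = dilate_mask(mask, -4)   # coverage tightens instead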
Output types¶
mask
- Comfy dtype:
MASK
- The output segmentation mask representing detected objects. It combines all individual object masks into a single mask layer.
- Python dtype:
torch.Tensor
Usage tips¶
- Infra type:
GPU
- Common nodes: unknown
Source code¶
class BboxDetectorCombined(SegmDetectorCombined):
    @classmethod
    def INPUT_TYPES(s):
        return {"required": {
                    "bbox_detector": ("BBOX_DETECTOR", ),
                    "image": ("IMAGE", ),
                    "threshold": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0, "step": 0.01}),
                    "dilation": ("INT", {"default": 4, "min": -512, "max": 512, "step": 1}),
                    }
                }

    def doit(self, bbox_detector, image, threshold, dilation):
        # Ask the detector for a single combined mask; fall back to an empty
        # mask if nothing was detected, then add a batch dimension.
        mask = bbox_detector.detect_combined(image, threshold, dilation)

        if mask is None:
            mask = torch.zeros((image.shape[2], image.shape[1]), dtype=torch.float32, device="cpu")

        return (mask.unsqueeze(0),)
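For reference, a hypothetical way to exercise the node outside of a ComfyUI graph, assuming the class above (and torch) can be imported from the Impact Pack module. DummyDetector is a stand-in for a real BBOX_DETECTOR; only its detect_combined(image, threshold, dilation) signature is taken from the source above.

import torch

class DummyDetector:
    """Stand-in for a BBOX_DETECTOR; always reports no detections."""
    def detect_combined(self, image, threshold, dilation):
        return None

image = torch.zeros((1, 512, 768, 3))  # ComfyUI images are batched (B, H, W, C) tensors
node = BboxDetectorCombined()
(mask,) = node.doit(DummyDetector(), image, threshold=0.5, dilation=4)
print(mask.shape)  # a single-mask batch produced by the fallback path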