NaiveAutoKMeansColor¶
Documentation¶
- Class name:
NaiveAutoKMeansColor
- Category:
Bmad/CV/Color A.
- Output node:
False
This node is designed to automatically determine the optimal number of colors for K-means clustering on an image and then apply the K-means algorithm to quantize the image to those colors. It removes the need to specify the number of clusters manually, which simplifies color quantization for tasks such as image simplification or color analysis.
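For orientation, the quantization step the node automates can be sketched in plain OpenCV for a single, hand-picked k; the node essentially runs this for a range of candidate k values and selects one automatically. The file names and the fixed k below are placeholders for this sketch, not part of the node:

import cv2 as cv
import numpy as np

img = cv.imread("input.png")                       # any BGR test image
pixels = np.float32(img.reshape(-1, 3))

k = 4                                              # fixed by hand in this sketch
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 100, 0.2)
compactness, labels, centers = cv.kmeans(pixels, k, None, criteria, 10, cv.KMEANS_RANDOM_CENTERS)

# Replace every pixel with its cluster's center color and restore the image shape.
quantized = np.uint8(centers)[labels.flatten()].reshape(img.shape)
cv.imwrite("quantized.png", quantized)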
Input types¶
Required¶
image
- The input image on which K-means clustering will be performed. This is the primary data the node operates on to determine the optimal number of colors.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
max_k
- The maximum number of colors (clusters) to consider when determining the optimal count; it sets the upper limit on the number of clusters evaluated.
- Comfy dtype:
INT
- Python dtype:
int
rc_threshold
- A threshold on the relative compactness of a clustering, i.e. a candidate k's compactness divided by the compactness first computed at k = 1. Together with the elbow test it guides the choice of k: the selected k's relative compactness should fall below this value (see the numeric sketch after this list).
- Comfy dtype:
FLOAT
- Python dtype:
float
max_iterations
- The maximum number of iterations the K-means algorithm will execute, providing a stopping criterion for convergence.
- Comfy dtype:
INT
- Python dtype:
int
eps
- The epsilon of the convergence criteria: the required accuracy at which the K-means iterations are considered converged and stop, possibly before max_iterations is reached.
- Comfy dtype:
FLOAT
- Python dtype:
float
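To make rc_threshold concrete, here is a small numeric sketch with made-up compactness values (the compactness is the first element of the tuple returned by cv.kmeans):

compactness = [1.8e6, 9.1e5, 5.2e5, 4.0e5, 3.7e5]   # hypothetical values for k = 1..5
max_c = compactness[0]                               # the first computed compactness (k = 1)
relative = [c / max_c for c in compactness]          # roughly [1.00, 0.51, 0.29, 0.22, 0.21]
# With the default rc_threshold of 0.5, the candidates k = 1 and k = 2 stay above the
# threshold, so the search keeps advancing past them even if one of them already
# looked like the sharpest elbow.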
Output types¶
image
- Comfy dtype:
IMAGE
- The output image after applying K-means clustering with the determined optimal number of colors; every pixel is replaced by the center color of its cluster.
- Python dtype:
torch.Tensor
int
- Comfy dtype:
INT
- The optimal number of colors (clusters) determined by the algorithm for the input image.
- Python dtype:
int
Usage tips¶
- Infra type:
GPU
- Common nodes: unknown
Source code¶
class NaiveAutoKMeansColor:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "image": ("IMAGE",),
            "max_k": ("INT", {"default": 8, "min": 3, "max": 16}),
            # besides looking like the elbow,
            # a k's compactness divided by the first computed compactness should be below this value
            "rc_threshold": ("FLOAT", {"default": .5, "max": 1, "min": 0.01, "step": 0.01}),
            "max_iterations": ("INT", {"default": 100}),
            "eps": ("FLOAT", {"default": .2, "step": 0.05})
        }}

    RETURN_TYPES = ("IMAGE", "INT")
    FUNCTION = "get_colors"
    CATEGORY = f"{cv_category_path}/Color A."

    def get_colors(self, image, max_k, rc_threshold, max_iterations, eps):
        image = tensor2opencv(image, 3)
        pixels = image.reshape(-1, 3)
        pixels = np.float32(pixels)

        def normalize(vector):
            return vector / np.linalg.norm(vector)

        def compute_angle_at_k(prev_k_c, k_c, next_k_c):
            p_km1 = np.array([-1, prev_k_c, 0])
            p_k = np.array([0, k_c, 0])
            p_kp1 = np.array([1, next_k_c, 0])
            v1 = normalize(p_km1 - p_k)
            v2 = normalize(p_kp1 - p_k)
            return np.arccos(np.clip(np.dot(v1, v2), -1.0, 1.0))

        # define criteria
        criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, max_iterations, eps)

        # compute k means and check for the elbow
        # here the elbow is the edgiest point on the compactness graph
        best_angle = 7  # max is pi, when the line is perfectly straight; and the objective is to minimize the angle
        max_c = best_rc = best_k = None
        current_k_data = prev_k_data = best_k_data = None
        for k in range(1, max_k + 2):
            next_k_data = cv.kmeans(pixels, k, None, criteria, 10, cv.KMEANS_RANDOM_CENTERS)

            if max_c is None:
                max_c = next_k_data[0]

            if next_k_data[0] == 0:
                # if it is a perfect fit, leave the method
                # avoids unneeded computation, and division by zero on k = 1
                best_k_data = next_k_data
                best_k = k
                break

            if k > 2:
                rc = current_k_data[0] / max_c
                angle = compute_angle_at_k(prev_k_data[0] / max_c, rc, next_k_data[0] / max_c)
                if angle < best_angle or best_rc > rc_threshold:
                    best_angle = angle
                    best_k_data = current_k_data
                    best_rc = rc
                    best_k = k - 1

            prev_k_data = current_k_data
            current_k_data = next_k_data

        # convert back into uint8, and make original image
        center = np.uint8(best_k_data[2])
        res = center[best_k_data[1].flatten()]
        res2 = res.reshape((image.shape))
        res2 = opencv2tensor(res2)
        return (res2, best_k)
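A minimal standalone usage sketch, assuming the class above and its module-level helpers (tensor2opencv, opencv2tensor, np, cv, cv_category_path) are already in scope, for example when experimenting inside the node's own module:

import torch

node = NaiveAutoKMeansColor()
# ComfyUI IMAGE tensors are typically float32 with shape [batch, height, width, channels], values in [0, 1].
image = torch.rand(1, 64, 64, 3)

quantized, k = node.get_colors(image, max_k=8, rc_threshold=0.5, max_iterations=100, eps=0.2)
print("optimal number of colors:", k)
print("output tensor shape:", tuple(quantized.shape))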