NormalizedAmplitudeToMask¶

Documentation¶

Class name: NormalizedAmplitudeToMask
Category: KJNodes/audio
Output node: False

This node is designed to convert normalized amplitude values from audio signals into masks, applying a transformation that maps the amplitude range to a corresponding visual representation. It emphasizes the integration of audio data with visual elements, enabling dynamic adjustments to visual content based on audio input.

Input types¶

Required¶

normalized_amp
- The normalized amplitude values, expected to be in the range [0, 1], serve as the basis for generating masks. These values dictate the intensity and characteristics of the resulting visual masks, directly influencing the visual output.
- Comfy dtype: NORMALIZED_AMPLITUDE
- Python dtype: numpy.ndarray
width
- Specifies the width of the output mask, allowing for customization of the mask's dimensions based on the requirements of the visual representation.
- Comfy dtype: INT
- Python dtype: int
height
- Determines the height of the output mask, enabling adjustment of the mask's size to fit specific visual contexts.
- Comfy dtype: INT
- Python dtype: int
frame_offset
- An integer value used to offset the amplitude values, providing a means to shift the visual representation of the mask in relation to the audio input.
- Comfy dtype: INT
- Python dtype: int
location_x
- The x-coordinate location where the mask will be applied, allowing for precise positioning of the visual effect within the larger image or scene.
- Comfy dtype: INT
- Python dtype: int
location_y
- The y-coordinate location for the mask application, facilitating accurate placement of the audio-induced visual effect.
- Comfy dtype: INT
- Python dtype: int
size
- Defines the size of the mask, offering control over the scale of the visual effect generated from the audio amplitude.
- Comfy dtype: INT
- Python dtype: int
shape
- Allows selection of the mask's shape, providing options such as 'none', 'circle', 'square', and 'triangle' to customize the visual outcome.
- Comfy dtype: COMBO[STRING]
- Python dtype: str
color
- Chooses the color scheme of the mask, with options like 'white' and 'amplitude' to influence the visual appearance based on the audio input.
- Comfy dtype: COMBO[STRING]
- Python dtype: str

Output types¶

mask
- Comfy dtype: MASK
- The output mask generated from the normalized amplitude values, where the amplitude information is visually encoded into the mask's structure.
- Python dtype: torch.Tensor

Usage tips¶

Infra type: CPU
Common nodes: unknown

Source code¶

class NormalizedAmplitudeToMask:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": {
                    "normalized_amp": ("NORMALIZED_AMPLITUDE",),
                    "width": ("INT", {"default": 512,"min": 16, "max": 4096, "step": 1}),
                    "height": ("INT", {"default": 512,"min": 16, "max": 4096, "step": 1}),
                    "frame_offset": ("INT", {"default": 0,"min": -255, "max": 255, "step": 1}),
                    "location_x": ("INT", {"default": 256,"min": 0, "max": 4096, "step": 1}),
                    "location_y": ("INT", {"default": 256,"min": 0, "max": 4096, "step": 1}),
                    "size": ("INT", {"default": 128,"min": 8, "max": 4096, "step": 1}),
                    "shape": (
                        [   
                            'none',
                            'circle',
                            'square',
                            'triangle',
                        ],
                        {
                        "default": 'none'
                        }),
                    "color": (
                        [   
                            'white',
                            'amplitude',
                        ],
                        {
                        "default": 'amplitude'
                        }),
                     },}

    CATEGORY = "KJNodes/audio"
    RETURN_TYPES = ("MASK",)
    FUNCTION = "convert"
    DESCRIPTION = """
Works as a bridge to the AudioScheduler -nodes:  
https://github.com/a1lazydog/ComfyUI-AudioScheduler  
Creates masks based on the normalized amplitude.
"""

    def convert(self, normalized_amp, width, height, frame_offset, shape, location_x, location_y, size, color):
        # Ensure normalized_amp is an array and within the range [0, 1]
        normalized_amp = np.clip(normalized_amp, 0.0, 1.0)

        # Offset the amplitude values by rolling the array
        normalized_amp = np.roll(normalized_amp, frame_offset)

        # Initialize an empty list to hold the image tensors
        out = []
        # Iterate over each amplitude value to create an image
        for amp in normalized_amp:
            # Scale the amplitude value to cover the full range of grayscale values
            if color == 'amplitude':
                grayscale_value = int(amp * 255)
            elif color == 'white':
                grayscale_value = 255
            # Convert the grayscale value to an RGB format
            gray_color = (grayscale_value, grayscale_value, grayscale_value)
            finalsize = size * amp

            if shape == 'none':
                shapeimage = Image.new("RGB", (width, height), gray_color)
            else:
                shapeimage = Image.new("RGB", (width, height), "black")

            draw = ImageDraw.Draw(shapeimage)
            if shape == 'circle' or shape == 'square':
                # Define the bounding box for the shape
                left_up_point = (location_x - finalsize, location_y - finalsize)
                right_down_point = (location_x + finalsize,location_y + finalsize)
                two_points = [left_up_point, right_down_point]

                if shape == 'circle':
                    draw.ellipse(two_points, fill=gray_color)
                elif shape == 'square':
                    draw.rectangle(two_points, fill=gray_color)

            elif shape == 'triangle':
                # Define the points for the triangle
                left_up_point = (location_x - finalsize, location_y + finalsize) # bottom left
                right_down_point = (location_x + finalsize, location_y + finalsize) # bottom right
                top_point = (location_x, location_y) # top point
                draw.polygon([top_point, left_up_point, right_down_point], fill=gray_color)

            shapeimage = pil2tensor(shapeimage)
            mask = shapeimage[:, :, :, 0]
            out.append(mask)

        return (torch.cat(out, dim=0),)