
Easy Apply IPAdapter (Encoder)

Documentation

  • Class name: easy ipadapterApplyEncoder
  • Category: EasyUse/Adapter
  • Output node: False

The easy ipadapterApplyEncoder node encodes images into positive and negative embeddings using an IPAdapter model. It processes up to four images, applying a weight and an optional mask to each, and combines the resulting embeddings with a selectable method. This lets you feed image features into a model as embeddings that emphasize or suppress particular aspects of the images in downstream tasks.

Input types

Required

  • model
    • The base model to which the IPAdapter encoding process will be applied. It ensures the encoding is compatible with the model's architecture and processing capabilities.
    • Comfy dtype: MODEL
    • Python dtype: comfy.model_patcher.ModelPatcher
  • clip_vision
    • The CLIP vision model the IPAdapter uses to extract visual features from the input images during encoding.
    • Comfy dtype: CLIP_VISION
    • Python dtype: comfy.clip_vision.ClipVisionModel
  • image1
    • The first image to be encoded by the IPAdapter, contributing to the generation of positive and negative embeddings.
    • Comfy dtype: IMAGE
    • Python dtype: torch.Tensor
  • preset
    • Specifies the preset configuration for the IPAdapter encoding process, affecting how images are processed and encoded.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • num_embeds
    • The number of embeddings to generate, dictating how many images will be processed by the IPAdapter.
    • Comfy dtype: INT
    • Python dtype: int

Optional

  • image2
    • The second image to be encoded, if applicable, based on the num_embeds parameter.
    • Comfy dtype: IMAGE
    • Python dtype: torch.Tensor
  • image3
    • The third image to be encoded, used when num_embeds is greater than two.
    • Comfy dtype: IMAGE
    • Python dtype: torch.Tensor
  • image4
    • The fourth image to be encoded, used when num_embeds is four.
    • Comfy dtype: IMAGE
    • Python dtype: torch.Tensor
  • mask1
    • An optional mask for the first image, used to focus or exclude specific areas during encoding.
    • Comfy dtype: MASK
    • Python dtype: torch.Tensor
  • weight1
    • The weight applied to the first image's encoding, influencing the prominence of its features in the generated embeddings.
    • Comfy dtype: FLOAT
    • Python dtype: float
  • mask2
    • An optional mask for the second image, similar in purpose to mask1.
    • Comfy dtype: MASK
    • Python dtype: torch.Tensor
  • weight2
    • The weight applied to the second image's encoding.
    • Comfy dtype: FLOAT
    • Python dtype: float
  • mask3
    • An optional mask for the third image, following the same concept as the previous masks.
    • Comfy dtype: MASK
    • Python dtype: torch.Tensor
  • weight3
    • The weight applied to the third image's encoding.
    • Comfy dtype: FLOAT
    • Python dtype: float
  • mask4
    • An optional mask for the fourth image, used if num_embeds is four.
    • Comfy dtype: MASK
    • Python dtype: torch.Tensor
  • weight4
    • The weight applied to the fourth image's encoding.
    • Comfy dtype: FLOAT
    • Python dtype: float
  • combine_method
    • The method used to combine the generated embeddings: concat, add, subtract, average, norm average, max, or min.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • optional_ipadapter
    • An optional pre-loaded IPAdapter; when provided, the node uses it instead of loading a new IPAdapter from the preset.
    • Comfy dtype: IPADAPTER
    • Python dtype: dict
  • pos_embeds
    • Optional pre-existing positive embeddings. The positive embeddings generated for each image are appended to this list before combining.
    • Comfy dtype: EMBEDS
    • Python dtype: list[torch.Tensor]
  • neg_embeds
    • Optional pre-existing negative embeddings. The negative embeddings generated for each image are appended to this list before combining.
    • Comfy dtype: EMBEDS
    • Python dtype: list[torch.Tensor]
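The combine_method options are standard tensor reductions, mirroring the node's batch method shown in the source code below. A minimal sketch of their semantics, assuming each per-image embedding is a tensor of shape [1, tokens, dim] (the helper name `combine` is illustrative, not part of the node's API):

```python
import torch

def combine(embeds, method):
    # Stack per-image embeddings along dim 0: shape [n, tokens, dim]
    embeds = torch.cat([e for e in embeds if e is not None], dim=0)
    if method == "concat":
        return embeds                                   # keep all n embeddings
    if method == "add":
        return torch.sum(embeds, dim=0, keepdim=True)
    if method == "subtract":
        # First embedding minus the mean of the rest
        return (embeds[0] - torch.mean(embeds[1:], dim=0)).unsqueeze(0)
    if method == "average":
        return torch.mean(embeds, dim=0, keepdim=True)
    if method == "norm average":
        return torch.mean(embeds / torch.norm(embeds, dim=0, keepdim=True),
                          dim=0, keepdim=True)
    if method == "max":
        return torch.max(embeds, dim=0, keepdim=True).values
    if method == "min":
        return torch.min(embeds, dim=0, keepdim=True).values

a = torch.ones(1, 4, 8)
b = torch.full((1, 4, 8), 3.0)
combine([a, b], "concat").shape   # torch.Size([2, 4, 8])
combine([a, b], "average").shape  # torch.Size([1, 4, 8])
```

Every method except concat reduces the stack back to a single embedding of shape [1, tokens, dim]; concat keeps one embedding per image.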

Output types

  • model
    • Comfy dtype: MODEL
    • The model after the IPAdapter encoding process, potentially patched with the loaded IPAdapter.
    • Python dtype: comfy.model_patcher.ModelPatcher
  • clip_vision
    • Comfy dtype: CLIP_VISION
    • The clip_vision model that was used during encoding, passed through for downstream nodes.
    • Python dtype: comfy.clip_vision.ClipVisionModel
  • ipadapter
    • Comfy dtype: IPADAPTER
    • The IPAdapter instance used for encoding, either the one passed in via optional_ipadapter or the one loaded from the preset.
    • Python dtype: dict
  • pos_embed
    • Comfy dtype: EMBEDS
    • The positive embeddings of all processed images, combined according to combine_method.
    • Python dtype: torch.Tensor
  • neg_embed
    • Comfy dtype: EMBEDS
    • The negative embeddings of all processed images, combined according to combine_method.
    • Python dtype: torch.Tensor

Usage tips

  • Infra type: CPU
  • Common nodes: unknown

Source code

import torch

# `ipadapter` (the base class) and `ALL_NODE_CLASS_MAPPINGS` are defined
# elsewhere in the ComfyUI-Easy-Use package.
class ipadapterApplyEncoder(ipadapter):
    def __init__(self):
        super().__init__()

    @classmethod
    def INPUT_TYPES(cls):
        ipa_cls = cls()
        normal_presets = ipa_cls.normal_presets
        max_embeds_num = 4
        inputs = {
            "required": {
                "model": ("MODEL",),
                "clip_vision": ("CLIP_VISION",),
                "image1": ("IMAGE",),
                "preset": (normal_presets,),
                "num_embeds": ("INT", {"default": 2, "min": 1, "max": max_embeds_num}),
            },
            "optional": {}
        }

        for i in range(1, max_embeds_num + 1):
            if i > 1:
                inputs["optional"][f"image{i}"] = ("IMAGE",)
        for i in range(1, max_embeds_num + 1):
            inputs["optional"][f"mask{i}"] = ("MASK",)
            inputs["optional"][f"weight{i}"] = ("FLOAT", {"default": 1.0, "min": -1, "max": 3, "step": 0.05})
        inputs["optional"]["combine_method"] = (["concat", "add", "subtract", "average", "norm average", "max", "min"],)
        inputs["optional"]["optional_ipadapter"] = ("IPADAPTER",)
        inputs["optional"]["pos_embeds"] = ("EMBEDS",)
        inputs["optional"]["neg_embeds"] = ("EMBEDS",)
        return inputs

    RETURN_TYPES = ("MODEL", "CLIP_VISION", "IPADAPTER", "EMBEDS", "EMBEDS",)
    RETURN_NAMES = ("model", "clip_vision", "ipadapter", "pos_embed", "neg_embed",)
    CATEGORY = "EasyUse/Adapter"
    FUNCTION = "apply"

    def batch(self, embeds, method):
        if method == 'concat' and len(embeds) == 1:
            return (embeds[0],)

        embeds = [embed for embed in embeds if embed is not None]
        embeds = torch.cat(embeds, dim=0)

        match method:
            case "add":
                embeds = torch.sum(embeds, dim=0).unsqueeze(0)
            case "subtract":
                embeds = embeds[0] - torch.mean(embeds[1:], dim=0)
                embeds = embeds.unsqueeze(0)
            case "average":
                embeds = torch.mean(embeds, dim=0).unsqueeze(0)
            case "norm average":
                embeds = torch.mean(embeds / torch.norm(embeds, dim=0, keepdim=True), dim=0).unsqueeze(0)
            case "max":
                embeds = torch.max(embeds, dim=0).values.unsqueeze(0)
            case "min":
                embeds = torch.min(embeds, dim=0).values.unsqueeze(0)

        return embeds

    def apply(self, **kwargs):
        model = kwargs['model']
        clip_vision = kwargs['clip_vision']
        preset = kwargs['preset']
        if 'optional_ipadapter' in kwargs:
            ipadapter = kwargs['optional_ipadapter']
        else:
            model, ipadapter = self.load_model(model, preset, 0, 'CPU', clip_vision=clip_vision, optional_ipadapter=None, cache_mode='none')

        if "IPAdapterEncoder" not in ALL_NODE_CLASS_MAPPINGS:
            self.error()
        encoder_cls = ALL_NODE_CLASS_MAPPINGS["IPAdapterEncoder"]
        pos_embeds = kwargs["pos_embeds"] if "pos_embeds" in kwargs else []
        neg_embeds = kwargs["neg_embeds"] if "neg_embeds" in kwargs else []
        for i in range(1, kwargs['num_embeds'] + 1):
            if f"image{i}" not in kwargs:
                raise Exception(f"image{i} is required")
            kwargs[f"mask{i}"] = kwargs[f"mask{i}"] if f"mask{i}" in kwargs else None
            kwargs[f"weight{i}"] = kwargs[f"weight{i}"] if f"weight{i}" in kwargs else 1.0

            pos, neg = encoder_cls().encode(ipadapter, kwargs[f"image{i}"], kwargs[f"weight{i}"], kwargs[f"mask{i}"], clip_vision=clip_vision)
            pos_embeds.append(pos)
            neg_embeds.append(neg)

        pos_embeds = self.batch(pos_embeds, kwargs['combine_method'])
        neg_embeds = self.batch(neg_embeds, kwargs['combine_method'])

        return (model, clip_vision, ipadapter, pos_embeds, neg_embeds)
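In apply, the numbered optional inputs are defaulted slot by slot: a missing mask falls back to None and a missing weight to 1.0. A compact sketch of that lookup pattern using dict.get (the helper `slot_params` is hypothetical, not part of the node):

```python
def slot_params(kwargs, i):
    # Missing mask falls back to None, missing weight to 1.0,
    # mirroring the per-slot defaults filled in by apply().
    return kwargs.get(f"mask{i}"), kwargs.get(f"weight{i}", 1.0)

slot_params({"mask1": "m", "weight1": 0.5}, 1)  # ('m', 0.5)
slot_params({}, 2)                              # (None, 1.0)
```

Note that image slots have no such fallback: apply raises an exception if any of image1 through image{num_embeds} is missing.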