Skip to content

DownloadAndLoadFlorence2Model

Documentation

  • Class name: DownloadAndLoadFlorence2Model
  • Category: Florence2
  • Output node: False

This node is designed to download and load a specified Florence2 model, applying the chosen precision and attention mechanism settings. It ensures the model is ready for use by initializing it with the appropriate device, data type, and attention implementation.

Input types

Required

  • model
    • Specifies the repository ID of the Florence2 model to be downloaded and loaded. It influences the model's capabilities and features.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • precision
    • Determines the numerical precision (e.g., bf16, fp16, fp32) of the model's computations, affecting performance and memory usage.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str
  • attention
    • Selects the attention mechanism implementation to be used by the model, impacting its performance and accuracy.
    • Comfy dtype: COMBO[STRING]
    • Python dtype: str

Output types

  • florence2_model
    • Comfy dtype: FL2MODEL
    • A dictionary containing the loaded model, its processor, and the data type used for computations.
    • Python dtype: Dict[str, Any]

Usage tips

  • Infra type: GPU
  • Common nodes: unknown

Source code

class DownloadAndLoadFlorence2Model:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": {
            "model": (
                    [ 
                    'microsoft/Florence-2-base',
                    'microsoft/Florence-2-base-ft',
                    'microsoft/Florence-2-large',
                    'microsoft/Florence-2-large-ft',
                    'HuggingFaceM4/Florence-2-DocVQA'
                    ],
                    {
                    "default": 'microsoft/Florence-2-base'
                    }),
            "precision": ([ 'fp16','bf16','fp32'],
                    {
                    "default": 'fp16'
                    }),
            "attention": (
                    [ 'flash_attention_2', 'sdpa', 'eager'],
                    {
                    "default": 'sdpa'
                    }),

            },
        }

    RETURN_TYPES = ("FL2MODEL",)
    RETURN_NAMES = ("florence2_model",)
    FUNCTION = "loadmodel"
    CATEGORY = "Florence2"

    def loadmodel(self, model, precision, attention):
        device = mm.get_torch_device()
        offload_device = mm.unet_offload_device()
        dtype = {"bf16": torch.bfloat16, "fp16": torch.float16, "fp32": torch.float32}[precision]

        model_name = model.rsplit('/', 1)[-1]
        model_path = os.path.join(folder_paths.models_dir, "LLM", model_name)

        if not os.path.exists(model_path):
            print(f"Downloading Lumina model to: {model_path}")
            from huggingface_hub import snapshot_download
            snapshot_download(repo_id=model,
                            local_dir=model_path,
                            local_dir_use_symlinks=False)

        print(f"using {attention} for attention")
        with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports): #workaround for unnecessary flash_attn requirement
            model = AutoModelForCausalLM.from_pretrained(model_path, attn_implementation=attention, device_map=device, torch_dtype=dtype,trust_remote_code=True)
        processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

        florence2_model = {
            'model': model, 
            'processor': processor,
            'dtype': dtype
            }

        return (florence2_model,)