ComfyUI-Florence2

ComfyUI-Florence2 provides ComfyUI nodes for Florence-2, a vision foundation model for a wide range of vision and vision-language tasks. Florence-2 interprets text prompts to perform tasks such as captioning, object detection, and segmentation, and was trained on the FLD-5B dataset. The node pack also adds Document Visual Question Answering (DocVQA): users ask a question about a document image and receive an answer based on the document's visual and textual content.

Tags

Loader

Repo info

  • Repo URL: https://github.com/kijai/ComfyUI-Florence2
  • Commit hash: 9425f1c00ce01bbb60d6175933b8f74443edc41d

Licenses

  • MIT: LICENSE
  • MIT: pyproject.toml

Visit the licenses page for details.