ComfyUI-Florence2

ComfyUI-Florence2 provides ComfyUI nodes for running Florence-2, an advanced vision foundation model for a wide range of vision and vision-language tasks. Florence-2 interprets simple text prompts to perform tasks such as captioning, object detection, and segmentation, and was trained on the FLD-5B dataset. The node pack also supports Document Visual Question Answering (DocVQA): users ask a question about a document image and receive an answer based on the document's visual and textual content.
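To make the prompt-based interface more concrete, here is a minimal sketch of how a task choice and an optional question could be turned into a Florence-2 style prompt string. The helper `build_prompt` and the `TASK_TOKENS` table are hypothetical and are not taken from this repository. The token strings follow the task tokens published with Florence-2, and `<DocVQA>` is assumed from DocVQA fine-tuned checkpoints, so verify them against the model you actually load.

```python
# Hypothetical helper: maps a task name to a Florence-2 style prompt string.
# The token strings are assumptions based on the published Florence-2 task
# tokens; "<DocVQA>" is assumed from DocVQA fine-tuned checkpoints.
TASK_TOKENS = {
    "caption": "<CAPTION>",
    "object_detection": "<OD>",
    "referring_expression_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
    "docvqa": "<DocVQA>",
}

# Tasks whose prompt is the task token followed by free text from the user.
TASKS_WITH_TEXT = {"referring_expression_segmentation", "docvqa"}


def build_prompt(task: str, text: str = "") -> str:
    """Return the prompt string for `task`, appending `text` where required."""
    if task not in TASK_TOKENS:
        raise ValueError(f"unknown task: {task!r}")
    token = TASK_TOKENS[task]
    if task in TASKS_WITH_TEXT:
        text = text.strip()
        if not text:
            raise ValueError(f"task {task!r} requires a text input")
        return token + text
    # Token-only tasks ignore any extra text.
    return token


if __name__ == "__main__":
    print(build_prompt("caption"))
    print(build_prompt("docvqa", "What is the invoice total?"))
```

In this sketch a captioning request needs only the task token, while a DocVQA request appends the user's question to the token, so the model receives both the task and the question in one string.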

Tags

Loader

Repo info

Repo url: https://github.com/kijai/ComfyUI-Florence2
Commit hash: 9425f1c00ce01bbb60d6175933b8f74443edc41d

Licenses

MIT: LICENSE
MIT: pyproject.toml

Visit the licenses page for details.