LayerUtility: QWenImage2Prompt¶
Documentation¶
- Class name: LayerUtility: QWenImage2Prompt
- Category: 😺dzNodes/LayerUtility/Prompt
- Output node: False
This node wraps the UForm-Gen2 Qwen chat model to generate a text prompt from an input image and a question. The image is converted to a file the chat model can read, and the model returns either a description of the image or an answer to the question.
Input types¶
Required¶
image
- The input image for which the text prompt will be generated. It serves as the visual context for the chat model to interpret and describe.
- Comfy dtype: IMAGE
- Python dtype: torch.Tensor
question
- A question that guides the generation of the text prompt, allowing for more specific or directed output from the chat model. Defaults to "describe this image".
- Comfy dtype: STRING
- Python dtype: str
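
For reference, a ComfyUI IMAGE is a batched float tensor; the shape and value-range conventions below are standard ComfyUI behavior rather than something stated on this page. A minimal sketch of what this node receives and how it prepares the image for the chat model:

    import torch
    from torchvision.transforms import ToPILImage

    # A ComfyUI IMAGE input: a batch of images, shape [B, H, W, C], float32 in [0, 1].
    image = torch.rand(1, 512, 512, 3)

    # The node takes the first image in the batch and converts it to PIL
    # (torchvision expects CHW, hence the permute) before querying the chat model.
    pil_image = ToPILImage()(image[0].permute(2, 0, 1))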
Output types¶
text
- The generated text prompt that describes or answers the question about the input image.
- Comfy dtype: STRING
- Python dtype: str
Usage tips¶
- Infra type: GPU
- Common nodes: unknown
Source code¶
class QWenImage2Prompt:
    def __init__(self):
        # Wrapper around the UForm-Gen2 Qwen chat model, defined elsewhere in this package
        self.chat_model = UformGen2QwenChat()

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "question": ("STRING", {"multiline": False, "default": "describe this image"}),
            },
        }

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("text",)
    FUNCTION = "uform_gen2_qwen_chat"
    CATEGORY = '😺dzNodes/LayerUtility/Prompt'

    def uform_gen2_qwen_chat(self, image, question):
        history = []  # no prior conversation history
        # ComfyUI IMAGE tensors are [B, H, W, C]; take the first image and convert to PIL
        pil_image = ToPILImage()(image[0].permute(2, 0, 1))
        # The chat model reads the image from disk, so save it to a temporary file
        temp_path = files_for_uform_gen2_qwen / "temp.png"
        pil_image.save(temp_path)
        response = self.chat_model.chat_response(question, history, temp_path)
        # The raw response includes the chat template; return only the assistant's reply
        return (response.split("assistant\n", 1)[1], )
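
For orientation, a minimal usage sketch that calls the node class directly, outside a ComfyUI graph. It assumes the UForm-Gen2 Qwen model files are already available to UformGen2QwenChat; the random tensor merely stands in for a real IMAGE input:

    import torch

    node = QWenImage2Prompt()
    image = torch.rand(1, 512, 512, 3)  # stand-in for an IMAGE from a Load Image node
    (text,) = node.uform_gen2_qwen_chat(image, "describe this image")
    print(text)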