  • Class name: Superprompt
  • Category: KJNodes/text
  • Output node: False

Superprompt expands a short text prompt into a more detailed description using a T5 model fine-tuned on the SuperPrompt dataset. It is intended as a pre-generation step: the expanded prompt is passed on to a text-to-image model, giving that model more descriptive detail to work from.

Input types


  • instruction_prompt
    • The instruction that tells the model how to expand the prompt. It is joined to the prompt with ": " before being passed to the model. Default: "Expand the following prompt to add more detail".
    • Comfy dtype: STRING
    • Python dtype: str
  • prompt
    • The initial text prompt to expand. This input is declared with forceInput, so it must be connected from another node rather than typed into the widget.
    • Comfy dtype: STRING
    • Python dtype: str
  • max_new_tokens
    • The maximum number of new tokens to generate, which caps the length of the expanded description. Default: 128; allowed range: 1 to 4096.
    • Comfy dtype: INT
    • Python dtype: int
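
The way the two text inputs are combined can be sketched as follows. This mirrors the string concatenation in the source code at the end of this page; the helper name build_input_text is hypothetical and is not part of the node.

```python
def build_input_text(instruction_prompt: str, prompt: str) -> str:
    """Join the instruction and the prompt as 'instruction: prompt'."""
    return instruction_prompt + ": " + prompt


text = build_input_text(
    "Expand the following prompt to add more detail",
    "a cat on a windowsill",
)
print(text)
```

Because the instruction is plain text placed in front of the prompt, changing it is the main way to steer how the prompt is expanded.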

Output types

  • string
    • The enhanced, detailed version of the original text prompt.
    • Comfy dtype: STRING
    • Python dtype: str
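
Before the string is returned, the decoded text has the T5 special tokens removed. A minimal sketch of that cleanup step, assuming the same two replacements used in the source code; the helper name strip_special_tokens and the sample string are illustrative, not real model output.

```python
def strip_special_tokens(decoded: str) -> str:
    """Remove the T5 padding and end-of-sequence markers from decoded text."""
    return decoded.replace("<pad>", "").replace("</s>", "")


# Hypothetical decoded text, shaped like "<pad> ...</s>".
raw = "<pad> A fluffy cat sits on a sunlit windowsill.</s>"
cleaned = strip_special_tokens(raw)
print(repr(cleaned))
```

Only the markers themselves are removed, so any whitespace that surrounded them stays in the result; a downstream node can call strip() on the text if that matters.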

Usage tips

  • Infra type: GPU
  • Common nodes: unknown

Source code

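import os

import comfy.model_management as model_management

# Assumption: script_directory is the directory containing this file.
script_directory = os.path.dirname(os.path.abspath(__file__))


class Superprompt:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "instruction_prompt": ("STRING", {"default": 'Expand the following prompt to add more detail', "multiline": True}),
                "prompt": ("STRING", {"default": '', "multiline": True, "forceInput": True}),
                "max_new_tokens": ("INT", {"default": 128, "min": 1, "max": 4096, "step": 1}),
            }
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "process"
    CATEGORY = "KJNodes/text"

    def process(self, instruction_prompt, prompt, max_new_tokens):
        device = model_management.get_torch_device()
        from transformers import T5Tokenizer, T5ForConditionalGeneration

        # Download the checkpoint on first use, then reuse the local copy.
        checkpoint_path = os.path.join(script_directory, "models", "superprompt-v1")
        if not os.path.exists(checkpoint_path):
            print(f"Downloading model to: {checkpoint_path}")
            from huggingface_hub import snapshot_download
            # Assumption: the checkpoint is hosted at roborovski/superprompt-v1 on the Hugging Face Hub.
            snapshot_download(repo_id="roborovski/superprompt-v1", local_dir=checkpoint_path)
        tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-small", legacy=False)

        model = T5ForConditionalGeneration.from_pretrained(checkpoint_path, device_map=device)
        input_text = instruction_prompt + ": " + prompt

        # The tokenizer returns a dict-like encoding; generate() needs the id tensor on the model's device.
        input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
        outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
        out = tokenizer.decode(outputs[0])
        # Strip the T5 padding and end-of-sequence markers from the decoded text.
        out = out.replace('<pad>', '')
        out = out.replace('</s>', '')

        return (out, )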
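
The download-if-missing check in process can be isolated and exercised without network access. The sketch below is an illustration under stated assumptions: ensure_checkpoint and fake_download are hypothetical names, and the downloader is passed in as a callable so that a stand-in can replace the huggingface_hub snapshot_download function that the source imports.

```python
import os
import tempfile
from typing import Callable


def ensure_checkpoint(checkpoint_path: str, download: Callable[[str], None]) -> bool:
    """Call download(checkpoint_path) only if the path does not exist yet.

    Returns True when a download was triggered, False when the path was reused.
    """
    if os.path.exists(checkpoint_path):
        return False
    print(f"Downloading model to: {checkpoint_path}")
    download(checkpoint_path)
    return True


def fake_download(path: str) -> None:
    # Stand-in for a real downloader: it only creates the target directory.
    os.makedirs(path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "models", "superprompt-v1")
    first = ensure_checkpoint(target, fake_download)   # path missing, download runs
    second = ensure_checkpoint(target, fake_download)  # path present, download skipped
    print(first, second)
```

The check only tests that the directory exists, so a partially downloaded checkpoint would be treated as complete; deleting the directory forces a fresh download.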