StableZero123_Conditioning_Batched¶
Documentation¶
- Class name:
StableZero123_Conditioning_Batched
- Category:
conditioning/3d_models
- Output node:
False
This node builds StableZero123 conditioning for a batch of camera views from a single input image. It encodes the image with the CLIP vision model, encodes a resized copy with the VAE, and pairs the image embedding with one camera embedding per batch item, stepping the elevation and azimuth by a fixed increment between items.
Input types¶
Required¶
clip_vision
- The CLIP vision model used to encode `init_image` into the image embedding that forms the base of the positive conditioning.
- Comfy dtype:
CLIP_VISION
- Python dtype:
comfy.clip_vision.ClipVisionModel
init_image
- The input image of the object. It is encoded by the CLIP vision model and, after being resized to `width` x `height`, by the VAE.
- Comfy dtype:
IMAGE
- Python dtype:
torch.Tensor
vae
- The variational autoencoder used to encode the resized input image into the latent that is attached to the positive conditioning as `concat_latent_image`.
- Comfy dtype:
VAE
- Python dtype:
comfy.sd.VAE
width
- The width in pixels to which the input image is resized before VAE encoding. The empty latent is `width // 8` wide. Defaults to 256 (minimum 16, step 8).
- Comfy dtype:
INT
- Python dtype:
int
height
- The height in pixels to which the input image is resized before VAE encoding. The empty latent is `height // 8` tall. Defaults to 256 (minimum 16, step 8).
- Comfy dtype:
INT
- Python dtype:
int
batch_size
- The number of views to generate. Each batch item receives its own camera embedding. Defaults to 1 (maximum 4096).
- Comfy dtype:
INT
- Python dtype:
int
elevation
- The camera elevation angle in degrees for the first batch item, from -180.0 to 180.0. Defaults to 0.0.
- Comfy dtype:
FLOAT
- Python dtype:
float
azimuth
- The camera azimuth angle in degrees for the first batch item, from -180.0 to 180.0. Defaults to 0.0.
- Comfy dtype:
FLOAT
- Python dtype:
float
elevation_batch_increment
- The amount in degrees added to the elevation after each batch item, so that a single batch can cover several viewpoints. Defaults to 0.0.
- Comfy dtype:
FLOAT
- Python dtype:
float
azimuth_batch_increment
- The amount in degrees added to the azimuth after each batch item, so that a single batch can cover several orientations. Defaults to 0.0.
- Comfy dtype:
FLOAT
- Python dtype:
float
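The way the two increment inputs step the camera across a batch can be sketched in plain Python. This mirrors the loop in the node's `encode` method shown in the source code section; the function name `view_angles` is hypothetical and exists only for illustration.

```python
def view_angles(elevation, azimuth, elevation_batch_increment,
                azimuth_batch_increment, batch_size):
    """Return the (elevation, azimuth) pair used for each batch item.

    Hypothetical helper: it reproduces only the angle bookkeeping of the
    node's loop, not the camera embedding itself.
    """
    angles = []
    for _ in range(batch_size):
        angles.append((elevation, azimuth))
        # The increments are applied after each item, so item 0 uses
        # the starting angles unchanged.
        elevation += elevation_batch_increment
        azimuth += azimuth_batch_increment
    return angles


# Four views at a fixed 10 degree elevation, a quarter turn apart.
angles = view_angles(10.0, 0.0, 0.0, 90.0, 4)
print(angles)
# [(10.0, 0.0), (10.0, 90.0), (10.0, 180.0), (10.0, 270.0)]
```

Note that the accumulated angle is not clamped in the loop, so later batch items can go beyond the -180.0 to 180.0 range that the input widgets accept for the starting values.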
Output types¶
positive
- Comfy dtype:
CONDITIONING
- The positive conditioning: the CLIP image embedding repeated across the batch and concatenated with the per-item camera embeddings, with the VAE-encoded input image attached as `concat_latent_image`.
- Python dtype:
List[List[Union[torch.Tensor, Dict[str, torch.Tensor]]]]
negative
- Comfy dtype:
CONDITIONING
- The negative conditioning: a zero tensor shaped like the CLIP image embedding, with an all-zero latent attached as `concat_latent_image`.
- Python dtype:
List[List[Union[torch.Tensor, Dict[str, torch.Tensor]]]]
latent
- Comfy dtype:
LATENT
- An empty (all-zero) latent of shape `[batch_size, 4, height // 8, width // 8]` for the sampler to denoise, returned under the `samples` key together with a `batch_index` list.
- Python dtype:
Dict[str, Union[torch.Tensor, List[int]]]
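As a quick check on the size of the `latent` output, the shape arithmetic from the node's `encode` method can be reproduced without any tensors. This is a minimal sketch; `latent_shape` is a hypothetical helper name, and the channel count of 4 and the downscale factor of 8 are taken from the source code below.

```python
def latent_shape(width, height, batch_size, channels=4, downscale=8):
    """Shape of the empty latent the node returns under the 'samples' key."""
    # Integer division matches the `height // 8, width // 8` in the node.
    return (batch_size, channels, height // downscale, width // downscale)


shape = latent_shape(width=256, height=256, batch_size=4)
batch_index = [0] * 4  # the node sets every batch_index entry to 0

print(shape)        # (4, 4, 32, 32)
print(batch_index)  # [0, 0, 0, 0]
```

Because of the integer division, a width or height that is not a multiple of 8 is rounded down in the latent, which is one reason the inputs use a step of 8.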
Usage tips¶
- Infra type:
GPU
- Common nodes: unknown
Source code¶
class StableZero123_Conditioning_Batched:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": { "clip_vision": ("CLIP_VISION",),
                              "init_image": ("IMAGE",),
                              "vae": ("VAE",),
                              "width": ("INT", {"default": 256, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
                              "height": ("INT", {"default": 256, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
                              "batch_size": ("INT", {"default": 1, "min": 1, "max": 4096}),
                              "elevation": ("FLOAT", {"default": 0.0, "min": -180.0, "max": 180.0, "step": 0.1, "round": False}),
                              "azimuth": ("FLOAT", {"default": 0.0, "min": -180.0, "max": 180.0, "step": 0.1, "round": False}),
                              "elevation_batch_increment": ("FLOAT", {"default": 0.0, "min": -180.0, "max": 180.0, "step": 0.1, "round": False}),
                              "azimuth_batch_increment": ("FLOAT", {"default": 0.0, "min": -180.0, "max": 180.0, "step": 0.1, "round": False}),
                             }}

    RETURN_TYPES = ("CONDITIONING", "CONDITIONING", "LATENT")
    RETURN_NAMES = ("positive", "negative", "latent")

    FUNCTION = "encode"

    CATEGORY = "conditioning/3d_models"

    def encode(self, clip_vision, init_image, vae, width, height, batch_size, elevation, azimuth, elevation_batch_increment, azimuth_batch_increment):
        output = clip_vision.encode_image(init_image)
        pooled = output.image_embeds.unsqueeze(0)
        pixels = comfy.utils.common_upscale(init_image.movedim(-1,1), width, height, "bilinear", "center").movedim(1,-1)
        encode_pixels = pixels[:,:,:,:3]
        t = vae.encode(encode_pixels)

        cam_embeds = []
        for i in range(batch_size):
            cam_embeds.append(camera_embeddings(elevation, azimuth))
            elevation += elevation_batch_increment
            azimuth += azimuth_batch_increment

        cam_embeds = torch.cat(cam_embeds, dim=0)
        cond = torch.cat([comfy.utils.repeat_to_batch_size(pooled, batch_size), cam_embeds], dim=-1)

        positive = [[cond, {"concat_latent_image": t}]]
        negative = [[torch.zeros_like(pooled), {"concat_latent_image": torch.zeros_like(t)}]]
        latent = torch.zeros([batch_size, 4, height // 8, width // 8])
        return (positive, negative, {"samples":latent, "batch_index": [0] * batch_size})
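
The `camera_embeddings` helper called in the loop is not included in the listing above. The sketch below is an assumption about what such a helper might compute, loosely following the Zero123 convention of describing a view by a polar-angle term, the sine and cosine of the azimuth, and one further term, which is held constant here. It uses only the standard library and returns a plain list rather than a tensor; the name `camera_embedding_sketch` and the exact formula are hypothetical and should be checked against the actual ComfyUI source.

```python
import math


def camera_embedding_sketch(elevation, azimuth):
    """Hypothetical stand-in for camera_embeddings; angles are in degrees.

    Assumption: four features per view. The real helper returns a tensor
    and may differ in detail.
    """
    return [
        # Polar angle (90 - elevation) relative to a reference view at 90.
        math.radians((90.0 - elevation) - 90.0),
        # Azimuth encoded as sine and cosine, so it wraps smoothly.
        math.sin(math.radians(azimuth)),
        math.cos(math.radians(azimuth)),
        # Constant fourth term (assumed).
        math.radians(90.0),
    ]


front = camera_embedding_sketch(0.0, 0.0)
side = camera_embedding_sketch(30.0, 90.0)
print(front)
print(side)
```

Under this assumption each view contributes four extra features, which `encode` appends to the CLIP image embedding along the last dimension. The sine and cosine encoding also means that azimuth values pushed past 180 degrees by the batch increment still map to a valid direction.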