What is ComfyUI-AutoLabel?

ComfyUI-AutoLabel is a custom node for ComfyUI that uses BLIP (Bootstrapping Language-Image Pre-training) to generate a detailed description of the main object in an image. An optional text prompt can be supplied to guide the caption.

How to install it in ComfyDeploy?

Head over to the machine page, then:

  1. Click the "Create a new machine" button.
  2. Select "Edit build steps".
  3. Add a new step -> Custom Node.
  4. Search for ComfyUI-AutoLabel and select it.
  5. Close the build step dialog, then click the "Save" button to rebuild the machine.


Features

  • Image to Text Description: Generate detailed descriptions of the main object in an image.
  • Customizable Prompts: Provide your own prompt to guide the description generation.
  • Flexible Inference Modes: Supports GPU, GPU with float16, and CPU inference modes.
  • Offline Mode: Option to download and use models offline.


Manual installation

  1. Clone the Repository: Clone this repository into your custom_nodes folder in ComfyUI.

    git clone custom_nodes/ComfyUI-AutoLabel

  2. Install Dependencies: Navigate to the cloned folder and install the required dependencies.

    cd custom_nodes/ComfyUI-AutoLabel
    pip install -r requirements.txt


Adding the Node

  1. Start ComfyUI.
  2. Add the AutoLabel node from the custom nodes list.
  3. Connect an image input and configure the parameters as needed.


Parameters

  • image (required): The input image tensor.
  • prompt (optional): A string to guide the description generation (default: "a photography of").
  • repo_id (optional): The Hugging Face model repository ID (default: "Salesforce/blip-image-captioning-base").
  • inference_mode (optional): The inference mode, one of "gpu_float16", "gpu", or "cpu" (default: "gpu").
  • get_model_online (optional): Boolean flag to download the model online if not already present (default: True).
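
To make the parameters above more concrete, here is a minimal sketch of how they could map onto a BLIP captioning call with the Hugging Face transformers library. It is not the node's actual code. The names resolve_inference_mode and describe_image are hypothetical, the function takes a PIL image rather than a ComfyUI image tensor, and treating get_model_online=False as "load from the local cache only" is an assumption.

```python
# Sketch only: the node's real implementation may differ.

# Assumed mapping from the node's inference_mode values to a
# (device, dtype name) pair, using torch naming conventions.
_MODES = {
    "gpu_float16": ("cuda", "float16"),
    "gpu": ("cuda", "float32"),
    "cpu": ("cpu", "float32"),
}


def resolve_inference_mode(inference_mode="gpu"):
    """Return (device, dtype_name) for an inference_mode string."""
    try:
        return _MODES[inference_mode]
    except KeyError:
        raise ValueError(f"unknown inference_mode: {inference_mode!r}") from None


def describe_image(image, prompt="a photography of",
                   repo_id="Salesforce/blip-image-captioning-base",
                   inference_mode="gpu", get_model_online=True):
    """Caption a PIL image with BLIP (hypothetical helper, not the node's API)."""
    # Heavy imports stay local so the mode helper works without them.
    import torch
    from transformers import BlipForConditionalGeneration, BlipProcessor

    device, dtype_name = resolve_inference_mode(inference_mode)
    dtype = getattr(torch, dtype_name)
    # Assumption: get_model_online=False means "use the local cache only".
    offline = not get_model_online

    processor = BlipProcessor.from_pretrained(repo_id, local_files_only=offline)
    model = BlipForConditionalGeneration.from_pretrained(
        repo_id, torch_dtype=dtype, local_files_only=offline
    ).to(device)

    # Conditional captioning: the prompt is the prefix the caption continues.
    inputs = processor(image, prompt, return_tensors="pt").to(device, dtype)
    output_ids = model.generate(**inputs)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling describe_image(img, inference_mode="cpu") with a PIL image img would run the base BLIP model on the CPU in float32. The "gpu_float16" mode halves the memory used by the weights but requires a CUDA device.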


