Nodes Browser

ComfyDeploy: How ComfyUI-Image-Captioner works in ComfyUI?

What is ComfyUI-Image-Captioner?

A ComfyUI extension for generating captions for your images. Runs on your own system, no external services used, no filter. Uses various VLMs with APIs to generate captions for images. You can give instructions or ask questions in natural language.

How to install it in ComfyDeploy?

Head over to the machine page

Click on the "Create a new machine" button
Select the Edit build steps
Add a new step -> Custom Node
Search for ComfyUI-Image-Captioner and select it
Close the build step dialig and then click on the "Save" button to rebuild the machine

ComfyUI ImageCaptioner

<div align="center"> <img src="assets/icon.png" style="width: 20%;" /> </div>

A ComfyUI extension for generating captions for your images. Runs on your own system, no external services used, no filter.

Uses various VLMs with APIs to generate captions for images. You can give instructions or ask questions in natural language.

Try asking for:

captions or long descriptions
whether a person or object is in the image, and how many
lists of keywords or tags
a description of the opposite of the image

workflow

Installation

git clone https://github.com/neverbiasu/ComfyUI-ImageCaptioner into your custom_nodes folder
- e.g. custom_nodes\ComfyUI-ImageCaptioner
Open a console/Command Prompt/Terminal etc
Change to the custom_nodes/ComfyUI-ImageCaptioner folder you just created
- e.g. cd C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-ImageCaptioner or wherever you have it installed
Run pip install -r requirements.txt

Usage

Add the node via image -> ImageCaptioner

Supports tagging and outputting multiple batched inputs.

image: The image you want to make captions.
api: The API of dashscope.
use_prompt: The prompt to drive the VLMs.

Requirements

U need to get the API of dashscope from the document

See also