Nodes Browser

ComfyDeploy: How ComfyUI_MiniCPM-V-2_6-int4 works in ComfyUI?

What is ComfyUI_MiniCPM-V-2_6-int4?

This is an implementation of [a/MiniCPM-V-2_6-int4](https://github.com/OpenBMB/MiniCPM-V) by [a/ComfyUI](https://github.com/comfyanonymous/ComfyUI), including support for text-based queries, video queries, single-image queries, and multi-image queries to generate captions or responses.

How to install it in ComfyDeploy?

Head over to the machine page

  1. Click on the "Create a new machine" button
  2. Select the Edit build steps
  3. Add a new step -> Custom Node
  4. Search for ComfyUI_MiniCPM-V-2_6-int4 and select it
  5. Close the build step dialig and then click on the "Save" button to rebuild the machine

ComfyUI_MiniCPM-V-2_6-int4

This is an implementation of MiniCPM-V-2_6-int4 by ComfyUI, including support for text-based queries, video queries, single-image queries, and multi-image queries to generate captions or responses.


Recent Updates

  • Added keep_model_loaded parameter

By default, this parameter is set to False, which indicates that the model will be unloaded from GPU memory after each prediction is made.

However, if set to True, the model will remain loaded in GPU memory. This is particularly useful when multiple predictions with the same model are needed, eliminating the need to reload it between uses.

  • Added seed parameter

This parameter enables the setting of a random seed for the purpose of ensuring reproducibility in results.


Basic Workflow

  • Text-based Query: Users can submit textual queries to request information or generate descriptions. For instance, a user might input a description like "What is the meaning of life?"

<span style="color: green;">Chat_with_text_workflow_legacy preview</span> Chat_with_text_workflow_legacy preview <span style="color: green;">Chat_with_text_workflow_polished preview</span> Chat_with_text_workflow_polished preview

  • Video Query: When a user uploads a video, the system can analyze the content and generate a detailed caption for each frame or a summary of the entire video. For example, "Generate a caption for the given video."

<span style="color: green;">Chat_with_video_workflow_legacy preview</span> Chat_with_video_workflow_legacy preview <span style="color: green;">Chat_with_video_workflow_polished preview</span> Chat_with_video_workflow_polished preview

  • Single-Image Query: This workflow supports generating a caption for an individual image. A user could upload a photo and ask, "What does this image show?" resulting in a caption such as "A majestic lion pride relaxing on the savannah."

<span style="color: green;">Chat_with_single_image_workflow_legacy preview</span> Chat_with_single_image_workflow_legacy preview <span style="color: green;">Chat_with_single_image_workflow_polished preview</span> Chat_with_single_image_workflow_polished preview

  • Multi-Image Query: For multiple images, the system can provide a collective description or a narrative that ties the images together. For example, "Create a story from the following series of images: one of a couple at a beach, another at a wedding ceremony, and the last one at a baby's christening."

<span style="color: green;">Chat_with_multiple_images_workflow_legacy preview</span> Chat_with_multiple_images_workflow_legacy preview <span style="color: green;">Chat_with_multiple_images_workflow_polished preview</span> Chat_with_multiple_images_workflow_polished preview

Installation

  • Install from ComfyUI Manager (search for minicpm)

  • Download or git clone this repository into the ComfyUI\custom_nodes\ directory and run:

pip install -r requirements.txt

Download Models

All the models will be downloaded automatically when running the workflow if they are not found in the ComfyUI\models\prompt_generator\ directory.