ComfyDeploy: How does ComfyUI_OmniParser work in ComfyUI?

What is ComfyUI_OmniParser?

Try [OmniParser](https://github.com/microsoft/OmniParser) in ComfyUI. OmniParser is a simple screen parsing tool aimed at pure vision-based GUI agents.

How to install it in ComfyDeploy?

Head over to the machine page

  1. Click on the "Create a new machine" button
  2. Select "Edit build steps"
  3. Add a new step -> Custom Node
  4. Search for ComfyUI_OmniParser and select it
  5. Close the build step dialog and then click on the "Save" button to rebuild the machine

ComfyUI_OmniParser

Try OmniParser in ComfyUI. OmniParser is a simple screen parsing tool aimed at pure vision-based GUI agents.


1.Installation

In the ./ComfyUI/custom_nodes directory, run the following:

git clone https://github.com/smthemex/ComfyUI_OmniParser.git

2.Requirements

pip install -r requirements.txt
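
The two steps above can be combined into one small helper. This is a minimal sketch, not part of the original repository: the function name `install_omniparser` is hypothetical, and it assumes that `requirements.txt` sits at the root of the cloned repository and that `pip` belongs to the same Python environment ComfyUI runs in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clone ComfyUI_OmniParser into a ComfyUI install
# and install its Python requirements.
# Argument 1: path to the ComfyUI root (assumed default: ./ComfyUI).

install_omniparser() {
    local comfy_dir="${1:-./ComfyUI}"
    local nodes_dir="$comfy_dir/custom_nodes"
    local repo_dir="$nodes_dir/ComfyUI_OmniParser"

    # Stop early if this does not look like a ComfyUI install.
    if [ ! -d "$nodes_dir" ]; then
        echo "custom_nodes directory not found: $nodes_dir" >&2
        return 1
    fi

    # Clone only when the node is not already present.
    if [ ! -d "$repo_dir" ]; then
        git clone https://github.com/smthemex/ComfyUI_OmniParser.git "$repo_dir" || return 1
    fi

    # Install dependencies from inside the cloned repository
    # (assumes requirements.txt is at the repository root).
    (cd "$repo_dir" && pip install -r requirements.txt)
}
```

After sourcing the file, it could be run as `install_omniparser /path/to/ComfyUI`. If ComfyUI uses a virtual environment or an embedded Python, activate it or substitute that interpreter's `pip` first.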


3.Checkpoints

huggingface-OmniParser


4.Example


5.Citation

microsoft/OmniParser

@misc{lu2024omniparserpurevisionbased,
      title={OmniParser for Pure Vision Based GUI Agent}, 
      author={Yadong Lu and Jianwei Yang and Yelong Shen and Ahmed Awadallah},
      year={2024},
      eprint={2408.00203},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.00203}, 
}

Some code is from @aliencaocao.