ComfyDeploy: How does the GeminiOllama ComfyUI Extension work in ComfyUI?
What is GeminiOllama ComfyUI Extension?
This extension integrates Google's Gemini API and Ollama into ComfyUI, allowing users to leverage these powerful language models directly within their ComfyUI workflows.
How to install it in ComfyDeploy?
Head over to the machine page
- Click on the "Create a new machine" button
- Select the Edit build steps
- Add a new step -> Custom Node
- Search for GeminiOllama ComfyUI Extension and select it
- Close the build step dialog and then click on the "Save" button to rebuild the machine
ComfyUI GeminiOllama Extension
This extension integrates Google's Gemini API, OpenAI (ChatGPT), Anthropic's Claude, Ollama, Qwen, and various image processing tools into ComfyUI, allowing users to leverage these powerful models and features directly within their ComfyUI workflows.
Features
- Support for multiple AI APIs:
- Google Gemini
- OpenAI (ChatGPT)
- Anthropic Claude
- Ollama
- Alibaba Qwen
- Text and image input capabilities
- Streaming option for real-time responses
- FLUX Resolution tools for image sizing
- ComfyUI Styler for advanced styling options
- Raster to Vector (SVG) conversion
- Text splitting and processing
- Easy integration with ComfyUI workflows
Nodes
1. Gemini API
- Models:
- gemini-2.0-pro
- gemini-2.0-flash
- gemini-2.0-flash-lite-preview-02-05
- gemini-2.0-pro-experimental-02-05
- gemini-1.5-pro
- gemini-1.5-flash-8b
- gemini-1.5-pro-experimental
- learnlm-1.5-pro-experimental
2. OpenAI API
- Models:
- gpt-4o-mini
- gpt-3.5-turbo
- gpt-3.5-turbo-0125
- gpt-3.5-turbo-16k
- gpt-3.5-turbo-1106
- o1-preview/mini
- deepseek-ai/deepseek-r1
3. Claude API
Access Anthropic's Claude models for advanced language tasks:
- Text input field for prompts
- Model selection:
- claude-3-opus
- claude-3-sonnet
- claude-3-haiku
- Temperature control
- System prompt configuration
- Streaming capability
4. Ollama API
Integrate local language models running via Ollama:
- Text input field for prompts
- Dropdown for selecting Ollama models
- Customizable model options
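As a rough illustration of what talking to a local Ollama server involves, the sketch below builds a request for Ollama's `/api/generate` REST endpoint using only the standard library. It is not taken from this extension, which may well use a different client or endpoint. The helper names `build_ollama_request` and `send`, the model name `llama3`, and the default temperature are assumptions made for the example.

```python
import json
import urllib.request


def build_ollama_request(base_url, model, prompt, temperature=0.7):
    """Build (but do not send) a POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON reply instead of streamed chunks
        "options": {"temperature": temperature},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]


# "llama3" is only an example; use any model you have pulled locally.
req = build_ollama_request("http://localhost:11434", "llama3", "Describe a misty forest.")
print(req.full_url)
```

Calling `send(req)` only works when an Ollama server is reachable at the given URL, so the example stops at building the request.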
5. Qwen API
Access Alibaba's Qwen language models:
- Text input field for prompts
- Model selection:
- qwen-turbo
- qwen-plus
- qwen-max
- Temperature control
- Streaming capability
6. FLUX Resolutions
Provides advanced image resolution and sizing options:
- Predefined resolution presets (e.g., 768x1024, 1024x768, 1152x768)
- Custom sizing parameters:
- size_selected
- multiply_factor
- manual_width
- manual_height
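The way these four parameters could combine into a final size can be sketched as follows. The README does not spell out the exact rules, so the behaviour here is an assumption: non-zero manual values replace the preset, and `multiply_factor` scales both sides. The function name `resolve_size` is hypothetical.

```python
from typing import Tuple


def resolve_size(size_selected: str, multiply_factor: float = 1.0,
                 manual_width: int = 0, manual_height: int = 0) -> Tuple[int, int]:
    """Return (width, height) from a preset string or a manual override."""
    if manual_width > 0 and manual_height > 0:
        # Assumption: non-zero manual values take priority over the preset.
        width, height = manual_width, manual_height
    else:
        # Presets are written as "<width>x<height>", e.g. "768x1024".
        w_text, h_text = size_selected.lower().split("x")
        width, height = int(w_text), int(h_text)
    # Assumption: the factor scales both sides equally.
    return int(width * multiply_factor), int(height * multiply_factor)


print(resolve_size("768x1024"))
print(resolve_size("1024x768", multiply_factor=2))
print(resolve_size("1152x768", manual_width=640, manual_height=480))
```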
7. ComfyUI Styler
Extensive styling options for various creative needs:
- General Arts: A broad spectrum of traditional and modern art styles
- Anime: Bring your designs to life with anime-inspired aesthetics
- Artist: Channel the influence of world-class artists
- Camera: Fine-tune focal lengths, angles, and setups
- Camera Angles: Add dynamic perspectives with a range of angles
- Aesthetic: Define unique artistic vibes and styles
- Color Grading: Achieve rich cinematic tones and palettes
- Movies: Get inspired by different cinematic worlds
- Digital Artform: From vector art to abstract digital styles
- Body Type: Customize different body shapes and dimensions
- Reactions: Capture authentic emotional expressions
- Feelings: Set the emotional tone for each creation
- Photographers: Infuse the style of renowned photographers
- Hair Style: Wide variety of hair designs for your characters
- Architecture Style: Classical to modern architectural themes
- Architect: Designs inspired by notable architects
- Vehicle: Add cars, planes, or futuristic transportation
- Poses: Customize dynamic body positions
- Science: Add futuristic, scientific elements
- Clothing State: Define the wear and tear of clothing
- Clothing Style: Wide range of fashion styles
- Composition: Control the layout and arrangement of elements
- Depth: Add dimensionality and focus to your scenes
- Environment: From nature to urban settings, create rich backdrops
- Face: Customize facial expressions and emotions
- Fantasy: Bring magical and surreal elements into your visuals
- Filter: Apply unique visual filters for artistic effects
- Gothic: Channel dark, mysterious, and dramatic themes
- Halloween: Get spooky with Halloween-inspired designs
- Line Art: Incorporate clean, bold lines into your creations
- Lighting: Set the mood with dramatic lighting effects
- Milehigh: Capture the essence of aviation and travel
- Mood: Set the emotional tone and atmosphere
- Movie Poster: Create dramatic, story-driven poster designs
- Punk: Channel bold, rebellious aesthetics
- Travel Poster: Design vintage travel posters with global vibes
8. Raster to Vector (SVG) and Save SVG
Convert raster images to vector graphics and save them:
Raster to Vector node parameters:
- colormode
- filter_speckle
- corner_threshold
- ... (and more)
Save SVG node options:
- filename_prefix
- overwrite_existing
9. TextSplitByDelimiter
Split text based on specified delimiters:
- Input text field
- Delimiter options:
- split_regex
- split_every
- split_count
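The three delimiter options are not described in detail here, so the following is only one plausible reading, written as a self-contained sketch. It assumes `split_regex` is a regular expression, `split_every` groups that many pieces into one chunk, and `split_count` caps how many chunks are returned (0 meaning no cap). The function name `split_text` is hypothetical.

```python
import re
from typing import List


def split_text(text: str, split_regex: str = ",",
               split_every: int = 1, split_count: int = 0) -> List[str]:
    """Split text on a regex, optionally regrouping and capping the result."""
    # Split on the pattern, trim whitespace, and drop empty pieces.
    pieces = [p.strip() for p in re.split(split_regex, text) if p.strip()]
    # Assumption: split_every=N joins each run of N pieces into one chunk.
    step = max(1, split_every)
    chunks = [" ".join(pieces[i:i + step]) for i in range(0, len(pieces), step)]
    # Assumption: split_count > 0 limits the number of chunks returned.
    return chunks[:split_count] if split_count > 0 else chunks


print(split_text("a, b, c, d"))
print(split_text("a, b, c, d", split_every=2))
print(split_text("a;b|c", split_regex=r"[;|]", split_count=2))
```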
Installation
- Clone this repository into your ComfyUI's custom_nodes directory:
  cd /path/to/ComfyUI/custom_nodes
  git clone https://github.com/yourusername/GeminiOllama.git
- Install the required dependencies:
  pip install google-generativeai openai anthropic requests vtracer
Configuration
API Key Setup
Edit config.json with the API keys for the providers you want to use:
{
"GEMINI_API_KEY": "your_gemini_api_key",
"OPENAI_API_KEY": "your_openai_api_key",
"ANTHROPIC_API_KEY": "your_claude_api_key",
"OLLAMA_URL": "http://localhost:11434",
"QWEN_API_KEY": "your_qwen_api_key"
}
- Obtain API keys from:
- Gemini: Google AI Studio
- OpenAI: OpenAI Platform
- Claude: Anthropic Console
- Qwen: DashScope Console
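To make the config.json layout above concrete, here is a minimal sketch of reading such a file. It is not the extension's own loader. The helper names (`load_config`, `read_api_keys`, `read_ollama_url`) are hypothetical, and treating the `your_...` placeholders as "not configured" is an assumption made for the example.

```python
import json
from pathlib import Path

DEFAULT_OLLAMA_URL = "http://localhost:11434"


def load_config(path):
    """Return the parsed config file, or an empty dict if it does not exist."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def read_api_keys(config):
    """Keep only *_API_KEY entries that hold a real value."""
    return {
        name: value
        for name, value in config.items()
        # Assumption: empty values and "your_..." placeholders mean "unset".
        if name.endswith("_API_KEY") and value and not value.startswith("your_")
    }


def read_ollama_url(config):
    """Fall back to the local default when OLLAMA_URL is missing or empty."""
    return config.get("OLLAMA_URL") or DEFAULT_OLLAMA_URL


example = json.loads('{"GEMINI_API_KEY": "abc123", "OPENAI_API_KEY": "your_openai_api_key"}')
print(read_api_keys(example))
print(read_ollama_url(example))
```

Skipping providers without a real key lets a single config file cover only the services you actually use.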
Usage
After installation and configuration, new nodes for each API will be available in ComfyUI.
Input Parameters
- api_choice: Choose between "Gemini", "OpenAI", "Claude", and "Ollama"
- prompt: The text prompt for the AI model
- model_selection: Select the specific model for the chosen API
- temperature: Control response randomness (OpenAI and Claude)
- system_message: Set system behavior (OpenAI and Claude)
- stream: Enable or disable streaming responses
- image (optional): Input image for vision-based tasks
Output
- text: The generated response from the chosen AI model
Main Functions
- get_api_keys(): Retrieves API keys from the config file
- get_ollama_url(): Gets the Ollama URL from the config file
- generate_content(): Main function to generate content based on the chosen API and parameters
- generate_gemini_content(): Handles content generation for Gemini API
- generate_openai_content(): Manages content generation for OpenAI API
- generate_claude_content(): Handles content generation for Claude API
- generate_ollama_content(): Manages content generation for Ollama API
- tensor_to_image(): Converts a tensor to a PIL Image for vision-based tasks
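The functions listed above suggest a dispatcher that routes each request to a per-provider handler based on `api_choice`. The sketch below shows that pattern only. The handlers are stand-in stubs that make no network calls, and `route_request`, `HANDLERS`, and `_stub` are hypothetical names rather than the extension's code.

```python
def _stub(provider):
    """Create a placeholder handler; a real one would call the provider's API."""
    def handler(prompt, model, **options):
        return f"[{provider}:{model}] {prompt}"
    return handler


# Keys mirror the api_choice values documented under Input Parameters.
HANDLERS = {
    "Gemini": _stub("gemini"),
    "OpenAI": _stub("openai"),
    "Claude": _stub("claude"),
    "Ollama": _stub("ollama"),
}


def route_request(api_choice, prompt, model_selection, **options):
    """Look up the handler for api_choice and forward the request to it."""
    try:
        handler = HANDLERS[api_choice]
    except KeyError:
        raise ValueError(f"Unsupported api_choice: {api_choice!r}") from None
    return handler(prompt, model_selection, **options)


print(route_request("Claude", "Write a haiku about rain.", "claude-3-haiku", temperature=0.5))
```

A lookup table keeps the dispatcher short and makes adding a provider a one-line change.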
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.