Nodes Browser

ComfyDeploy: How ComfyUI-FapMixPlus works in ComfyUI?

What is ComfyUI-FapMixPlus?

This is an audio processing script that applies soft limiting, optional loudness normalization, and optional slicing for transcription. It can also produce stereo-mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity.

How to install it in ComfyDeploy?

Head over to the machine page

  1. Click on the "Create a new machine" button
  2. Select the Edit build steps
  3. Add a new step -> Custom Node
  4. Search for ComfyUI-FapMixPlus and select it
  5. Close the build step dialig and then click on the "Save" button to rebuild the machine

preFapMix.py

preFapMix.py is an audio processing script that applies soft limiting, optional loudness normalization, and optional slicing for transcription. It can also produce stereo-mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity.

Features

  1. Soft Limiting: Reduces loud peaks in audio to prevent clipping.
  2. Optional Loudness Normalization: Adjusts audio levels to achieve consistent loudness.
  3. Conditional Slicing and Transcription: Options to slice and transcribe files in the left or right channels separately, or both channels together.
  4. Stereo Mixing with Optional Tone Appending: Optionally appends a custom tone (tones.wav) to the end of stereo-mixed audio.
  5. Organized Output Structure: Outputs are saved in structured folders with sanitized filenames.
  6. Timestamp Preservation: Maintains the original timestamps for all output files.

Installation Requirements

  • Python 3.x
  • Pydub for audio processing
    pip install pydub
    
  • FFmpeg: Required by Pydub for handling audio files
    sudo apt-get install ffmpeg
    
  • fap: The transcription tool, assumed to be installed and accessible via the command line.

Usage

Command Line

Run the script from the command line with the following arguments:

python preFapMix.py --input-dir <input_directory> --output-dir <output_directory> [options]

Options

  • --input-dir: Directory containing input audio files (required).
  • --output-dir: Directory where processed files will be saved (required).
  • --transcribe: Enables transcription for both left and right channels. Implies both --transcribe_left and --transcribe_right.
  • --transcribe_left: Enables transcription only for the left channel.
  • --transcribe_right: Enables transcription only for the right channel.
  • --normalize: Enables loudness normalization on the audio.
  • --tones: Appends the contents of tones.wav to the end of each stereo output file.
  • --num-workers: Specifies the number of workers to use for transcription (default is 2).

Workflow

  1. Pre-Processing:

    • Applies a soft limiter at -6 dB to control peaks.
    • If --normalize is enabled, normalizes loudness to -23 LUFS for consistency.
  2. Conditional Slicing and Transcription:

    • If --transcribe is enabled, slices audio files to smaller segments and transcribes each segment, generating .lab files.
    • With --transcribe_left or --transcribe_right, transcribes only files in the left or right folders, respectively.
  3. Stereo Mixing with Optional Tone Appending:

    • Combines left and right channels into a stereo file.
    • If --tones is enabled, appends tones.wav to the end of each stereo file.
  4. File Naming and Organization:

    • Names each sliced audio file with its original numeric name, followed by the first 12 words (or fewer) from its .lab file.
    • All filenames are sanitized for UTF-8 compliance.

Output Structure

The output structure is organized within <output_directory>/run_<timestamp> as follows:

  • normalized/: Contains normalized versions of the input audio files.
  • left/ and right/: Contains sliced (and optionally transcribed) audio files in respective left and right channel folders.
  • stereo/: Contains stereo-mixed files with optional tone appended to the end.
  • transcribed-and-sliced/:
    • Root: Contains combined .lab files for each original input.
    • left/ and right/: Contains subfolders of sliced audio files and corresponding .lab files.

Example Command

python preFapMix.py --input-dir ./my_audio_files --output-dir ./processed_audio --transcribe --normalize --tones --num-workers 3

This command will:

  1. Process the audio files in ./my_audio_files with soft limiting and loudness normalization.
  2. Slice and transcribe each file in the left and right channels.
  3. Mix each pair of left and right channels into a stereo file and append tones.wav to the end of each stereo output.

fapMixPlus

This project provides an end-to-end audio processing pipeline to automate the extraction, separation, slicing, transcription, and renaming of audio files. The resulting files are saved in a structured output directory with cleaned filenames and optional ZIP archives for easier distribution or storage.

Features

  • Download Audio: Fetches audio files from a URL or uses local input files.
  • Convert to WAV: Converts audio files to WAV format.
  • Separate Vocals: Isolates vocal tracks from the WAV files.
  • Slice Audio: Segments the separated vocal track for transcription.
  • Transcribe: Generates transcriptions from audio slices.
  • Sanitize and Rename Files: Creates sanitized filenames with a numerical prefix, limited to 128 characters.
  • Generate ZIP Files: Compresses processed files into ZIP archives for easy storage and distribution.

Prerequisites

  • Python 3.x
  • Install required Python packages:
    pip install yt-dlp
    
  • Fish Audio Preprocessor (fap) should be installed and available in the PATH.

Installing the Fish Audio Preprocessor (fap)

  1. Clone the Fish Audio Preprocessor repository:

    git clone https://github.com/fishaudio/audio-preprocess.git
    
  2. Navigate to the repository directory:

    cd audio-preprocess
    
  3. Install the package from the cloned repository:

    pip install -e .
    

This step installs fap and makes it accessible as a command-line tool, which is essential for fapMixPlus.py to function correctly.

  1. Verify the installation by checking the version:
    fap --version
    

Usage

Command-line Arguments

| Argument | Description | |-----------------|----------------------------------------------------------------------| | --url | URL of the audio source (YouTube or other supported link). | | --output_dir | Directory for saving all outputs. Default is output/. | | input_dir | Path to a local directory of input files (optional if --url used). |

Example Command

python fapMixPlus.py --url https://youtu.be/example_video --output_dir my_output

This command will download the audio from the URL, process it, and save the results in the my_output folder.

Output Structure

The output directory will contain a timestamped folder with the following structure:

output_<timestamp>/
├── wav_conversion/            # WAV-converted audio files
├── separation_output/         # Separated vocal track files
├── slicing_output/            # Sliced segments from separated audio
├── final_output/              # Final, sanitized, and renamed .wav and .lab files
├── zip_files/                 # Compressed ZIP archives of processed files

ZIP File Details

In addition to organizing output files by processing stages, fapMixPlus can generate ZIP archives for convenience. Each ZIP file in the zip_files/ directory will contain a set of processed audio and transcription files, with names based on their content and timestamp. The ZIP filenames will follow this format:

output_<timestamp>.zip

Each ZIP file will include:

  • The WAV and .lab files from final_output/, with sanitized filenames.
  • These ZIP files are ideal for transferring or archiving processed audio.

Functionality Details

  1. Download Audio: Downloads audio from a URL, saving it in .m4a format.
  2. WAV Conversion: Converts audio to WAV using fap to-wav.
  3. Separation: Separates vocals from the WAV files using fap separate.
  4. Slicing: Segments the separated vocal track into smaller audio slices.
  5. Transcription: Uses fap transcribe to transcribe each slice.
  6. Sanitization and Renaming:
    • Extracts the first 10 words from each .lab file.
    • Replaces spaces with underscores, removes special characters, and limits to 128 characters.
    • Applies a numerical prefix if no valid content is in the .lab file.
  7. ZIP File Creation:
    • After processing, the final .wav and .lab files are compressed into ZIP archives in zip_files/ for each session, making it easy to organize or share the output.

Example File Names in Final Output

Final output files in final_output will be structured like:

  • 0001_Hello_this_is_a_sample_transcription.wav
  • 0001_Hello_this_is_a_sample_transcription.lab

Files without usable .lab content will retain the numerical prefix, e.g., 0002.wav and 0002.lab.