ComfyDeploy: How does ComfyUI-FapMixPlus work in ComfyUI?
What is ComfyUI-FapMixPlus?
This is an audio processing script that applies soft limiting, optional loudness normalization, and optional slicing for transcription. It can also produce stereo-mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity.
How to install it in ComfyDeploy?
- Head over to the machine page
- Click on the "Create a new machine" button
- Select the Edit build steps
- Add a new step -> Custom Node
- Search for ComfyUI-FapMixPlus and select it
- Close the build step dialog and then click on the "Save" button to rebuild the machine
preFapMix.py
preFapMix.py is an audio processing script that applies soft limiting, optional loudness normalization, and optional slicing for transcription. It can also produce stereo-mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity.
Features
- Soft Limiting: Reduces loud peaks in audio to prevent clipping.
- Optional Loudness Normalization: Adjusts audio levels to achieve consistent loudness.
- Conditional Slicing and Transcription: Options to slice and transcribe files in the left or right channels separately, or both channels together.
- Stereo Mixing with Optional Tone Appending: Optionally appends a custom tone (tones.wav) to the end of stereo-mixed audio.
- Organized Output Structure: Outputs are saved in structured folders with sanitized filenames.
- Timestamp Preservation: Maintains the original timestamps for all output files.
Installation Requirements
- Python 3.x
- Pydub for audio processing
pip install pydub
- FFmpeg: Required by Pydub for handling audio files
sudo apt-get install ffmpeg
- fap: The transcription tool, assumed to be installed and accessible via the command line.
Usage
Command Line
Run the script from the command line with the following arguments:
python preFapMix.py --input-dir <input_directory> --output-dir <output_directory> [options]
Options
- --input-dir: Directory containing input audio files (required).
- --output-dir: Directory where processed files will be saved (required).
- --transcribe: Enables transcription for both left and right channels. Implies both --transcribe_left and --transcribe_right.
- --transcribe_left: Enables transcription only for the left channel.
- --transcribe_right: Enables transcription only for the right channel.
- --normalize: Enables loudness normalization on the audio.
- --tones: Appends the contents of tones.wav to the end of each stereo output file.
- --num-workers: Specifies the number of workers to use for transcription (default is 2).
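The options above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual parser: the flag names and the default of 2 workers come from the list above, while the helper names `build_parser` and `parse_options` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented preFapMix.py options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="preFapMix.py")
    parser.add_argument("--input-dir", required=True,
                        help="Directory containing input audio files")
    parser.add_argument("--output-dir", required=True,
                        help="Directory where processed files will be saved")
    parser.add_argument("--transcribe", action="store_true",
                        help="Transcribe both left and right channels")
    parser.add_argument("--transcribe_left", action="store_true",
                        help="Transcribe only the left channel")
    parser.add_argument("--transcribe_right", action="store_true",
                        help="Transcribe only the right channel")
    parser.add_argument("--normalize", action="store_true",
                        help="Enable loudness normalization")
    parser.add_argument("--tones", action="store_true",
                        help="Append tones.wav to each stereo output file")
    parser.add_argument("--num-workers", type=int, default=2,
                        help="Number of transcription workers")
    return parser


def parse_options(argv):
    """Parse argv and apply the rule that --transcribe implies both channels."""
    args = build_parser().parse_args(argv)
    if args.transcribe:
        args.transcribe_left = True
        args.transcribe_right = True
    return args
```

Calling `parse_options(["--input-dir", "in", "--output-dir", "out", "--transcribe"])` yields a namespace with both per-channel flags enabled.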
Workflow
- Pre-Processing:
  - Applies a soft limiter at -6 dB to control peaks.
  - If --normalize is enabled, normalizes loudness to -23 LUFS for consistency.
- Conditional Slicing and Transcription:
  - If --transcribe is enabled, slices audio files into smaller segments and transcribes each segment, generating .lab files.
  - With --transcribe_left or --transcribe_right, transcribes only files in the left or right folders, respectively.
- Stereo Mixing with Optional Tone Appending:
  - Combines left and right channels into a stereo file.
  - If --tones is enabled, appends tones.wav to the end of each stereo file.
- File Naming and Organization:
  - Names each sliced audio file with its original numeric name, followed by the first 12 words (or fewer) from its .lab file.
  - All filenames are sanitized for UTF-8 compliance.
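To illustrate what soft limiting at -6 dB means, here is a minimal sketch that works on float samples in the range -1.0 to 1.0. It is not the script's implementation, which uses Pydub and is not documented here. The sketch assumes that -6 dB is the threshold above which peaks are compressed, and it uses a tanh curve as one common choice of soft knee. The names `db_to_linear` and `soft_limit` are hypothetical.

```python
import math


def db_to_linear(db: float) -> float:
    """Convert a level in dBFS to a linear amplitude (0 dBFS == 1.0)."""
    return 10.0 ** (db / 20.0)


def soft_limit(samples, threshold_db=-6.0):
    """Pass samples below the threshold unchanged and smoothly compress
    anything above it so the output never exceeds full scale (1.0).

    Assumption: the curve is threshold + headroom * tanh(excess), which is
    continuous at the threshold and approaches 1.0 asymptotically.
    """
    threshold = db_to_linear(threshold_db)
    headroom = 1.0 - threshold
    limited = []
    for x in samples:
        magnitude = abs(x)
        if magnitude <= threshold:
            limited.append(x)
        else:
            excess = (magnitude - threshold) / headroom
            shaped = threshold + headroom * math.tanh(excess)
            limited.append(math.copysign(shaped, x))
    return limited
```

A hard limiter would simply clamp every sample at the threshold; the soft curve trades a little extra level above the threshold for less audible distortion on peaks.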
Output Structure
The output structure is organized within <output_directory>/run_<timestamp> as follows:
- normalized/: Contains normalized versions of the input audio files.
- left/ and right/: Contain sliced (and optionally transcribed) audio files in respective left and right channel folders.
- stereo/: Contains stereo-mixed files with optional tone appended to the end.
- transcribed-and-sliced/:
  - Root: Contains combined .lab files for each original input.
  - left/ and right/: Contain subfolders of sliced audio files and corresponding .lab files.
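The layout above can be sketched with `pathlib`. The folder names are taken from the list; the timestamp format and the helper name `create_run_dirs` are assumptions made for illustration, since the exact format used by the script is not stated here.

```python
from datetime import datetime
from pathlib import Path

# Folder names from the documented output structure.
SUBFOLDERS = (
    "normalized",
    "left",
    "right",
    "stereo",
    "transcribed-and-sliced/left",
    "transcribed-and-sliced/right",
)


def create_run_dirs(output_dir, timestamp=None):
    """Create <output_dir>/run_<timestamp>/ with the documented subfolders."""
    if timestamp is None:
        # Assumed format; the real script may format the timestamp differently.
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    run_dir = Path(output_dir) / f"run_{timestamp}"
    for sub in SUBFOLDERS:
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir
```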
Example Command
python preFapMix.py --input-dir ./my_audio_files --output-dir ./processed_audio --transcribe --normalize --tones --num-workers 3
This command will:
- Process the audio files in ./my_audio_files with soft limiting and loudness normalization.
- Slice and transcribe each file in the left and right channels.
- Mix each pair of left and right channels into a stereo file and append tones.wav to the end of each stereo output.
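The stereo mixing step can be pictured as interleaving two mono sample sequences and then appending the tone. The sketch below uses plain Python lists instead of Pydub so it stays self-contained. It assumes that the shorter channel is padded with silence and that a mono tone is written to both channels; neither behavior is confirmed by this README, and `mix_to_stereo` is a hypothetical name.

```python
def mix_to_stereo(left, right, tone=None):
    """Return interleaved stereo samples [L0, R0, L1, R1, ...].

    Assumptions: the shorter channel is padded with zeros (silence), and an
    optional mono tone is appended identically to both channels.
    """
    length = max(len(left), len(right))
    left = list(left) + [0] * (length - len(left))
    right = list(right) + [0] * (length - len(right))
    if tone:
        left += list(tone)
        right += list(tone)
    stereo = []
    for left_sample, right_sample in zip(left, right):
        stereo.extend((left_sample, right_sample))
    return stereo
```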
fapMixPlus
This project provides an end-to-end audio processing pipeline to automate the extraction, separation, slicing, transcription, and renaming of audio files. The resulting files are saved in a structured output directory with cleaned filenames and optional ZIP archives for easier distribution or storage.
Features
- Download Audio: Fetches audio files from a URL or uses local input files.
- Convert to WAV: Converts audio files to WAV format.
- Separate Vocals: Isolates vocal tracks from the WAV files.
- Slice Audio: Segments the separated vocal track for transcription.
- Transcribe: Generates transcriptions from audio slices.
- Sanitize and Rename Files: Creates sanitized filenames with a numerical prefix, limited to 128 characters.
- Generate ZIP Files: Compresses processed files into ZIP archives for easy storage and distribution.
Prerequisites
- Python 3.x
- Install required Python packages:
pip install yt-dlp
- Fish Audio Preprocessor (fap) should be installed and available in the PATH.
Installing the Fish Audio Preprocessor (fap)
- Clone the Fish Audio Preprocessor repository:
git clone https://github.com/fishaudio/audio-preprocess.git
- Navigate to the repository directory:
cd audio-preprocess
- Install the package from the cloned repository:
pip install -e .
This step installs fap and makes it accessible as a command-line tool, which is essential for fapMixPlus.py to function correctly.
- Verify the installation by checking the version:
fap --version
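The same check can be done from Python before starting a long run. This is a small sketch using the standard library's `shutil.which`; the helper name `fap_available` is hypothetical and is not part of fapMixPlus.py.

```python
import shutil


def fap_available(command: str = "fap") -> bool:
    """Return True if the given command can be found on the PATH."""
    return shutil.which(command) is not None
```

Calling `fap_available()` up front lets a wrapper script fail early with a clear message instead of partway through processing.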
Usage
Command-line Arguments
| Argument | Description |
|--------------|------------------------------------------------------------------------|
| --url | URL of the audio source (YouTube or other supported link). |
| --output_dir | Directory for saving all outputs. Default is output/. |
| input_dir | Path to a local directory of input files (optional if --url is used). |
Example Command
python fapMixPlus.py --url https://youtu.be/example_video --output_dir my_output
This command will download the audio from the URL, process it, and save the results in the my_output folder.
Output Structure
The output directory will contain a timestamped folder with the following structure:
output_<timestamp>/
├── wav_conversion/ # WAV-converted audio files
├── separation_output/ # Separated vocal track files
├── slicing_output/ # Sliced segments from separated audio
├── final_output/ # Final, sanitized, and renamed .wav and .lab files
├── zip_files/ # Compressed ZIP archives of processed files
ZIP File Details
In addition to organizing output files by processing stages, fapMixPlus can generate ZIP archives for convenience. Each ZIP file in the zip_files/ directory will contain a set of processed audio and transcription files, with names based on their content and timestamp. The ZIP filenames will follow this format:
output_<timestamp>.zip
Each ZIP file will include:
- The WAV and .lab files from final_output/, with sanitized filenames.
- These ZIP files are ideal for transferring or archiving processed audio.
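As a rough illustration of the archive step, the sketch below collects the .wav and .lab files from a final output folder into output_<timestamp>.zip using the standard `zipfile` module. The function name `zip_final_output`, its parameters, and the choice of deflate compression are assumptions, not the project's actual code.

```python
import zipfile
from pathlib import Path


def zip_final_output(final_dir, zip_dir, timestamp):
    """Write the .wav and .lab files in final_dir to zip_dir/output_<timestamp>.zip."""
    final_dir = Path(final_dir)
    zip_dir = Path(zip_dir)
    zip_dir.mkdir(parents=True, exist_ok=True)
    zip_path = zip_dir / f"output_{timestamp}.zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(final_dir.iterdir()):
            # Only the final audio and transcription files are archived.
            if path.is_file() and path.suffix.lower() in {".wav", ".lab"}:
                archive.write(path, arcname=path.name)
    return zip_path
```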
Functionality Details
- Download Audio: Downloads audio from a URL, saving it in .m4a format.
- WAV Conversion: Converts audio to WAV using fap to-wav.
- Separation: Separates vocals from the WAV files using fap separate.
- Slicing: Segments the separated vocal track into smaller audio slices.
- Transcription: Uses fap transcribe to transcribe each slice.
- Sanitization and Renaming:
  - Extracts the first 10 words from each .lab file.
  - Replaces spaces with underscores, removes special characters, and limits to 128 characters.
  - Uses only the numerical prefix if the .lab file has no valid content.
- ZIP File Creation:
  - After processing, the final .wav and .lab files are compressed into ZIP archives in zip_files/ for each session, making it easy to organize or share the output.
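The sanitization and renaming rules can be sketched as a single function. The 10-word limit, the underscore replacement, the 128-character cap, and the four-digit prefix follow the description and example file names in this README. Treating "special characters" as anything outside ASCII letters, digits, and underscores, and applying the cap to the whole stem including the prefix, are assumptions; `build_name` is a hypothetical name.

```python
import re


def build_name(index, lab_text, max_words=10, max_length=128):
    """Build a sanitized file stem such as 0001_Hello_this_is_a_sample."""
    prefix = f"{index:04d}"
    words = lab_text.split()[:max_words]
    joined = "_".join(words)
    # Assumption: "special characters" means anything outside [A-Za-z0-9_].
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", joined)
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    if not cleaned:
        # No usable transcription: keep only the numerical prefix.
        return prefix
    return f"{prefix}_{cleaned}"[:max_length].rstrip("_")
```

For example, `build_name(1, "Hello, this is a sample transcription!")` gives `0001_Hello_this_is_a_sample_transcription`, and an empty transcription for index 2 gives `0002`.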
Example File Names in Final Output
Final output files in final_output will be structured like:
0001_Hello_this_is_a_sample_transcription.wav
0001_Hello_this_is_a_sample_transcription.lab
Files without usable .lab content will retain the numerical prefix, e.g., 0002.wav and 0002.lab.