Stable Diffusion Trainer: https://github.com/bmaltais/kohya_ss
Tag manager and captioner for image datasets: https://github.com/jhc13/taggui
A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.
VITS-based Voice Conversion focused on simplicity, quality and performance
Phased Consistency Model - generate high-quality images in 2 steps https://huggingface.co/spaces/radames/Phased-Consistency-Model-PCM
LivePortrait - img2vid only: 6GB VRAM & ~8GB download
Dense Text-to-Image Generation with Attention Modulation
[NVIDIA GPU ONLY] One-click installer for Intel's ldm3d
An open source implementation of Microsoft's VALL-E X zero-shot TTS model
1-Click Installer for kohya_ss, a Stable Diffusion LoRA & Dreambooth WebUI (https://github.com/bmaltais/kohya_ss)
Temporally consistent video editing. A local version of https://huggingface.co/spaces/weizmannscience/tokenflow
Generate stunning illusion artwork with Stable Diffusion (a space by @angrypenguinPNGAP, created with Monster Labs QR ControlNet).
A Gradio web UI for Large Language Models https://github.com/oobabooga/text-generation-webui


Fast Speech-to-Text Web UI with Apple MLX and OpenAI Whisper
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model.