High-Quality Text-to-Speech for Indian Languages
An enhanced local port of finegrain-image-enhancer powered by Refiners (https://huggingface.co/spaces/finegrain/finegrain-image-enhancer), which was adapted from philz1337x's Clarity Upscaler (https://github.com/philz1337x/clarity-upscaler)
[AMD ONLY] Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. (On Windows, supported on 7900(XT), 7800(XT), 7600(XT), Phoenix, 9070(XT) and Strix Halo.)
Automatically create music videos. Synchronize the cuts to the music's beat.
Ovi is a Veo 3-like video+audio generation model that simultaneously generates video and audio content from text or text+image inputs.
VibeSurf - AI-powered browser assistant for surfing the web with intelligence
Next-generation face-swapping and enhancement (Codeberg fork of Roop). Easy GUI for images & videos.

LFM2-Audio-1.5B is Liquid AI's first end-to-end audio foundation model, designed with low latency and real-time conversation in mind.
Frontier Open-Source Text-to-Speech
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Voice Synthesis Platform with Smart Chunking, Batch Processing, and Voice Cloning capabilities.
Gradio UI for YuE music generation model

Text+Image → Video with Allegro-TI2V (Rhymes AI), local one-click via Pinokio

Pinokio app to install and run sdbds/YuE-for-windows, with defaults tuned for a single RTX 4060 Ti 16GB GPU. Uses Torch 2.5.1+cu124 and requirements-uv.txt.
[NVIDIA ONLY] Generate Video Progressively. FramePack is a next-frame (next-frame-section) prediction neural network structure that generates videos progressively. https://github.com/lllyasviel/FramePack

Video lip-sync with Wav2Lip on CPU on macOS (Intel)
Uncensored Deepfakes for images and videos without training and an easy-to-use GUI.

One-click install & launcher for MeiGen-AI/InfiniteTalk