Image, video, audio processing, UI frameworks, and design utilities
109 tools
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Portable file server with accelerated resumable uploads, dedup, WebDAV, SFTP, FTP, TFTP, zeroconf, media indexer, thu...
SoTA open-source TTS
Speakr is a personal, self-hosted web application designed for transcribing audio recordings
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multi...
A tool to snap pixels to a perfect grid. Designed to fix messy and inconsistent pixel art generated by AI.
LongLive: Real-time Interactive Long Video Generation
A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime.
Diagram as Code Tool Written in Rust with Draggable Editing
Visual workflow builder for AI agents powered by Firecrawl - drag-and-drop web scraping pipelines with real-time e...
OpenReel Video - Professional browser-based video editor. Open source CapCut alternative. 100% browser-based, no inst...
ViPE: Video Pose Engine for Geometric 3D Perception
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
A free, local, self-hosted video compressor web UI designed for performance and ease of use. Inspired by 8mb.video
Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"
Voice-activated sticker dreamer and printer.
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
Glass Keep is a Keep Notes alternative using Glass design. Made in React + Tailwind
Fast markdown preview server with live reload and theme support.
Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
MoCha: End-to-End Video Character Replacement without Structural Guidance
From baby GPT to diffusion GPT: An annotated implementation of a character-level discrete diffusion model (adapted fr...
Conversational voice AI agents