Image, video, audio processing, UI frameworks, and design utilities
109 tools
A Swiss Army knife for developers - offline utilities collection
[ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Swing Music is a beautiful, self-hosted music player for your local audio files. Like a cooler Spotify ... but bring ...
Spark-TTS Inference Code
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Bridge the gap between photo and video color grading. Accurately apply any creative LUT to your RAW files with this t...
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Create stunning visual designs with Stage. A modern canvas editor that brings your ideas to life. Add images, text, ba...
A native macOS menu bar app for managing audio device priorities
[arXiv 2025] VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation
A general-purpose AI image generation framework that supports Hugging Face, Gitee, Model Scope, and more.
Extract any sound with text prompts. Memory-optimized SAM-Audio with modern UI.
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThi...
LLM-written music
Mainline Next.js template built with shadcn/ui, Tailwind 4 & Next.js
Your personal voice interface into any app. Speak naturally and your words appear wherever your cursor is, with fully...
Extract any website's design system into tokens in seconds: logo, colors, typography, borders & more. One command.
Ultra Modern 3D Home Assistant dashboard card for monitoring electricity. It includes house usage, Battery States. EV...
Tiny truly local voice-activated LLM Agent that runs on a Raspberry Pi
LSP audio feedback in Neovim
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understan...
A modern self-hosted media management system for your media library
Official Python inference and LoRA trainer package for the LTX-2 audio-video generative model.