Best Media & Design Tools for Python

53 curated media & design tools for Python developers, ranked by quality score.

54 Fair

copyparty

Portable file server with accelerated resumable uploads, dedup, WebDAV, SFTP, FTP, TFTP, zeroconf, media indexer, thu...

Active · 42.8k · open-source

🪟🍎🐧

52 Fair

Kitty

A fast, feature-rich GPU-accelerated terminal emulator

Active · 31.6k · open-source

🍎🐧

44 Fair

chatterbox

SoTA open-source TTS

Active · 22.9k · open-source

🪟🍎🐧

38 Emerging

MediaManager

A modern self-hosted media management system for your media library

Active · 3.1k · open-source

🪟🍎🐧

37 Emerging

Skyfall GS

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Active · 722 · open-source

🪟🍎🐧

37 Emerging

Swingmusic

Swing Music is a beautiful, self-hosted music player for your local audio files. Like a cooler Spotify ... but bring ...

Active · 1.8k · open-source

🪟🍎🐧

37 Emerging

Bolna

Conversational voice AI agents

Active · 587 · open-source

🪟🍎🐧

37 Emerging

LongLive

LongLive: Real-time Interactive Long Video Generation

Active · 1.1k · open-source

🪟🍎🐧

35 Emerging

heartlib

HeartMuLa Official Repo: The Most Powerful Open-Source Music Generation Model of 2026

Active · 4.2k · open-source

🪟🍎🐧

35 Emerging

Neutts Air

On-device TTS model by Neuphonic

Active · 4.9k · open-source

🪟🍎🐧

34 Emerging

LTX-2

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Active · 4.2k · open-source

🪟🍎🐧

34 Emerging

HunyuanImage 3.0

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Active · 2.9k · open-source

🪟🍎🐧

33 Emerging

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expr...

Active · 8.8k · open-source

🪟🍎🐧

33 Emerging

LTX Video

Official repository for LTX-Video

Slowing · 9.4k · open-source

🪟🍎🐧

33 Emerging

Rcm

rCM: SOTA Diffusion Distillation & Few-Step Video Generation based on sCM/MeanFlow

Active · 552 · open-source

🪟🍎🐧

33 Emerging

HunyuanWorld Mirror

Fast and Universal 3D reconstruction model for versatile tasks

Active · 1.0k · open-source

🪟🍎🐧

33 Emerging

FlashWorld

Code for "FlashWorld: High-quality 3D Scene Generation within Seconds"

Active · 686 · open-source

🪟🍎🐧

33 Emerging

Fun Audio Chat

Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.

Active · 878 · open-source

🪟🍎🐧

33 Emerging

Wan Alpha

High-Quality Text-to-Video Generation with Alpha Channel

Active · 338 · open-source

🪟🍎🐧

32 Emerging

Puffin

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Active · 383 · open-source

🪟🍎🐧

32 Emerging

Video As Prompt

Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"

Active · 389 · open-source

🪟🍎🐧

32 Emerging

Thinking With Video

We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThi...

Active · 266 · open-source

🪟🍎🐧

30 Emerging

PromptEnhancer

PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image genera...

Slowing · 3.6k · open-source

🪟🍎🐧

29 Emerging

LuxTTS

A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime.

Active · 831 · open-source

🪟🍎🐧

29 Emerging

VideoMaMa

Official implementation of "VideoMaMa: Mask-Guided Video Matting via Generative Prior", CVPR 2026

Active · 281 · open-source

🪟🍎🐧

29 Emerging

HunyuanWorld Voyager

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstr...

Slowing · 1.5k · open-source

🪟🍎🐧

29 Emerging

Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

Slowing · 8.9k · open-source

🪟🍎🐧

29 Emerging

FastGS

Official code for "FastGS: Training 3D Gaussian Splatting in 100 Seconds"

Active · 779 · open-source

🪟🍎🐧

29 Emerging

Vipe

ViPE: Video Pose Engine for Geometric 3D Perception

Slowing · 1.7k · open-source

🪟🍎🐧

29 Emerging

Stable Video Infinity

[ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

Slowing · 2.1k · open-source

🪟🍎🐧

29 Emerging

Step Audio EditX

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style,...

Active · 873 · open-source

🪟🍎🐧

29 Emerging

8mb.local

A free, local, self-hosted video compressor web UI designed for performance and ease of use. Inspired by 8mb.video.

Slowing · 759 · open-source

🪟🍎🐧

29 Emerging

Raw Alchemy

Bridge the gap between photo and video color grading. Accurately apply any creative LUT to your RAW files with this t...

Active · 366 · open-source

🪟🍎🐧

28 Emerging

FIBO

FIBO is a SOTA, first open-source, JSON-native text-to-image model built for controllable, predictable, and legally s...

Slowing · 304 · open-source

🪟🍎🐧

27 Emerging

FramePack

Let's make video diffusion practical!

Slowing · 16.6k · open-source

🪟🍎🐧

27 Emerging

InfiniteTalk

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Slowing · 4.9k · open-source

🪟🍎🐧

25 Emerging

Qwen3-ASR

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multi...

Slowing · 1.8k · open-source

🪟🍎🐧

25 Emerging

Qwen Image Layered

Qwen-Image-Layered: Layered Decomposition for Inherent Editability

Slowing · 1.6k · open-source

🪟🍎🐧

25 Emerging

StoryMem

Official code for StoryMem: Multi-shot Long Video Storytelling with Memory

Slowing · 655 · open-source

🪟🍎🐧

25 Emerging

SteadyDancer

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Slowing · 584 · open-source

🪟🍎🐧

25 Emerging

MoCha

MoCha: End-to-End Video Character Replacement without Structural Guidance

Slowing · 649 · open-source

🪟🍎🐧

25 Emerging

LinaCodec

A highly compressive and high-quality neural audio codec for speech models.

Slowing · 256 · open-source

🪟🍎🐧

23 Emerging

Code2Video

Video generation via code

Slowing · 1.6k · open-source

🪟🍎🐧

23 Emerging

Paper2Video

Automatic Video Generation from Scientific Papers

Slowing · 2.1k · open-source

🪟🍎🐧

23 Emerging

Ditto

[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Slowing · 570 · open-source

🪟🍎🐧

23 Emerging

TraceAnything

Trace Anything: Representing Any Video in 4D via Trajectory Fields

Slowing · 511 · open-source

🪟🍎🐧

20 Emerging

Reader3

Quick illustration of how one can easily read books together with LLMs. It's great and I highly recommend it.

Slowing · 3.3k · open-source

🪟🍎🐧

20 Emerging

Spark TTS

Spark-TTS Inference Code

Stale · 10.9k · open-source

🪟🍎🐧

19 Emerging

HoloCine

Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Slowing · 633 · open-source

🪟🍎🐧

19 Emerging

Dia2

TTS model capable of streaming conversational audio in realtime.

Slowing · 1.1k · open-source

🪟🍎🐧

19 Emerging

Streaming Vlm

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Slowing · 892 · open-source

🪟🍎🐧

19 Emerging

Glyph

Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"

Slowing · 563 · open-source

🪟🍎🐧

19 Emerging

VisualMimic

[arXiv 2025] VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation

Slowing · 266 · open-source

🪟🍎🐧
