ToolShelf

StreamingVLM: Real-Time Understanding for Infinite Video Streams

License: MIT
Updated: Today

What it does

Paper | Slides | Demo Page

StreamingVLM enables real-time, stable understanding of effectively infinite video streams by maintaining a compact KV cache and aligning training with streaming inference. It avoids the quadratic cost of full attention and the pitfalls of naive sliding windows, runs at up to 8 FPS on a single H100, and achieves a 66.18% win rate against GPT-4o mini on a new long-video benchmark. It also improves general VQA performance without task-specific fine-tuning.
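The compact-KV-cache idea can be illustrated with a minimal sketch: keep a few "sink" entries from the very start of the stream plus a rolling window of recent entries, evicting everything in between so memory stays bounded no matter how long the stream runs. The class and parameter names below are illustrative assumptions, not the repo's actual API.

```python
from collections import deque

class CompactKVCache:
    """Toy bounded cache: a few permanent 'sink' entries + a recent window."""

    def __init__(self, num_sink=4, window=16):
        self.num_sink = num_sink            # earliest entries, always kept
        self.sink = []                      # filled once, never evicted
        self.recent = deque(maxlen=window)  # deque evicts oldest automatically

    def append(self, kv):
        if len(self.sink) < self.num_sink:
            self.sink.append(kv)
        else:
            self.recent.append(kv)

    def entries(self):
        return self.sink + list(self.recent)

cache = CompactKVCache(num_sink=4, window=16)
for t in range(100):        # simulate 100 streamed KV entries
    cache.append(t)
print(len(cache.entries()))  # → 20 (4 sinks + 16 recent), regardless of stream length
```

The point is that cache size (and thus per-step attention cost) stays constant as the stream grows, which is what lets a streaming VLM handle effectively infinite video without quadratic blow-up.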

Getting Started

git
git clone https://github.com/mit-han-lab/streaming-vlm

Platforms

🪟 Windows · 🍎 macOS · 🐧 Linux

Install Difficulty

Moderate

Built With

Python
