DYNAMO
A Datacenter Scale Distributed Inference Serving Framework
What it does
| Roadmap | Support Matrix | Docs | Recipes | Examples | Prebuilt Containers | Design Proposals | Blogs |

A high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments. Large language models exceed the memory capacity of a single GPU. Tensor parallelism spreads a model's layers across GPUs, but doing so creates coordination challenges across devices and nodes. Dynamo closes this gap.
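To make the coordination challenge concrete, here is a minimal sketch of tensor parallelism that assumes nothing about Dynamo's internals: a linear layer's weight matrix is sharded column-wise across "devices" (plain Python lists stand in for GPUs), each device computes a partial result, and a concatenation step plays the role of the all-gather collective that a distributed serving framework must orchestrate.

```python
import numpy as np

def split_columns(weight: np.ndarray, n_devices: int) -> list[np.ndarray]:
    """Shard a weight matrix column-wise, one shard per device."""
    return np.array_split(weight, n_devices, axis=1)

def parallel_linear(x: np.ndarray, shards: list[np.ndarray]) -> np.ndarray:
    """Each device multiplies the input by its own shard; concatenating
    the partial outputs mimics the all-gather coordination step."""
    partials = [x @ w for w in shards]        # independent per-device work
    return np.concatenate(partials, axis=-1)  # cross-device coordination

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # a batch of activations
w = rng.standard_normal((8, 16))   # the full (unsharded) weight matrix
shards = split_columns(w, n_devices=2)

# The sharded computation reproduces the single-device result.
assert np.allclose(parallel_linear(x, shards), x @ w)
```

In a real deployment the concatenation is a network collective between GPUs, which is exactly the kind of multi-node communication an inference serving framework has to schedule efficiently.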
Getting Started
```shell
git clone https://github.com/ai-dynamo/dynamo
```