ToolShelf
DFlash: Block Diffusion for Flash Speculative Decoding


13 · Emerging · Unknown
License: MIT
Updated: Today

What it does

DFlash is a lightweight block diffusion model designed for speculative decoding. It enables efficient, high-quality parallel drafting.

https://github.com/user-attachments/assets/5b29cabb-eb95-44c9-8ffe-367c0758de8c

Models:
- openai/gpt-oss-20b: https://huggingface.co/z-lab/gpt-oss-20b-DFlash
- Qwen3-4B: https://huggingface.co/z-lab/Qwen3-4B-DFlash-b16
- Qwen3-8B:
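
For readers new to speculative decoding, here is a minimal toy sketch of the draft-then-verify loop that a block drafter plugs into. It is not DFlash's implementation and does not use its API: `toy_target_next`, `toy_draft_block`, and `speculative_decode` are hypothetical names, both "models" are small arithmetic functions, and verification is greedy (exact match). The drafter proposes a whole block of tokens per round, and the target keeps the longest matching prefix and corrects the first mismatch.

```python
from typing import Callable, List, Tuple

Token = int


def toy_target_next(tokens: List[Token]) -> Token:
    """Hypothetical stand-in for the target model's greedy next token."""
    return (tokens[-1] * 3 + len(tokens)) % 11


def toy_draft_block(tokens: List[Token], block_size: int) -> List[Token]:
    """Hypothetical stand-in for a block drafter.

    A real block drafter would emit the block in parallel; this toy imitates
    the target token by token and injects an occasional wrong guess.
    """
    ctx = list(tokens)
    block = []
    for _ in range(block_size):
        guess = toy_target_next(ctx)
        if len(ctx) % 5 == 0:  # deliberately wrong at some positions
            guess = (guess + 1) % 11
        block.append(guess)
        ctx.append(guess)
    return block


def greedy_decode(prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(toy_target_next(tokens))
    return tokens


def speculative_decode(
    prompt: List[Token],
    max_new_tokens: int,
    block_size: int,
    draft_block: Callable[[List[Token], int], List[Token]] = toy_draft_block,
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, number of rounds).

    Each round stands for one target verification pass over a drafted block.
    In a real system that pass scores every block position at once; here the
    target is simply called per position.
    """
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        block = draft_block(tokens, block_size)
        rounds += 1
        for proposed in block:
            expected = toy_target_next(tokens)
            tokens.append(expected)  # always keep the target's own token
            if proposed != expected:
                break  # first mismatch: the rest of the block is discarded
        else:
            # Whole block accepted: the verification pass yields a bonus token.
            tokens.append(toy_target_next(tokens))
    return tokens[: len(prompt) + max_new_tokens], rounds


if __name__ == "__main__":
    out, rounds = speculative_decode([1, 2], max_new_tokens=12, block_size=4)
    print(out, rounds)
```

Because every kept token is the target's own greedy choice, the output matches plain greedy decoding exactly; the drafter only affects how many verification rounds are needed, which is why draft quality matters for speedup.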

Getting Started

git
git clone https://github.com/z-lab/dflash

Platforms

🪟 windows · 🍎 mac · 🐧 linux

Install Difficulty

moderate

Built With

python

Community Reactions