TinyZero
Minimal reproduction of DeepSeek R1-Zero
What it does
> ⚠️ Deprecation Notice: This repo is no longer actively maintained. For running RL experiments, please use the latest veRL library directly.
> For the archived original documentation, see OLDREADME.md.

TinyZero is a reproduction of DeepSeek R1-Zero on countdown and multiplication tasks, built on veRL. Through RL, the 3B base LM develops self-verification and search abilities on its own.
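To make the countdown task concrete, here is a minimal sketch of a rule-based reward check of the kind such RL setups typically rely on. It is an illustration under stated assumptions, not code from TinyZero or veRL: the function name `countdown_reward`, the rule that every given number must be used exactly once, and the binary 1.0/0.0 reward are all assumptions made for this example.

```python
import ast
import operator
import re
from collections import Counter

# Hypothetical helpers for illustration; not taken from TinyZero or veRL.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate an AST limited to integer literals and + - * / operators."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def countdown_reward(equation, numbers, target):
    """Return 1.0 if the equation uses each number exactly once and hits the target.

    Assumed rule set: only + - * / and parentheses are allowed, and the
    multiset of integers in the equation must equal the given numbers.
    """
    equation = equation.strip()
    used = [int(tok) for tok in re.findall(r"\d+", equation)]
    if Counter(used) != Counter(numbers):
        return 0.0
    try:
        value = _evaluate(ast.parse(equation, mode="eval"))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0
```

For example, `countdown_reward("(25 - 5) * 4", [4, 5, 25], 80)` scores a correct answer, while an equation that skips one of the numbers or misses the target scores zero. Parsing with `ast` instead of calling `eval` keeps the check restricted to plain arithmetic, which matters when the input is model-generated text.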
Getting Started
git clone https://github.com/Jiayi-Pan/TinyZero