QeRL
QeRL enables RL for 32B LLMs on a single H100 GPU.
What it does
https://github.com/user-attachments/assets/3c9b5b04-0d44-4b68-a4af-059b3d834fc3
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs [Paper]
Wei Huang, Yi Ge, Shuai Yang, Yicheng Xiao, Huizi Mao, Yujun Lin, Hanrong Ye, Sifei Liu, Ka Chun Cheung, Hongxu Yin, Yao Lu, Xiaojuan Qi, Song Han, Yukang Chen
We propose QeRL, a Quantization-enhanced Reinforcement Learning
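As a rough illustration of why quantization matters for the single-GPU claim, the sketch below estimates the memory taken by the weights of a 32B-parameter model at two precisions. It is back-of-the-envelope arithmetic, not QeRL's own accounting: the 4-bit weight format and the 80 GB H100 capacity are assumptions, the helper weight_memory_gb is hypothetical, and the estimate ignores quantization scales, activations, KV cache, and optimizer state.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 32e9       # 32B parameters, as in the project summary
H100_MEMORY_GB = 80.0   # assumption: the 80 GB H100 variant

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)    # 16-bit weights
fourbit_gb = weight_memory_gb(NUM_PARAMS, 4)  # assumed 4-bit weights

for label, gb in [("16-bit", bf16_gb), ("4-bit", fourbit_gb)]:
    headroom = H100_MEMORY_GB - gb
    print(f"{label}: {gb:.0f} GB of weights, {headroom:.0f} GB left on the GPU")
```

Under these assumptions, 16-bit weights alone take most of the card, while 4-bit weights leave most of it free for rollouts and training state.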
Getting Started
git clone https://github.com/NVlabs/QeRL