🐙 OctoThinker is led by GAIR

🎯 Our Goal: To reshape the pre-training trajectory so that models scale better under reinforcement learning (RL).