Machine Learning | Runtime reinforcement: preventing “instruction decay” in long reference windows | March 3, 2026
AI Tools | Forget Keyword Mimicry: ByteDance AI Maps Molecular Bonds in AI Reasoning to Stabilize Long Chain-of-Thought Performance and Reinforcement Learning (RL) Training | February 22, 2026
AI News | Kyutai releases Hibiki-Zero: a 3B-parameter simultaneous speech-to-speech translation model using GRPO reinforcement learning without any word-level aligned data | February 13, 2026
AI News | A coding implementation to train safety-critical reinforcement learning agents offline using d3rlpy and conservative Q-learning with fixed historical data | February 4, 2026
AI Tools | Nous Research releases NousCoder-14B: a competitive Olympiad programming model, post-trained on Qwen3-14B via reinforcement learning | January 19, 2026
AI Tools | Meet SETA: open-source training reinforcement learning environment for terminal agents with 400 tasks and CAMEL toolkit | January 11, 2026
AI Basics | Why even reinforcement learning can’t beat casinos (and why I created a simulation to prove it) | January 2, 2026
Generative AI | Liquid AI’s LFM2-2.6B-Exp uses pure reinforcement learning (RL) and dynamic hybrid reasoning to optimize small model behavior | December 28, 2025