Reinforcement Learning Models

News

Microsoft’s new AI framework trains powerful reasoning models with a fraction of the cost

The rStar2-Agent framework boosts a 14B model to outperform a 671B giant, offering a path to state-of-the-art AI without ...

24m

SJTU and ByteDance Join Forces to Launch RhymeRL, Enhancing Reinforcement Learning Training Efficiency by 2.6 Times!

The breakthrough of RhymeRL stems from an in-depth analysis of the Rollout phase in the reinforcement learning process. The research found that there is a significant “historical similarity” in the ...

INFLY TECH Team Reveals the Mystery of Diversity Collapse in Large Model Training, Innovating Methods to Reshape the Future of AI

In September 2025, the INFLY TECH team, in collaboration with Fudan University and Griffith University, released a groundbreaking study that delves into a puzzling phenomenon encountered during the ...

8d on MSN

CoreWeave to acquire OpenPipe, a Seattle-area startup that uses reinforcement learning to help companies build AI agents

Baidu Unveils Reasoning Model ERNIE X1.1 with Upgrades in Key Capabilities

Reinforcement Learning Does NOT Fundamentally Improve AI Models

Baidu Unveils Ernie X1.1 Deep Thinking Model, Claims It Outperforms DeepSeek R1

CoreWeave to Acquire OpenPipe, Leader in Reinforcement Learning

DeepSeek-R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost

Why AI Cheats: The Deep Psychology Behind Deep Learning

Reinforcement learning is making a buzz in space