Reinforcement Learning Models

News

Microsoft’s new AI framework trains powerful reasoning models with a fraction of the cost

The rStar2-Agent framework boosts a 14B model to outperform a 671B giant, offering a path to state-of-the-art AI without ...

16h

Wang Mengdi's Team Launches TraceRL: Moving Towards a Unified Reinforcement Learning Framework for Diffusion Language Models

Recently, the team led by Professor Wang Mengdi at Princeton University proposed a “Trajectory-Aware RL” framework—TraceRL in ...

The Information

Everyone Wants To Be a Reinforcement Learning Startup

These days, artificial intelligence developers, investors and founders are all obsessed with “reinforcement learning,” a ...

Cursor Tab Model Upgrade: Real-time Reinforcement Learning Empowers More Accurate AI Programming Suggestions with Soaring Adoption Rates

The Cursor R&D team has adopted a reinforcement learning framework that lets the model learn user behavior patterns directly through a policy gradient algorithm. When suggestions are ...

5d on MSN

Baidu Unveils Ernie X1.1 Deep Thinking Model, Claims It Outperforms DeepSeek R1

Cursor is Using Real Time Reinforcement Learning to Improve Suggestions for Developers

CoreWeave to Acquire OpenPipe, Leader in Reinforcement Learning

Reinforcement Learning Does NOT Fundamentally Improve AI Models

DeepSeek-R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost

How the Brain Uses Reinforcement Learning Beyond Just Mean Rewards