Reinforcement Learning Teachers of Test Time Scaling - Sakana AI

Retrieved on: 2025-06-23 02:45:34

Tags for this article:

Natural language processing

Machine learning

Artificial intelligence

Large language models

Computational neuroscience

Reasoning language model

DeepSeek

Reflection

Reinforcement learning

Prompt engineering

Learning

Neural network

Excerpt

We introduce a new way to teach large language models (LLMs) how to reason by learning to teach, not solve. ... Reinforcement Learned Teacher (RLT) ...

Article found on: sakana.ai
