Holdings: Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use

Loading…

View in EDS

Saved in:

Publication Year:

2025

Subject Terms:

Description:

Reinforcement learning has been shown to improve the performance of large language models. However, traditional approaches like RLHF or RLAI

Database:

arXiv