Holdings: Process Supervision-Guided Policy Optimization for Code Generation

Publication Year:

2024

Subject Terms:

Computer Science - Artificial Intelligence, I.2.7

Description:

Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards

Database:

arXiv