奇变偶不变
RL
3 posts × 19,340 characters
2025 (3 posts)
07-17  [arXiv-2025] The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
06-30  [ICLR-2024] Eureka: Human-Level Reward Design via Coding Large Language Models
06-26  [ICML-2025] R*: Efficient Reward Design via Reward Structure Evolution and Parameter Alignment Optimization with Large Language Models
Geaming
NLP grunt worker
81 posts
5 categories
17 tags