奇变偶不变
[ICML-2025] R*: Efficient Reward Design via Reward Structure Evolution and Parameter Alignment Optimization with Large Language Models
2025-06-26
Paper Reading
[arXiv-2025] AdaptThink: Reasoning Models Can Learn When to Think
2025-06-20
Paper Reading
Paper: http://arxiv.org/abs/2505.13417
Code & Data: https://github.com/THU-KEG/AdaptThink
Publication: arXiv
Date: 2025.06.20
Raise the Ceiling: Clip-Higher
2025-06-20
NLP

Cited from: DAPO: An Open-Source LLM Reinforcement Learning System at Scale
[ICLR-2025] To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
2025-06-10
Paper Reading
Paper: https://arxiv.org/abs/2409.12183
Code & Data: https://github.com/Zayne-sprague/To-CoT-or-not-to-CoT
Publication: ICLR
Date: 2025.06.10
[arXiv-2025] OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
2025-06-05
Paper Reading
Paper: http://arxiv.org/abs/2506.02397
Code & Data: https://github.com/AgenticIR-Lab/OThink-R1
Publication: arXiv
Date: 2025.06.05
Geaming
NLP grunt worker
76 posts, 5 categories, 15 tags

© 2022 - 2025
