Why I Write This Blog
Amazing agent systems have been created that are changing our lives. I want to keep collecting information about LLM-based agents and share my thoughts here.
Resources
Agentic RL
- survey
- blogs
- papers
- LLM-based agents
- RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
- RAGEN-2: Reasoning Collapse in Agentic RL
- rStar2-Agent: Agentic Reasoning Technical Report
- (GiGPO) Group-in-Group Policy Optimization for LLM Agent Training
- (ARPO) Agentic Reinforced Policy Optimization
- (AEPO) Agentic Entropy-Balanced Policy Optimization
- Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs
- VLM-based agents
- multi-agent
- advantage computation
- MARL credit assignment
- experience replay
- Efficient RL Training for LLMs with Experience Replay
- Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
- AgentHER: Hindsight Experience Replay for LLM Agent Trajectory
- ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
- context management
- self-evolution
- data
- emergence
- LLM-based agents
- general evaluation
Coding Agent
- survey
- blogs
- models
- codex
- codex: https://zhuanlan.zhihu.com/p/2029683221646907323
- prompt
- claude *
- codex
- benchmarks
GUI Agent
- survey
- models
- UI-TARS: Pioneering Automated GUI Interaction with Native Agents
- AutoGLM
- ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
- MobileRL: Advancing Mobile Use Agents With Adaptive Online Reinforcement Learning
- AndroidGen: Building an Android Language Agent under Data Scarcity
- AutoGLM: Autonomous Foundation Agents for GUIs
- WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
- AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
- AppAgent: Multimodal Agents as Smartphone Users
- (SeeAct) GPT-4V(ision) is a Generalist Web Agent, if Grounded
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
- benchmarks
- web
- android
DeepResearch
- survey
- models
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
- (Jina) node-DeepResearch
- Kimi-Researcher: End-to-End RL Training for Emerging Agentic Capabilities
- Language Modeling by Language Models