LLM Agents

Why I Write This Blog

LLM-based agents are going to change the world. Impressive agent systems have already been built that change how we live. I was once on a team that aimed to build advanced agents for controlling digital devices, and that experience left a strong impression on me, so I want to keep collecting information about LLM agents and sharing my thoughts here.

Resources

Agentic RL

- survey
  - From Reasoning to Agentic: Credit Assignment in Reinforcement Learning for Large Language Models
    - Zhihu explainer (in Chinese)
- papers
  - LLM
    - rStar2-Agent: Agentic Reasoning Technical Report
    - (GiGPO) Group-in-Group Policy Optimization for LLM Agent Training
    - (ARPO) Agentic Reinforced Policy Optimization
    - (AEPO) Agentic Entropy-Balanced Policy Optimization
    - Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs
  - VLM
    - Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
  - multi-agent
    - advantage computation
      - Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems
      - (M-GRPO) Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
    - MARL credit assignment
      - (COMA) Counterfactual Multi-Agent Policy Gradients
      - (CCPO) Counterfactual Credit Policy Optimization for Multi-Agent Collaboration
      - (SHARP) Who Deserves the Reward? SHARP: Shapley Credit-based Optimization for Multi-Agent System
        - Understanding the Main Idea, Axioms, and Solution Formula of the Shapley Value in Cooperative Games (in Chinese)
      - (C3) Exact Is Easier: Credit Assignment for Cooperative LLM Agents
  - experience replay
    - Efficient RL Training for LLMs with Experience Replay
    - Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
    - AgentHER: Hindsight Experience Replay for LLM Agent Trajectory
    - ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
  - context management
    - (MemAct) Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks
  - self evolution
    - memory
      - MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
    - meta-RL
      - Meta-RL Induces Exploration in Language Agents
    - test-time scaling
    - online learning
  - emergence
    - From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones
    - ARTIST: Agentic Reasoning and Tool Integration in Self-improving Transformers
- blogs
  - Reasoning LLM (Part 4): Agentic RL (in Chinese)

GUI Agent

- survey
  - Large Language Model-Brained GUI Agents: A Survey
  - GUI Agent Survey: Unveiling the Past and Present of GUI Agents, Part 1: Overview, Setting Out (in Chinese)
- models
  - UI-TARS: Pioneering Automated GUI Interaction with Native Agents
  - AutoGLM
    - ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
    - MobileRL: Advancing Mobile Use Agents With Adaptive Online Reinforcement Learning
    - AndroidGen: Building an Android Language Agent under Data Scarcity
    - AutoGLM: Autonomous Foundation Agents for GUIs
    - WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
    - AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
  - DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
  - SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
  - AppAgent: Multimodal Agents as Smartphone Users
  - (SeeAct) GPT-4V(ision) is a Generalist Web Agent, if Grounded
  - Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
- benchmarks
  - web
    - WebArena: A Realistic Web Environment for Building Autonomous Agents
    - Mind2Web: Towards a Generalist Agent for the Web
    - OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
    - (MiniWob) World of Bits: An Open-Domain Platform for Web-Based Agents
  - android
    - Android in the Wild: A Large-Scale Dataset for Android Device Control
    - (AndroidArena) Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

DeepResearch

- survey
  - Deep Research Agents: A Systematic Examination And Roadmap
  - Towards AI Search Paradigm
- models
  - Search-o1: Agentic Search-Enhanced Large Reasoning Models
  - Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  - R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
- repo
  - (Jina) node-DeepResearch
  - Kimi-Researcher: End-to-End RL Training for Emerging Agentic Capabilities
  - Language Modeling by Language Models

AutoResearch

- survey
  - AI for Auto-Research: Roadmap & User Guide
