LLM Agents
Why I Write This Blog

LLM-based agents are going to change the world. Impressive agent systems have already been built that are changing our lives. I was once on a team that aimed to build advanced agents for controlling digital devices, and that work impressed me, so I want to keep collecting information about LLM agents and sharing my thoughts here.

Resources

GUI Agents
  survey
    - Large Language Model-Brained GUI Agents: A Survey
    - GUI Agent Survey: Unveiling the Past and Present of GUI Agents, Part 1: Overview, Setting Off
  models
    AutoGLM
      - ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
      - MobileRL: Advancing Mobile Use Agents With Adaptive Online Reinforcement Learning
      - AutoGLM: Autonomous foundation agents for GUIs
      - WebRL: Training LLM web agents via self-evolving online curriculum reinforcement learning
      - AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
    others
      - DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
      - SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
      - AppAgent: Multimodal agents as smartphone users
      - (SeeAct) GPT-4V(ision) is a Generalist Web Agent, if Grounded
      - Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
  benchmarks
    web
      - WebArena: A Realistic Web Environment for Building Autonomous Agents
      - Mind2Web: Towards a generalist agent for the web
      - OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments
      - (MiniWob) World of Bits: An Open-Domain Platform for Web-Based Agents
    android
      - Android in the Wild: A Large-Scale Dataset for Android Device Control
      - (AndroidArena) Understanding the weakness of large language model agents within a complex android environment

DeepResearch
  survey
    - Deep Research Agents: A Systematic Examination And Roadmap
    - Towards AI Search Paradigm
  models
    - Search-o1: Agentic search-enhanced large reasoning models
    - Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
    - R1-Searcher: Incentivizing the search capability in LLMs via reinforcement learning
  repo
    - (Jina) node-DeepResearch
    - Kimi-Researcher: End-to-End RL Training for Emerging Agentic Capabilities
    - Language Modeling by Language Models

Agentic RL
  - Reasoning LLM (Part 4): Agentic RL
  - Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning