Why I Write This Blog

Thinking models are crazily popular nowadays. I first delved into this area in September 2023. I later gradually lost track of it, until DeepSeek came along. I want to keep collecting information about LLM reasoning (as well as post-training) and share my thoughts here.

💡 This post is mainly focused on reasoning RL. For agentic RL, please refer to this post.

Reinforcement Learning

Blogs

RL algorithms

Engineering

Analyses

Thinking Models

text-based

overthinking

parallel thinking

visual reasoning

Long Context

others

Evaluation

dataset

Analyses

implicit reasoning

interpretability

theories