💡 This post initially focused on interpretability for multimodal models; many papers from other fields were added later for convenience.

Methods

Interpretability for MLLMs

Interpretability for Diffusion Models

Other fields of MLLMs

Datasets & Benchmarks

general

spatial

video

hallucination

Models

LLM

MLLM

self-supervised learning

generative models

world models