# Interpretability for Multimodal Models
💡 This post focuses primarily on interpretability for multimodal models; papers from other fields are included further down purely for convenience.

## Resources

- Interpretability for MLLMs survey
- A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models
- Sparks of Explainability: Recent Advancements in Explaining Large Vision Models
- Awesome LMMs Mechanistic Interpretability

## Probing

- Probing Multimodal Large Language Models for Global and Local Semantic Representations

(A minimal linear-probe sketch appears at the end of this post.)

## Representation

- Zoom In: An Introduction to Circuits
- Multimodal Neurons in Artificial Neural Networks
- Interpreting CLIP's Image Representation via Text-Based Decomposition
- Interpreting the Second-Order Effects of Neurons in CLIP (on the different layers of CLIP)
- Multimodal Neurons in Pretrained Text-Only Transformers

## Circuits

- **(causal tracing)** Understanding Information Storage and Transfer in Multi-modal Large Language Models
- Automatic Discovery of Visual Circuits
- Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

(A minimal activation-patching sketch appears at the end of this post.)

## SAE

- Case Study: Interpreting, Manipulating, and Controlling CLIP with Sparse Autoencoders
- Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery

(A minimal SAE training sketch appears at the end of this post.)

## Visualization

- Visualizer! Simplify your Vision Transformer visualization!
- (DVT) Denoising Vision Transformers
- Token Activation Map to Visually Explain Multimodal LLMs
- LVLM-Interpret: An Interpretability Tool for Large Vision Language Models
- Transformer Interpretability Beyond Attention Visualization

## Others

- **Towards Interpreting Visual Information Processing in Vision-Language Models**
- demo (logit lens) — see the minimal logit-lens sketch at the end of this post
- Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems
- Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

## Information Flow

- **Cross-modal Information Flow in Multimodal Large Language Models**
- *From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks*
- *What's in the Image? A Deep-Dive into the Vision of Vision Language Models*
- The Narrow Gate: Localized Image-Text Communication in Vision-Language Models
- Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
- Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference

## Analyses on MLLMs

- Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
- Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
- Vision Transformers Need Registers
- On the Rankability of Visual Embeddings

## Other Fields of MLLMs

### Visual Pretraining

...
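## Minimal Sketches

The sketches below are minimal, self-contained illustrations of the recurring techniques in this list (probing, causal tracing, SAEs, the logit lens). All data, shapes, and model pieces are random stand-ins, not code from any paper above.

A linear probe tests whether a concept is linearly decodable from frozen activations: train a small classifier on cached hidden states and read its held-out accuracy. A sketch, assuming you have already cached activations and concept labels:

```python
# Minimal linear-probe sketch. The activations and labels are random
# stand-ins for hidden states cached from a frozen (M)LLM and concept labels.
import torch

torch.manual_seed(0)
n, d = 1024, 768
acts = torch.randn(n, d)                  # stand-in: layer-l hidden states
labels = torch.randint(0, 2, (n,))        # stand-in: binary concept label

train, test = slice(0, 768), slice(768, n)
probe = torch.nn.Linear(d, 2)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

for _ in range(300):
    opt.zero_grad()
    loss_fn(probe(acts[train]), labels[train]).backward()
    opt.step()

# Held-out accuracy is the decodability signal (random data => ~0.5 here).
acc = (probe(acts[test]).argmax(-1) == labels[test]).float().mean()
print(f"held-out probe accuracy: {acc:.3f}")
```

Only held-out accuracy counts as evidence; a probe can always overfit its training split.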
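Causal tracing (the starred circuits entry) localizes where information is causally stored by restoring clean activations inside a corrupted run. A toy sketch in which a hand-rolled layer stack with crude mean-pooling "token mixing" stands in for a real transformer:

```python
# Toy activation-patching (causal-tracing-style) sketch; the model is a stand-in.
import torch

torch.manual_seed(0)
d, seq, n_layers = 16, 6, 4
layers = torch.nn.ModuleList([torch.nn.Linear(d, d) for _ in range(n_layers)])

def run(x, patch=None):
    """Forward pass; patch = (layer_idx, token_idx, clean_cache) restores one
    token's clean activation at one layer inside this run."""
    h, cache = x, []
    for i, layer in enumerate(layers):
        h = torch.tanh(layer(h)) + 0.5 * h.mean(dim=1, keepdim=True)  # crude mixing
        if patch is not None and patch[0] == i:
            h = h.clone()
            h[:, patch[1]] = patch[2][i][:, patch[1]]
        cache.append(h)
    return h, cache

clean_x = torch.randn(1, seq, d)
corrupt_x = clean_x.clone()
corrupt_x[:, 2] += torch.randn(d)              # corrupt one "image token"

clean_out, clean_cache = run(clean_x)
corrupt_out, _ = run(corrupt_x)

# How much of the clean output does restoring token 2 at each layer recover?
for i in range(n_layers):
    patched_out, _ = run(corrupt_x, patch=(i, 2, clean_cache))
    rec = 1 - (patched_out - clean_out).norm() / (corrupt_out - clean_out).norm()
    print(f"restore token 2 at layer {i}: recovery {rec.item():+.2f}")
```

In a real MLLM you would patch hidden states at image-token positions via forward hooks and measure recovery of the correct answer's logit rather than an L2 distance.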
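The SAE entries all share one core loop: train an overcomplete autoencoder with a ReLU bottleneck and an L1 penalty, so each activation is reconstructed from a few active features. A minimal sketch over stand-in activations (the dimensions and penalty weight are illustrative choices, not any paper's settings):

```python
# Minimal sparse-autoencoder sketch over cached activations (stand-in data).
import torch

torch.manual_seed(0)
d, m, n = 128, 512, 4096                  # input dim, dictionary size, samples
acts = torch.randn(n, d)                  # stand-in: ViT/CLIP residual activations

enc = torch.nn.Linear(d, m)
dec = torch.nn.Linear(m, d, bias=False)
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

for _ in range(500):
    batch = acts[torch.randint(0, n, (256,))]
    z = torch.relu(enc(batch))            # sparse codes
    loss = ((dec(z) - batch) ** 2).mean() + 1e-3 * z.abs().mean()  # MSE + L1
    opt.zero_grad()
    loss.backward()
    opt.step()

z = torch.relu(enc(acts))
print("mean active features per example:", (z > 0).float().sum(-1).mean().item())
```

The interpretability work then happens after training: inspect which inputs maximally activate each dictionary feature and name it.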
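The "demo (logit lens)" entry refers to the logit-lens trick: project intermediate residual states through the model's final LayerNorm and unembedding to read off what each layer would predict. For MLLMs this is typically applied at image-token positions (as in the bolded "Towards Interpreting Visual Information Processing" paper). A toy sketch with random stand-ins for the weights and cached states:

```python
# Toy logit-lens sketch. W_U, the final LayerNorm, and the per-layer hidden
# states are random stand-ins for a real (M)LLM's weights and cached residuals.
import torch

torch.manual_seed(0)
d, vocab, n_layers = 64, 100, 6
W_U = torch.randn(d, vocab)                    # stand-in unembedding matrix
ln_f = torch.nn.LayerNorm(d)                   # final norm, applied before W_U
hiddens = [torch.randn(1, d) for _ in range(n_layers)]  # stand-in residuals

for i, h in enumerate(hiddens):
    logits = ln_f(h) @ W_U                     # decode the intermediate state
    probs, idx = logits.softmax(-1).topk(3)
    print(f"layer {i}: top tokens {idx.tolist()}, "
          f"probs {[round(p, 2) for p in probs.flatten().tolist()]}")
```

The typical signal is the correct token's probability rising across layers; at image positions, the promoted vocabulary items hint at what the visual tokens encode.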