External Research

Discover key scientific contributions from the wider research community in Explainable AI (XAI). These works advance transparency, interpretability, and human-centered methods that help align AI systems with real-world understanding and responsible use.

Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies

Kim et al. (2025); Princeton University; Microsoft Research

This article examines how features of large language model (LLM) outputs — in particular explanations, cited sources, and internal consistency — influence whether users trust and rely on those outputs appropriately. In a controlled experiment (N = 308), the authors find that providing explanations tends to increase user reliance on the model’s responses, whether the responses are correct or incorrect. However, when the output includes sources or when explanations show inconsistencies, users are less likely to rely on incorrect responses — helping reduce blind trust. The findings suggest that adding transparency elements like source attribution or observable explanation flaws can help steer users towards more cautious, informed use of LLMs rather than overrelying on them. Thus, the paper argues that careful design of explanation and metadata features is key to fostering responsible use of LLM systems.

Large Language Models, human–AI interaction, user reliance, explanations, source attribution, explanation consistency, trustworthy AI

"I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust

Kim et al. (2024); Princeton University; Microsoft Research

This study investigates how expressions of uncertainty in LLM responses influence user trust and reliance. In a large user experiment, participants received answers with varying uncertainty expressions. Phrases like “I’m not sure, but…” lowered user confidence and reduced blind reliance on the model — ultimately improving decision accuracy when the model was wrong. The findings show that well-designed uncertainty cues can make AI assistance safer and more responsible by preventing overtrust, but also highlight that the specific phrasing strongly shapes user behavior.

Large Language Models, uncertainty, user trust, decision-making, AI reliability

AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap

Q. Vera Liao and Jennifer Wortman Vaughan (2024); Microsoft Research

This paper argues that with the rapid rise of large language models (LLMs) and LLM-powered applications, transparency has become more important than ever—but current efforts largely ignore the human side of transparency. The authors claim we need new, human-centered approaches to make LLMs understandable to all stakeholders, from developers and product designers to end-users and impacted individuals. They identify specific challenges for LLM transparency: unpredictable or emergent behaviors of LLMs, massive and opaque model architectures, proprietary restrictions, and the diversity of real-world uses and affected stakeholders. To tackle this, the paper outlines four main transparency strategies: model reporting (disclosing what the model is and what it can do), publishing evaluation results (showing how the model performs), providing explanations (for individual outputs), and communicating uncertainty (when the model is unsure or may err). The authors highlight that there is no “one-size-fits-all” solution — the right transparency approach depends on who the audience is and what their needs are. They present this as a research roadmap, calling on the community to design, test, and adopt human-centered transparency methods that suit the evolving landscape of LLMs and their applications.

AI transparency, Large Language Models, human-centered transparency, responsible AI, model reporting, uncertainty communication

Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations

Chen et al. (2023); Microsoft Research (USA & Canada); Carnegie Mellon University

This study investigates how human intuition influences reliance on AI when explanations are provided. Through a think-aloud experiment, participants made decisions with AI support and received different types of explanations (feature-based vs example-based). The authors analyze how various intuitive reasoning patterns shape when and why people follow or question AI advice. They show that explanations do not uniformly lead to better outcomes: depending on the decision-maker’s intuition and the type of explanation, explanations can improve appropriate reliance — but they can also foster overreliance. The paper emphasizes that understanding human intuition is crucial when designing explainable AI systems, and calls for XAI methods that adapt to human decision styles rather than assuming explanations are always helpful.

Explainable AI, human–AI decision-making, human intuition, reliance, explanations, feature-based explanations

From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

Alvarez-Melis et al. (2021); Microsoft Research

This work develops a human-centered framework for interpretability by grounding machine explanations in principles drawn from how people naturally explain events. The authors identify desirable properties—such as contrastiveness, relevance, and conciseness—and operationalize them using Weight of Evidence (WoE) to generate explanations for complex black-box models. Their method is model-agnostic, robust to small input changes, and usable in high-dimensional settings. A practitioner study shows the explanations are intuitive and aligned with human reasoning. The paper highlights that effective interpretability must start from human needs, not just model structure.
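
The Weight of Evidence the summary refers to is the classic log-likelihood-ratio measure of how much a piece of evidence supports a hypothesis over its alternative. A minimal sketch (the probabilities below are illustrative numbers, not figures from the paper):

```python
import math

def weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """Weight of evidence (in log-odds units) that evidence e lends to
    hypothesis h: WoE(h : e) = log( P(e|h) / P(e|not h) ).
    Positive values support h; zero means e is uninformative."""
    return math.log(p_e_given_h / p_e_given_not_h)

# Evidence that is three times as likely under h as under the alternative:
woe_support = weight_of_evidence(0.6, 0.2)
# Evidence that is equally likely either way carries no weight:
woe_neutral = weight_of_evidence(0.2, 0.2)
```

Because WoE terms add up across independent pieces of evidence, it lends itself naturally to the concise, contrastive explanations the authors aim for.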

Explainable AI, human-centered interpretability, weight of evidence, model-agnostic explanations

Explainable AI without Interpretable Model

Kary Främling (2020)

This article argues that many current Explainable AI (XAI) methods try to simplify a “black-box” model by building a separate interpretable model — but such surrogate models often fail to truly represent the black-box and may still be hard for end-users to understand. Instead, the author proposes a method called Contextual Importance and Utility (CIU), which produces human-like explanations directly from the original black-box model, without creating an intermediate interpretable model. CIU is model-agnostic and can be applied to any black-box system. It augments traditional feature-importance explanations by also considering the “utility” (how favorable a given input value is in context), allowing explanations at different levels of abstraction and adapted vocabularies depending on the user and context. The paper demonstrates CIU on standard datasets and argues this direct, context-aware approach better supports transparency and user understanding than many existing XAI techniques.
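
The core CIU quantities can be sketched directly from that description: contextual importance (CI) measures how much the output can vary when only one feature changes in the current context, and contextual utility (CU) measures how favorable the feature's actual value is within that variation. The function below is a minimal illustration under assumed conventions (a known output range, a value grid for the probed feature), not Främling's reference implementation:

```python
import numpy as np

def ciu(predict, x, j, grid, out_min=0.0, out_max=1.0):
    """Contextual Importance and Utility of feature j at instance x.
    CI: output variation achievable by changing only x[j], relative to the
    model's full output range [out_min, out_max].
    CU: how favorable the actual value x[j] is within that variation."""
    outputs = []
    for v in grid:
        x_v = x.copy()
        x_v[j] = v                 # vary feature j, hold the rest fixed
        outputs.append(predict(x_v))
    cmin, cmax = min(outputs), max(outputs)
    ci = (cmax - cmin) / (out_max - out_min)
    cu = (predict(x) - cmin) / (cmax - cmin) if cmax > cmin else 0.5
    return ci, cu

# Toy black box in which feature 0 dominates the output:
predict = lambda x: 0.9 * x[0] + 0.1 * x[1]
x = np.array([0.8, 0.3])
ci0, cu0 = ciu(predict, x, 0, np.linspace(0.0, 1.0, 11))
```

Because only `predict` is queried, the sketch is model-agnostic in the sense the article describes; mapping CI/CU values onto adapted vocabularies ("highly important", "favorable") is the layer CIU adds on top.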

Explainable AI, black-box models, model-agnostic explanations, Contextual Importance and Utility, interpretability, user-centered XAI

A Functionally-Grounded Benchmark Framework for XAI Methods: Insights and Foundations from a Systematic Literature Review

Canha, Kubler, Främling & Fagherazzi (2025)

This paper addresses the lack of clear definitions and standardized evaluation criteria in Explainable AI. Through a systematic review, the authors identify key XAI properties—such as faithfulness, robustness, stability, and completeness—and show how inconsistently they are used across the literature. They introduce a functionally-grounded benchmark framework that organizes these properties and links them to objective, user-independent metrics. Applying the framework to common methods like SHAP and LIME demonstrates how it enables consistent comparison of strengths and limitations. The authors argue that such standardized benchmarking is crucial for trustworthy, transparent, and scientifically rigorous XAI development.
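
"Functionally-grounded" evaluation means scoring explanations with objective, user-independent metrics. A minimal sketch of one such metric in the spirit of the faithfulness property above — the function name and the baseline-replacement scheme are illustrative assumptions, not the paper's exact benchmark:

```python
import numpy as np

def deletion_faithfulness(predict, x, attributions, baseline=0.0):
    """Deletion-style faithfulness: rank correlation between each feature's
    attribution and the output drop observed when that feature is replaced
    by a baseline value. Higher is more faithful."""
    drops = []
    for j in range(len(x)):
        x_del = x.copy()
        x_del[j] = baseline
        drops.append(predict(x) - predict(x_del))
    # Spearman-style correlation on the ranks of attributions vs. drops
    a_rank = np.argsort(np.argsort(attributions))
    d_rank = np.argsort(np.argsort(drops))
    return float(np.corrcoef(a_rank, d_rank)[0, 1])

# Linear toy model: the true per-feature effects are the weights themselves.
w = np.array([3.0, 1.0, 2.0])
predict = lambda x: float(x @ w)
x = np.array([1.0, 1.0, 1.0])
faithful_score = deletion_faithfulness(predict, x, w * x)
```

Running SHAP- or LIME-style attributions through a battery of such metrics (faithfulness, robustness, stability, completeness) is what enables the consistent comparison the authors argue for.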

Explainable AI, benchmarking, evaluation metrics, faithfulness, robustness, interpretability

LLMs for Explainable AI: A Comprehensive Survey

Bilal et al. (2025)

This survey provides an extensive overview of how large language models can advance Explainable AI (XAI). The authors address the challenge that many modern AI models—especially deep neural systems—remain opaque and difficult for users to understand. They outline three major roles for LLMs in explainability: generating post-hoc explanations for existing models, supporting intrinsically interpretable model design, and producing human-centered narrative explanations that make complex decisions accessible to non-experts. The paper reviews current evaluation methods, benchmarks, and real-world applications, emphasizing strengths such as natural-language reasoning and multimodal potential. It also highlights key obstacles, including fairness concerns, data privacy risks, and the difficulty of reliably evaluating LLM-generated explanations. The work underscores the need for improved assessment frameworks and more human-aligned, trustworthy explanation systems that integrate LLM capabilities effectively.

Explainable AI, Large Language Models, XAI techniques, narrative explanations, interpretability, evaluation methods

XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models

Mercorio et al. (2024)

This survey examines how Explainable AI (XAI) and Large Language Models (LLMs) intersect, focusing on the challenges of making increasingly powerful but opaque language models transparent and trustworthy. It distinguishes between research aimed at explaining LLMs themselves and work that uses LLMs as explanation tools. The paper highlights current gaps—such as limited model-specific explanation methods, weak evaluation standards, and the tension between performance and interpretability. It argues for human-centered, stakeholder-focused explanation strategies and calls for stronger open-source tools and benchmarks to reliably evaluate LLM explainability.

Explainable AI, LLM explainability, model transparency, stakeholder-aligned explanations, XAI methods

Dataset | Mindset = Explainable AI | Interpretable AI

Wu et al. (2024)

In this work the authors clarify the distinction between “interpretable AI (IAI)” and “explainable AI (XAI)”. They argue that IAI is broader—encompassing a mindset and design philosophy—while XAI focuses on post-hoc explanations tied to specific datasets and models. The paper highlights how IAI requires a priori-oriented reasoning (designing systems that are transparent by design), whereas XAI often handles outward reasoning (explaining decisions after they are made). The authors support their stance with empirical experiments on an open dataset, and stress how this differentiation matters for regulatory compliance in sectors like healthcare, HR, banking and finance. They propose a conceptual foundation linking XAI, IAI, ethical AI (EAI) and trustworthy AI (TAI), aiming to guide practitioners and policymakers toward more nuanced, principled AI transparency.

Explainable AI, Interpretable AI, mindset design, transparency, ethical AI, trustworthy AI