Investigating Methods for Model-Agnostic Explainability of Machine Learning Algorithms


Sophisticated machine learning (ML) models such as deep neural networks have proven to be highly accurate in many applications. However, their complexity effectively turns them into black boxes: usually, no information is provided about what exactly leads to their predictions or which decision rules were applied to produce a certain output. High prediction accuracy is often achieved by such complex models, whereas high explainability is more readily achieved with glass-box models such as decision trees or linear regression. The result is a tension between interpretability and accuracy. To resolve this tension, the decision-making process of black-box models must become more comprehensible to human users. This can be achieved by investigating the transparency, trust, interpretability, confirmability and explainability of the models.

The field concerned with making AI decisions more interpretable is known as Explainable Artificial Intelligence (XAI) and can be split into several subdomains. Firstly, there is a distinction between model-specific and model-agnostic approaches. In contrast to model-agnostic approaches, model-specific explainability can only be applied to one specific type of ML model. Secondly, there is a distinction between local and global explainability. Local explainability aims at investigating individual model predictions, whereas global explainability tries to explain the influence of certain features on all predictions. Thirdly, a distinction can be made between attribute interaction analysis and single attribute analysis. Attribute interaction analysis is related to conditional probabilities and to the analysis of attribute combinations that lead to a certain model output. Single attribute analysis, accordingly, investigates a single attribute and its influence on a model’s predictions.
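The difference between local and global explainability can be illustrated with a minimal, model-agnostic sketch in plain Python. It assumes only that the black-box model exposes a prediction function. The names `black_box`, `ice_curve` and `partial_dependence`, as well as the toy data, are hypothetical and chosen purely for illustration. An ICE curve varies one feature for a single instance (local view); averaging the ICE curves over all instances gives a partial dependence curve (global view).

```python
from typing import Callable, List, Sequence

Row = List[float]
PredictFn = Callable[[Row], float]


def ice_curve(predict: PredictFn, row: Sequence[float],
              feature: int, grid: Sequence[float]) -> List[float]:
    """Local view: predictions for one instance while one feature is varied."""
    curve = []
    for value in grid:
        modified = list(row)       # copy so the original instance stays intact
        modified[feature] = value  # overwrite only the feature of interest
        curve.append(predict(modified))
    return curve


def partial_dependence(predict: PredictFn, data: Sequence[Sequence[float]],
                       feature: int, grid: Sequence[float]) -> List[float]:
    """Global view: point-wise average of the ICE curves of all instances."""
    curves = [ice_curve(predict, row, feature, grid) for row in data]
    return [sum(c[i] for c in curves) / len(curves) for i in range(len(grid))]


def black_box(row: Row) -> float:
    """Hypothetical stand-in for an opaque model; only its outputs are used."""
    return row[0] * row[1] + 2.0 * row[2]


data = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [2.0, 3.0, 1.0]]
grid = [0.0, 1.0, 2.0]

local_curve = ice_curve(black_box, data[0], feature=0, grid=grid)
global_curve = partial_dependence(black_box, data, feature=0, grid=grid)
print("ICE (first instance):", local_curve)
print("PDP (all instances): ", global_curve)
```

In this toy setting the first two ICE curves have opposite slopes, so the averaged curve alone would hide how differently individual instances react to the feature.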

The need for XAI may stem from legal or social reasons, or from the desire to improve the acceptance and adoption of machine learning models. On the legal side, XAI recently received a push from the enactment of the General Data Protection Regulation (GDPR) of the European Union. On the social side, XAI can help to detect discrimination caused by ML algorithms. The audience of XAI therefore consists, on the one hand, mainly of data scientists and process owners who want to assess the quality of an ML model during development and/or release processes. On the other hand, XAI has a wide range of applications in explaining decisions to people with a non-technical background, such as managers or customer advisors.



The aim of this master's thesis is twofold. First, the domain of model-agnostic explainability is investigated by means of a structured and critical literature review. Second, a practical case study will be conducted in which the different approaches are applied to an exemplary real-world ML model and data set, both provided by a client of Viadee AG.

The case study includes an analysis of the applicability and usefulness of the explainability approaches identified in the literature review, as well as an exemplary presentation in which XAI is embedded into a process scenario of a development, test and release cycle. The XAI approaches to be investigated in this thesis include, among others:

  • Local Interpretable Model-agnostic Explanations (LIME) [RSG16],
  • Anchors [RSG18],
  • Partial Dependence Plots (PDP) [Fri01],
  • Individual Conditional Expectation (ICE) Plots [Gol+13],
  • Shapley Explanations [Sha53; SK14],
  • Feature Importance [FRD18], and
  • Feature Interactions [FP08].
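To give an impression of how such an approach attributes a single prediction to individual features, the following is a minimal sketch of exact Shapley values for a small model. It rests on a simplifying assumption that is not taken from the cited papers: a feature that is "absent" from a coalition is replaced by a fixed baseline value. The function `shapley_values`, the toy model `black_box` and the chosen baseline are hypothetical. The exact enumeration is only feasible for a handful of features, which is why practical methods rely on approximation.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(predict: Callable[[List[float]], float],
                   instance: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values by enumerating all feature coalitions.

    Assumption: features outside a coalition take their baseline value.
    """
    n = len(instance)

    def value(coalition: set) -> float:
        # Model output when only the coalition's features are "present".
        row = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(row)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when joining coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def black_box(row: List[float]) -> float:
    """Hypothetical stand-in for an opaque model with one feature interaction."""
    return row[0] * row[1] + 2.0 * row[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(black_box, instance, baseline)
print("Shapley values:", phi)
print("Sum of values: ", sum(phi))
print("f(x) - f(base):", black_box(instance) - black_box(baseline))
```

In this toy example the contribution of the interaction term is split equally between the two interacting features, and the values add up to the difference between the prediction for the instance and the prediction for the baseline.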


Research approach:


In a deductive research approach, the different model-agnostic XAI approaches will be analyzed and compared. To this end, several quality attributes such as feasibility, implementation and visualization will be investigated. To further compare the XAI approaches with regard to practical usability and prediction explainability, a qualitative empirical study with sample XAI clients will be performed.




  • [FRD18] Aaron Fisher, Cynthia Rudin and Francesca Dominici (2018). Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the "Rashomon" Perspective.

    In: arXiv preprint.

  • [Fri01] Jerome H. Friedman (2001). Greedy Function Approximation: A Gradient Boosting Machine.

    In: Annals of Statistics, Volume 29, No. 5, pp. 1189–1232, Institute of Mathematical Statistics.

    DOI: 10.1214/aos/1013203451.
  • [FP08] Jerome H. Friedman and Bogdan E. Popescu (2008). Predictive Learning via Rule Ensembles.

    In: Annals of Applied Statistics, Volume 2, No. 3, pp. 916–954, Institute of Mathematical Statistics.

    DOI: 10.1214/07-AOAS148.
  • [Gol+13] Alex Goldstein, Adam Kapelner, Justin Bleich and Emil Pitkin (2013). Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation.

    In: Journal of Computational and Graphical Statistics, Volume 24, No. 1, pp. 44–65, Taylor & Francis Group.

    DOI: 10.1080/10618600.2014.907095.
  • [RSG16] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier.

    In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16),

    New York, NY, USA, pp. 1135–1144, ACM.

    DOI: 10.1145/2939672.2939778.
  • [RSG18] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin (2018). Anchors: High-Precision Model-Agnostic Explanations.

    In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence,

    New Orleans, Louisiana, USA, pp. 1527–1535, AAAI Press.

  • [Sha53] Lloyd S. Shapley (1953). A Value for n-Person Games.

    In: Contributions to the Theory of Games (AM-28), Volume 2, pp. 307–317, Princeton: Princeton University Press.

    DOI: 10.1515/9781400881970-018.
  • [SK14] Erik Štrumbelj and Igor Kononenko (2014). Explaining Prediction Models and Individual Predictions with Feature Contributions.

    In: Knowledge and Information Systems, Volume 41, No. 3, pp. 647–665, Springer.

    DOI: 10.1007/s10115-013-0679-x.