- Title: Explaining the Predictions of Image Classifiers
- Creator: Yang, Ruo
- Date: 2024
- Description:
With the deployment of deep neural network (DNN) models for safety-critical applications such as autonomous driving and medical diagnosis, explaining the decisions of DNNs has become a critical concern. For humans to trust a DNN's decision, not only must the model perform well on the specified task, it must also generate explanations that are easy to interpret. A significant amount of research investigates the contribution of each feature in a given instance to the model's prediction, where these contributions constitute the explanation for the model's decision. In the computer vision domain specifically, explanation methods often produce a saliency map that indicates how important each pixel of the input image is to the DNN's prediction. I propose explanation approaches that generate saliency maps representing pixel importance more accurately, and I evaluate models' decision-making reasoning from a human perspective.

First, I investigate the source of the noise generated by a well-known explanation method, Integrated Gradients (IG), and its variants. Specifically, I propose the Important Direction Gradient Integration (IDGI) framework, which can be incorporated into all IG-based explanation methods to reduce the noise in their outputs. Additionally, I propose a novel measurement for assessing the quality of attribution techniques: variants of the Accuracy Information Curve (AIC) and the Softmax Information Curve (SIC) based on the Multi-scale Structural Similarity Index Measure (MS-SSIM). We show that this metric offers a more precise measurement than the original AIC and SIC. Extensive experiments show that IDGI drastically improves the quality of the saliency maps generated by the underlying IG-based approaches.

Second, I introduce Information Propagation (IProp), a novel explanation method that leverages the local structural relationships among pixels. Specifically, IProp treats each pixel as a source of information in the saliency map and formulates the model explanation as information propagation among pixels. Hence, IProp constructs the saliency map by jointly considering all pixels' contributions to the prediction. I prove that IProp is guaranteed to converge to a unique solution and is compatible with any existing explanation method. Extensive evaluations show the advantage of applying IProp to existing explanation methods.

In the final chapter, I present a methodology for generating explanations that are meaningful from a human perspective and for evaluating whether the model's rationale agrees with human reasoning. We propose a new framework for comparing how models and humans make decisions, along with a novel evaluation metric that measures a model's misalignment with the human decision-making process. We show empirically that complex models exhibit more misalignment with humans than simpler models.
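For context on the family of methods the first contribution builds on, the sketch below shows the standard Integrated Gradients attribution (Sundararajan et al., 2017), which produces the kind of saliency map discussed in the abstract. It is not the dissertation's IDGI framework; the function and parameter names are illustrative, and a PyTorch classifier returning logits is assumed.

```python
# Minimal sketch of standard Integrated Gradients (not the proposed IDGI method).
# Assumes `model` is a differentiable PyTorch classifier returning logits.
import torch

def integrated_gradients(model, x, baseline, target_class, steps=50):
    """Approximate IG attributions for a single input image.

    x:            input image tensor of shape (1, C, H, W)
    baseline:     reference image of the same shape (often all zeros)
    target_class: index of the class whose score is being explained
    steps:        number of interpolation points for the Riemann approximation
    """
    # Interpolate along a straight path from the baseline to the input.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1, 1, 1)
    path = (baseline + alphas * (x - baseline)).detach().requires_grad_(True)

    # Gradient of the target logit with respect to every interpolated image.
    logits = model(path)[:, target_class]
    grads = torch.autograd.grad(logits.sum(), path)[0]   # (steps, C, H, W)

    # Average the gradients along the path and scale by the input difference;
    # the result is a saliency map with the same shape as the input.
    avg_grads = grads.mean(dim=0, keepdim=True)
    return (x - baseline) * avg_grads
```

The noise the abstract refers to arises in maps produced this way by IG and its variants; IDGI is proposed as a drop-in refinement of such IG-based methods.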