explainability-A

This is one of two notes about "explainability"; see also explainability-B

Definition of Explainability

The term explainability is used in the field of artificial intelligence to indicate how well humans can understand an AI-based model. The explainability of a model refers to the degree to which a human can follow the steps that occur within the model as it moves from input to output. An explainable model is one that allows humans to understand how it arrived at a certain outcome; explainability should thus be seen as a measure of how well humans can understand the steps an AI model takes to arrive at that outcome.

A concept that is closely related to explainability is transparency. Whereas explainability focuses on the degree to which a model's steps (from input to output) can be understood by humans, transparency concerns a more holistic understanding of an AI-based model: it also covers the model's development (what data is it trained on?), its deployment (where is it deployed?), and its use cases (when is it used, and for what purpose?). In short, the main difference between the two concepts is that explainability focuses solely on what occurs 'within' the model, while transparency concerns openness about the model's use cases, inner workings, and the data it requires.

Another concept that is strongly related to explainability is opaqueness. Opaque models are models that are not explainable and often contain a black-box component, meaning that their inner workings are inherently impossible to unveil because they are too complex for the human brain to follow. Outside the field of AI, 'opaque' describes something that cannot be seen through or is hard to understand; in the field of AI, the same definition applies, specifically to the inner workings of machine learning models. The difference between the two concepts is that explainability is a measure of how explainable a model is, which can go both ways, whereas opaqueness refers specifically to the non-explainability of a model.

Implications of commitment to explainability

Building explainable AI models is desirable because it preserves the individual's ability to understand decisions that affect them. However, recent advancements in the field of machine learning have shown that more complex models (often with a deep-learning architecture) tend to achieve better quantitative performance than simpler models. Due to their complex architecture, these models are inherently less explainable and more opaque than simpler ones. This has led to the explainability-performance trade-off: to maximize performance, one must sacrifice explainability, and vice versa. It is therefore of great importance not to focus solely on the quantitative performance of a machine learning model (e.g. its accuracy) but also on qualitative metrics (e.g. explainability). If this is not done, the automated decision-making processes in our society will cease to be understandable, and individuals will not be able to understand decisions that affect them. To conclude, what is at stake here is the individual's ability to understand why decisions that personally affect them have been made. The key requirement for preventing this situation is to evaluate models' performance on a combination of quantitative and qualitative metrics.
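To make the notion of an 'explanation' concrete, a minimal sketch in Python (the loan-scoring scenario, feature names, and weights are all hypothetical): a linear model's output decomposes into per-feature contributions that can be shown to the affected individual, whereas a deep network's output admits no such direct decomposition.

```python
def linear_model(features, weights, bias=0.0):
    """An explainable model: the score is a sum of per-feature contributions,
    so every prediction comes with its own explanation."""
    contributions = {name: features[name] * w for name, w in weights.items()}
    return sum(contributions.values()) + bias, contributions

# Hypothetical loan-scoring example: which features pushed the score up or down?
weights = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}
applicant = {"income": 4.0, "debt": 2.0, "years_employed": 5.0}

score, explanation = linear_model(applicant, weights)
print(f"score = {score:.1f}")
for feature, contribution in sorted(explanation.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {feature}: {contribution:+.1f}")
```

A deep network applied to the same inputs might score applicants more accurately, but its output is a composition of thousands of nonlinear operations, and no per-feature breakdown of this kind falls directly out of the model; that gap is the trade-off described above.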

Societal transformations required to address the concerns raised by explainability

One of the most important changes required to maintain control over, and insight into, the workings of machine learning models is to complement quantitative performance evaluation with qualitative performance evaluation. A large part of academic research on machine learning models focuses on optimizing and increasing the accuracy (or closely related metrics) of predictive models. A shift towards qualitative performance measures would result in more explainable models being used. This applies both to academia, where research focuses on improving models' quantitative performance, and to the private sector, where the focus on maximizing profits often goes hand in hand with using the most accurate predictive models.