Blog: Framework for Explainability
This article argues for the need for a framework for explainability and presents some insights and advances.
Explainability is the science of interpreting what an algorithm did or might have done. The new GDPR regulation is widely read as demanding that automated predictions be accompanied by an explanation. Explanations exist to build trust in predictions and to make them more robust.
Despite the existing tools and techniques for explainability, we are still far from a general framework. Current tools have no formal measure or output format for explanations. As a result, it is impossible to state that model A is more interpretable than model B, or that explanation X is better than explanation Y.
Current tools are mostly based on post-hoc analysis (such as LIME or SHAP). The explanation is made post-mortem: the algorithm is trained and used to predict, and only afterwards is each prediction explained, independently of the others.
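To make the post-hoc idea concrete, here is a minimal sketch that explains a single prediction after the fact by computing exact Shapley values over all feature orderings. It is not how the LIME or SHAP libraries are implemented; `toy_model`, `instance` and `baseline` are hypothetical names invented for this illustration, and treating a feature as "absent" by replacing it with its baseline value is a simplifying assumption.

```python
from itertools import permutations


def toy_model(x):
    # Hypothetical scoring model with an interaction between features 0 and 2.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering of the features."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)          # start with every feature "absent"
        previous_output = model(current)
        for feature in order:
            current[feature] = instance[feature]   # switch this feature "on"
            new_output = model(current)
            totals[feature] += new_output - previous_output
            previous_output = new_output
    return [total / len(orderings) for total in totals]


instance = [1.0, 2.0, 4.0]   # the prediction we want to explain
baseline = [0.0, 0.0, 0.0]   # assumed reference point
phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [3.0, 2.0, 1.0]
```

The contributions add up to the difference between the prediction and the baseline output (6.0 here), and the interaction term is split evenly between features 0 and 2. Note that the model is never modified: the explanation is computed once training is over, which is exactly the post-hoc setting. Enumerating all orderings only works for a handful of features.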
Post-hoc analysis is not very flexible. Improving the system, eliminating bias, or forcing the algorithm to learn particular constraints set by the user is hard. These limitations motivate a Human-in-the-loop approach. In this strategy, the algorithm learns from live feedback: the model outputs an explanation of what it is learning, and the user corrects it toward the desired output. With this methodology, problems such as bias are controlled during the learning stage rather than afterwards, as in post-hoc analysis.
Some properties of a Human-in-the-loop approach for explainability are the following:
Causal Discovery: by iterating explanations between human and algorithm, the algorithm must converge on the desired output. The system suggests relationships in the data, and the user indicates whether they are correlations or causal relationships.
Ontology Alignment: determining correspondences between each dimension in the dataset and the concepts in ontologies might help causal discovery and lead to friendlier explanations.
Traceability: the algorithm is updated in each iteration; this information needs to be stored and be accessible to the user.
.. . . TBD
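The Causal Discovery and Traceability properties above can be sketched together as a small feedback log. This is only an illustration under my own assumptions: the `FeedbackRecord` and `TraceLog` classes, their fields and the example relationships are hypothetical, not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class FeedbackRecord:
    iteration: int
    source: str
    target: str
    label: str  # "causal" or "correlation", as decided by the user


@dataclass
class TraceLog:
    records: List[FeedbackRecord] = field(default_factory=list)

    def record(self, iteration: int, source: str, target: str, label: str) -> None:
        # Store the user's verdict on a relationship suggested by the system.
        if label not in ("causal", "correlation"):
            raise ValueError(f"unknown label: {label}")
        self.records.append(FeedbackRecord(iteration, source, target, label))

    def history(self, source: str, target: str) -> List[FeedbackRecord]:
        # Every verdict ever given for one relationship, oldest first.
        return [r for r in self.records if (r.source, r.target) == (source, target)]

    def current_causal_edges(self) -> Set[Tuple[str, str]]:
        # Later feedback overrides earlier feedback on the same relationship.
        latest = {}
        for r in self.records:
            latest[(r.source, r.target)] = r.label
        return {edge for edge, label in latest.items() if label == "causal"}


log = TraceLog()
# Iteration 1: the system suggests two relationships, the user accepts both as causal.
log.record(1, "humidity", "pest_count", "causal")
log.record(1, "price", "pest_count", "causal")
# Iteration 2: the user reconsiders and downgrades one of them to a correlation.
log.record(2, "price", "pest_count", "correlation")

print(log.current_causal_edges())            # {('humidity', 'pest_count')}
print(len(log.history("price", "pest_count")))  # 2
```

Nothing is ever overwritten: the current state is derived from the full history, so the user can always see when and why a relationship changed.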
Explanations are subjective and influenced by human knowledge and by the objective of the model. By injecting (prior) human knowledge into the system, we can measure “interpretability”. A confidence score can be estimated by measuring whether or not the explanation contains the relations traced by the user.
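One possible reading of such a confidence score is the fraction of user-traced relations that also appear in the explanation. The sketch below assumes exactly that; the function name `confidence_score` and the example relations are hypothetical, and other definitions (for instance, penalizing relations the user did not trace) are equally plausible.

```python
def confidence_score(explained_relations, user_relations):
    """Fraction of the relations traced by the user that the explanation contains."""
    user = set(user_relations)
    if not user:
        raise ValueError("the user must trace at least one relation")
    return len(user & set(explained_relations)) / len(user)


# Relations the user believes in (prior human knowledge).
user_relations = {
    ("humidity", "pests"),
    ("rainfall", "yield"),
    ("soil_ph", "yield"),
    ("temperature", "yield"),
}
# Relations that appear in the explanation produced by the system.
explained_relations = {
    ("humidity", "pests"),
    ("rainfall", "yield"),
    ("price", "yield"),
}

score = confidence_score(explained_relations, user_relations)
print(score)  # 0.5
```

With a number like this, two explanations of the same model can at least be compared against the same user's expectations.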
Explanations also need to be interpretable. Current techniques for explainability (Feature Interaction, Local Surrogate, Shapley Values …) give intuitions about, or weights for, the most influential variables in each prediction. A scoring system with 1000 individuals and 400 dimensions will produce a matrix of 1000×400 factors (for instance, using Shapley values). This is, of course, an explanation, but it is far from interpretable.
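A common first step toward something readable is to collapse that matrix into a short ranking, for example by the mean absolute attribution of each feature. The sketch below uses synthetic random numbers as a stand-in for real attributions; the feature names, the scales and the `top_features` helper are all assumptions made for the illustration.

```python
import random


def top_features(attributions, feature_names, k=5):
    """Rank features by mean absolute attribution over all individuals."""
    n_rows = len(attributions)
    mean_abs = [
        sum(abs(row[j]) for row in attributions) / n_rows
        for j in range(len(feature_names))
    ]
    ranked = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


random.seed(0)
n_individuals, n_features = 1000, 400
feature_names = [f"feature_{j}" for j in range(n_features)]

# Synthetic placeholder for a 1000x400 attribution matrix (NOT real Shapley values):
# the first three features are given much larger attributions than the rest.
scales = [5.0, 3.0, 2.0] + [0.1] * (n_features - 3)
attributions = [[random.gauss(0.0, s) for s in scales] for _ in range(n_individuals)]

summary = top_features(attributions, feature_names, k=3)
for name, value in summary:
    print(f"{name}: {value:.2f}")
```

Three lines instead of 400,000 numbers is easier to read, but it also throws away everything that is specific to one individual, so this is a summary rather than a full answer to the interpretability problem.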
Explanations must be made to assist the user. Furthermore, it is crucial to shape these explanations into a helpful output.
Imagine an ML model that forecasts the next harvest season. It can be used by several actors, such as an Investor, a Scientist and a Farmer, each with a different goal:
– Investor: Is it a good season to raise money on tomato harvests?
– Scientist: Is there any causal relationship between weather and harvest pests?
– Farmer: What vegetables should I plant to maximize profits?
Furthermore, the way the explanation is displayed differs: a report is enough for the farmer, an investor probably prefers a bar chart, and the scientist prefers an influence network.
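As a rough sketch of this idea, the same set of feature weights can be rendered in three different shapes depending on who is asking. Everything here is hypothetical: the `render_explanation` function, the weights and the text-based "bar chart" are only meant to show one explanation feeding several outputs.

```python
def render_explanation(weights, audience, target="harvest"):
    """Render the same feature weights differently for each kind of user."""
    # Most influential features first, regardless of sign.
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)

    if audience == "farmer":
        # Short plain-language report.
        return "\n".join(
            f"{name} {'raises' if w > 0 else 'lowers'} the {target} forecast."
            for name, w in ranked
        )
    if audience == "investor":
        # Text-based bar chart: one '#' per 0.1 of absolute weight.
        width = max(len(name) for name, _ in ranked)
        return "\n".join(
            f"{name.ljust(width)} | {'#' * round(abs(w) * 10)} {w:+.1f}"
            for name, w in ranked
        )
    if audience == "scientist":
        # Weighted edge list that a graph tool could draw as an influence network.
        return [(name, target, w) for name, w in ranked]
    raise ValueError(f"unknown audience: {audience}")


# Hypothetical weights produced by some explanation technique.
weights = {"rainfall": 0.6, "pests": -0.3, "temperature": 0.1}

print(render_explanation(weights, "farmer"))
print(render_explanation(weights, "investor"))
print(render_explanation(weights, "scientist"))
```

The explanation itself is computed once; only the last step, the presentation, changes with the audience.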
Don’t hesitate to contact me for further information :)