The point of a World AI Summit, you would think, is to show how cool AI is, how far many techniques have come, and how astonishing some of its applications are. This holds especially now that we are experiencing an AI revolution, in which more and more AI is integrated into our everyday lives and tasks that seemed impossible before (beating humans at Go, solving many computer vision tasks) are being solved with impressive performance.
However, it is also in these times that we need a devil’s advocate who is not afraid to look at all this hype from a different perspective. Someone who puts a brake on the current ‘hype train’ and invites us to reflect on which direction we are heading in. At the AI Summit there was not just one, but several such advocates. Gary Marcus, author of the book ‘Rebooting AI: Building Artificial Intelligence We Can Trust’, argued that we should not put all our trust in currently popular deep learning methods. As he puts it, “deep learning is not a substitute for deep understanding”: we should start building more models with transparent decision processes and a more logical way of reasoning. Similarly, tech company Accenture started their talk with a video about the importance of Explainable and Ethical AI, highlighting that we cannot deploy AI systems if trust in them is lacking. Cassie Kozyrkov, Chief Decision Scientist at Google, warned against using Machine Learning algorithms when we do not fully understand the data they are trained on or when that data is potentially biased.
By the end of the summit it had become clear to everyone: yes, we need to rethink AI, and we definitely need more ethical and more transparent AI systems. Systems that we can understand and trust, and that are different from the black boxes we are using at the moment. However, one question might have been left unanswered: how do we actually get there? It is probably so hard to answer because there is no clear answer yet, and research in this area has only just started to bloom.
Nevertheless, there is no harm in at least scratching the surface of the current literature to understand what a more transparent and understandable (and therefore potentially more ethical) AI system could look like. So et voilà, here are some examples of where the research fields of fair Machine Learning and explainable AI are currently heading:
Feature Importance Values:
Say we have trained a Machine Learning model on a set of résumés, each labelled with the decision whether the applicant was hired or not. The task of the model is now to predict, for the résumé of a new applicant, whether they will get the job. Given the nature of the task and the potentially sensitive data the model has been trained on (like a person’s sex or nationality), it is desirable that the model gives not only an output, but also an explanation to go along with it.
Techniques that explain the output for a single input instance are so-called local explanation techniques. One way of providing local explanations is through feature importance values, obtained with algorithms like LIME or SHAP. As the name implies, this method assigns an importance value to each feature of an input, reflecting how important that feature was in the decision-making process. Looking at these values, it should then also be possible to assess whether a certain decision was fair or not. In the case of a recruitment system, we would probably aim for a model that assigns very low importance values to factors like a person’s gender, but high importance values to features relating to a person’s education or skill set.
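To make the idea of feature importance values a bit more concrete, here is a minimal sketch in plain Python. It computes exact Shapley values, the game-theoretic quantity that SHAP builds on, by brute force for a tiny toy model. Everything in it is an assumption made up for illustration: the feature names, the scoring rule in toy_hiring_model, the example applicant and the baseline applicant. It is not the API of the SHAP or LIME packages, and brute-force enumeration only works for a handful of features.

```python
from itertools import combinations
from math import factorial


def toy_hiring_model(x):
    """Hypothetical scoring model (made up for illustration). Ignores gender."""
    score = 0.05 * x["years_experience"] + 0.2 * x["has_degree"] + 0.03 * x["skill_score"]
    # A small interaction: a degree combined with strong skills adds a bonus.
    if x["has_degree"] and x["skill_score"] >= 7:
        score += 0.1
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input.

    Features that are 'absent' from a coalition take their baseline value.
    """
    names = list(x)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (model(with_feature) - model(without))
        values[name] = total
    return values


# Hypothetical applicant and hypothetical "average" baseline applicant.
applicant = {"years_experience": 5, "has_degree": 1, "skill_score": 8, "gender": 1}
baseline = {"years_experience": 2, "has_degree": 0, "skill_score": 5, "gender": 0}

importance = shapley_values(toy_hiring_model, applicant, baseline)
for feature, value in sorted(importance.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature:>18}: {value:+.3f}")
```

Because the toy model never looks at gender, that feature receives an importance of zero, which is the kind of check we would want to run on a recruitment model. The values of all features add up to the difference between the model’s score for the applicant and its score for the baseline.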
Methods like LIME and SHAP can even be used on image data. The figure below shows a famous example in which the pixels that contribute most to each possible label of an image are highlighted.
Those interested in finding out more about feature importance measures can take a look at the papers in which the methods were introduced [1, 2]. Both methods are also implemented in Python packages and are easy to play around with.
Counterfactuals:
When trying to make AI systems explain their decision processes, it can be beneficial to look at the way humans explain decisions. An explainable AI method that aligns well with the way humans reason about their behavior is the use of so-called counterfactuals. Counterfactuals are another type of local explanation: they show which input features would need to change in order for the model’s output to change as well. For the recruitment system, a counterfactual might show that an applicant would not have been hired if they had had one year less of work experience. Again, the fairness of the model can be inspected by reasoning about the nature of the counterfactuals: a model is hardly fair if a counterfactual shows that a person would not be hired if their nationality were Polish rather than Dutch.
To gain more insight into counterfactuals, you can take a look at e.g. [3]. Again, multiple methods have been implemented in Python and are definitely worth checking out!
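As a rough illustration of what a counterfactual explanation looks like, the sketch below searches, one feature at a time, for the closest value that flips the decision of a toy hiring model. The model toy_predict_hired, its weights and threshold, the applicant and the grid of candidate values are all hypothetical and made up for this example. Real counterfactual methods are more sophisticated (they change several features at once and try to keep the result plausible), so treat this as a sketch of the idea rather than of any particular package.

```python
def toy_predict_hired(applicant):
    """Hypothetical decision rule (made up for illustration): hire if score >= 60."""
    score = (
        5 * applicant["years_experience"]
        + 20 * applicant["has_degree"]
        + 3 * applicant["skill_score"]
    )
    return score >= 60


# Hypothetical grid of values that each feature is allowed to take.
CANDIDATE_VALUES = {
    "years_experience": range(0, 11),
    "has_degree": (0, 1),
    "skill_score": range(0, 11),
}


def single_feature_counterfactuals(predict, applicant, candidate_values):
    """For each feature, find the closest value that flips the prediction.

    Returns a list of (feature, original_value, counterfactual_value) tuples.
    Features for which no candidate value flips the decision are left out.
    """
    original_decision = predict(applicant)
    counterfactuals = []
    for feature, values in candidate_values.items():
        flipping = [
            v
            for v in values
            if v != applicant[feature]
            and predict({**applicant, feature: v}) != original_decision
        ]
        if flipping:
            closest = min(flipping, key=lambda v: abs(v - applicant[feature]))
            counterfactuals.append((feature, applicant[feature], closest))
    return counterfactuals


applicant = {"years_experience": 5, "has_degree": 1, "skill_score": 6}
decision = toy_predict_hired(applicant)
counterfactuals = single_feature_counterfactuals(
    toy_predict_hired, applicant, CANDIDATE_VALUES
)

print("hired" if decision else "not hired")
for feature, old, new in counterfactuals:
    print(f"decision flips if {feature} were {new} instead of {old}")
```

For this made-up applicant, one of the counterfactuals found is exactly the kind described above: with four instead of five years of experience, the toy model would no longer hire them. If a sensitive attribute such as nationality were among the features and showed up in this list, that would be a red flag for the model’s fairness.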
Illusion of Control
So, can feature importance values or counterfactuals be solutions to the problems described during the many World AI Summit talks? Possibly, yes, but we need to remain cautious. John Danaher, author of Automation and Utopia: Human Flourishing in a World Without Work, warned during the Summit about the illusion of control. If we just blindly trust the explanations provided for ML algorithms, we are not doing any better than we are doing now. After all, it is again machines that do the work. Do we know for sure that these techniques accurately explain the workings of the model? Are they enough to let us break out of the black box completely? Or are the explanation algorithms just new black boxes themselves? And who are the people who inspect the explanations and judge a model’s fairness based on them?
Again, these questions show that we need to keep the discussion about a new route for AI going. Only if we manage to solve these issues can we lean back a bit and enjoy the hype AI brings.
References
[1] Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. “Why should I trust you?: Explaining the predictions of any classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144. ACM, 2016.
[2] Lundberg, Scott M., and Su-In Lee. “A unified approach to interpreting model predictions.” In Advances in Neural Information Processing Systems, pp. 4765-4774. 2017.
[3] Byrne, R. “Counterfactuals in explainable artificial intelligence (XAI): evidence from human reasoning.” In International Joint Conference on AI (IJCAI 2019). 2019.