Deep learning methods have taken Artificial Intelligence by storm. As their dominance grows, one inconvenient truth is slowly emerging: we don’t actually understand these complex models very well. Our lack of understanding leads to some uncomfortable questions. Do we want to travel in self-driving cars whose inner workings no one really comprehends? Can we base important decisions in business or healthcare on models whose reasoning we don’t grasp? It’s problematic, to say the least.
In line with the general evolution of AI, applications of deep learning in Natural Language Processing suffer from this lack of understanding as well. The problem already starts with the word embeddings that often form the first hidden layer of the network: we typically don’t know what their dimensions stand for. The more hidden layers a model has, the more serious this problem becomes. How do neural networks combine the meanings of individual words? How do they bring together information from different parts of a sentence or text? And how do they use all these sources of information to arrive at their final decision? The complex network architectures often leave us groping in the dark. Luckily, more and more researchers are starting to address exactly these questions, and a number of recent articles have put forward methods for improving our understanding of the decisions deep learning models make.
Of course, neural networks aren’t the only machine learning models that are difficult to comprehend. Explaining a complex decision by a Support Vector Machine (SVM), for example, is not exactly child’s play. For precisely this reason, Ribeiro, Singh and Guestrin have developed LIME, an explanatory technique that can be applied to a variety of machine learning methods. In explaining how a classifier has come to a given decision, LIME aims to combine interpretability with local fidelity. Its explanations are interpretable because they account for the model’s decision with a small number of individual words that influenced it. They are locally faithful because they correspond well to how the model behaves in the vicinity of the instance under investigation. This is achieved by basing the explanation on sampled perturbed instances that are weighted by their similarity to the target example.
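To make that mechanism concrete, here is a minimal sketch of the LIME idea for a text classifier: perturb the input by dropping words, weight each perturbed sample by its similarity to the original, and fit a weighted linear surrogate whose largest coefficients point to the most influential words. The `predict_proba` argument (a black-box function mapping a list of strings to class probabilities), the Ridge surrogate and the exponential kernel are illustrative choices of mine under stated assumptions, not the authors’ implementation, which is available as the open-source `lime` package.

```python
import numpy as np
from sklearn.linear_model import Ridge


def explain_text(text, predict_proba, num_samples=1000, num_features=6,
                 kernel_width=0.75, seed=0):
    """Explain one prediction with a locally weighted linear surrogate,
    fitted on perturbed copies of the input in which words are dropped.

    `predict_proba` is assumed to take a list of strings and return an
    array of class probabilities, one row per string."""
    rng = np.random.default_rng(seed)
    words = text.split()
    n = len(words)

    # Binary masks: 1 keeps a word, 0 drops it; the first row is the original text.
    masks = rng.integers(0, 2, size=(num_samples, n))
    masks[0, :] = 1

    # Build the perturbed texts and query the black-box classifier.
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    probs = np.asarray(predict_proba(perturbed))   # shape: (num_samples, n_classes)
    target = int(np.argmax(probs[0]))              # explain the predicted class

    # Weight each sample by its similarity to the original:
    # the more words were dropped, the smaller its weight.
    distance = 1.0 - masks.sum(axis=1) / n
    weights = np.exp(-(distance ** 2) / kernel_width ** 2)

    # Fit an interpretable linear model on the masks, locally weighted.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, probs[:, target], sample_weight=weights)

    # The words with the largest absolute coefficients explain the decision.
    ranked = sorted(zip(words, surrogate.coef_), key=lambda p: abs(p[1]), reverse=True)
    return ranked[:num_features]
```

In practice you would pass in something like the `predict_proba` method of a scikit-learn text pipeline; the returned list pairs each influential word with the sign and size of its estimated effect on the predicted class.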
Link:
http://nlp.yvespeirsman.be/blog/understanding-deeplearning-models-nlp/
Original link:
http://weibo.com/1402400261/EsIuGAwNx?from=page_1005051402400261_profile&wvr=6&mod=weibotime&type=comment#_rnd1485526126507