“Liar, Liar Pants on Fire”:
A New Benchmark Dataset for Fake News Detection
William Yang Wang
Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106 USA [email protected]
Abstract
Automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. However, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. In this paper, we present LIAR: a new, publicly available dataset for fake news detection. We collected 12.8K manually labeled short statements, spanning a decade and a variety of contexts, from POLITIFACT.COM, which provides a detailed analysis report and links to source documents for each case. This dataset can be used for fact-checking research as well. Notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. Empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. We have designed a novel, hybrid convolutional neural network to integrate metadata with text. We show that this hybrid approach can improve on a text-only deep learning model.
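To make the idea of integrating metadata with text concrete, the following is a minimal sketch of one way such a hybrid forward pass could look. It is not the architecture from the paper: the layer sizes, the random weights, the six-way output, the example inputs, and all function names (`conv_max_pool`, `hybrid_forward`, and so on) are assumptions chosen for illustration. The text is encoded by a 1-D convolution with max-over-time pooling, the result is concatenated with a metadata feature vector, and a linear layer with softmax produces class probabilities.

```python
import math
import random

random.seed(0)

# All sizes below are hypothetical, chosen only to keep the sketch small.
VOCAB, DIM, WIDTH, N_FILTERS, N_META, N_CLASSES = 50, 8, 3, 4, 5, 6


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these by training."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv_max_pool(embeddings, filters, width):
    """1-D convolution over word windows, ReLU, then max-over-time pooling."""
    pooled = []
    for f in filters:  # each filter holds width * DIM weights
        best = 0.0  # starting at 0.0 applies the ReLU floor
        for start in range(len(embeddings) - width + 1):
            window = [x for vec in embeddings[start:start + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        pooled.append(best)
    return pooled


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_forward(token_ids, meta_features, params):
    """Encode the text, concatenate the metadata vector, then classify."""
    embeddings = [params["embed"][t] for t in token_ids]
    text_vec = conv_max_pool(embeddings, params["filters"], params["width"])
    joint = text_vec + meta_features  # list concatenation: text + metadata
    scores = [sum(w * x for w, x in zip(row, joint)) for row in params["out"]]
    return softmax(scores)


params = {
    "embed": rand_matrix(VOCAB, DIM),
    "filters": rand_matrix(N_FILTERS, WIDTH * DIM),
    "width": WIDTH,
    "out": rand_matrix(N_CLASSES, N_FILTERS + N_META),
}

token_ids = [3, 17, 42, 8, 25]       # hypothetical word indices for a statement
meta = [1.0, 0.0, 0.0, 1.0, 0.0]     # hypothetical encoded metadata features
probs = hybrid_forward(token_ids, meta, params)
print(probs)
```

Because the weights are random and untrained, the printed probabilities carry no meaning; the sketch only shows where the metadata enters the computation. A text-only baseline in this setup would correspond to dropping `meta_features` from the concatenation and shrinking the output layer accordingly.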
Short-text fake news detection dataset and paper from ACL 2017
Link:
https://www.cs.ucsb.edu/~william/papers/acl2017.pdf