Column: Machine Learning Research Society
The Machine Learning Research Society is a student organization under Peking University's Innovation Center for Big Data and Machine Learning, aiming to build a platform where machine learning practitioners can exchange ideas. Besides sharing timely news from the field, the society hosts talks by industry and academic leaders, salon-style sharing sessions with noted researchers, real-data innovation competitions, and other events.

[Learning] Facebook open-sources the question answering system DrQA

Machine Learning Research Society · Official Account · AI · 2017-07-27 21:18

Main text





Summary
 

Reposted from: ruanyf

Facebook has open-sourced DrQA, a question answering system that automatically analyzes Wikipedia to answer users' questions.

This is a PyTorch implementation of the DrQA system described in the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions.


Machine Reading at Scale

DrQA is a system for reading comprehension applied to open-domain question answering. In particular, DrQA is targeted at the task of "machine reading at scale" (MRS). In this setting, we are searching for an answer to a question in a potentially very large corpus of unstructured documents (that may not be redundant). Thus the system has to combine the challenges of document retrieval (finding the relevant documents) with that of machine comprehension of text (identifying the answers from those documents).
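The two-stage design described above — TF-IDF-style document retrieval followed by a reading-comprehension step over the retrieved text — can be illustrated with a minimal, self-contained sketch. This is not DrQA's actual code (DrQA uses a bigram-hashing TF-IDF retriever and a neural reader); the toy "reader" here just picks the sentence with the most question-word overlap, purely to show how the two stages compose:

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(docs):
    """Build simple TF-IDF vectors for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: tf[t] * idf[t] for t in tf}
               for tf in (Counter(toks) for toks in tokenized)]
    return vectors, idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(question, docs, k=2):
    """Stage 1: rank documents by TF-IDF similarity to the question."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(question))
    q_vec = {t: q_tf[t] * idf.get(t, 0.0) for t in q_tf}
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q_vec, vectors[i]), reverse=True)
    return ranked[:k]

def read(question, doc):
    """Stage 2 (toy reader): return the sentence sharing the most words
    with the question, standing in for machine comprehension."""
    q_toks = set(tokenize(question))
    sentences = re.split(r"(?<=[.!?])\s+", doc)
    return max(sentences, key=lambda s: len(q_toks & set(tokenize(s))))

docs = [
    "Paris is the capital of France. It lies on the Seine river.",
    "The Amazon river is the largest river by discharge volume of water.",
    "Mount Everest is Earth's highest mountain above sea level.",
]
question = "What is the capital of France?"
top = retrieve(question, docs, k=1)
answer_sentence = read(question, docs[top[0]])
print(answer_sentence)  # -> Paris is the capital of France.
```

Retrieval narrows the corpus to a handful of candidate documents; the reader then only has to search inside those candidates, which is what makes the approach tractable over millions of articles.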

Our experiments with DrQA focus on answering factoid questions while using Wikipedia as the unique knowledge source for documents. Wikipedia is a well-suited source of large-scale, rich, detailed information. In order to answer any question, one must first retrieve the few potentially relevant articles among more than 5 million, and then scan them carefully to identify the answer.

Note that DrQA treats Wikipedia as a generic collection of articles and does not rely on its internal graph structure. As a result, DrQA can be straightforwardly applied to any collection of documents, as described in the retriever README.

This repository includes code, data, and pre-trained models for processing and querying Wikipedia as described in the paper -- see Trained Models and Data. We also list several different datasets for evaluation, see QA Datasets. Note that this work is a refactored and more efficient version of the original code. Reproduction numbers are very similar but not exact.


Link:

https://github.com/facebookresearch/DrQA


Original post:

https://m.weibo.cn/1400854834/4134162606425092
