
Does "Regression Analysis" Really Count as "Machine Learning"?

数盟  · Official account  · Big data  · 2017-06-23 22:02


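Over the past few decades, the way we deal with information has changed fundamentally. Access to information is no longer the bottleneck; the real difficulty is digesting the sheer volume of it. Everyone knows the feeling: we have to read more and more to keep up with work, the news, and social media. To address this challenge, we have been studying how AI can help people work better amid the flood of information, and one potential answer is to use algorithms to automatically summarize long texts.

Training a model that can produce long, coherent, and meaningful summaries, however, remains an open research problem. Even for the most advanced deep learning algorithms, generating long text is still hard. To make summarization work, we introduce two independent improvements: a more contextual word generation model and a new way of training summarization models with reinforcement learning (RL).

Combining the two training methods lets the overall system turn long texts such as news articles into relevant, highly readable multi-sentence summaries, with results well ahead of earlier approaches. Our algorithm can be trained on different types of text and different summary lengths. In this post we describe the main contributions of the model and outline the challenges of natural-language text summarization.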

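Extractive and abstractive summarization

Automatic summarization models work in one of two ways: extraction or abstraction. Extractive models "copy and paste": they select relevant phrases from the input document and concatenate them into a summary. Because they reuse natural language taken directly from the document, they are quite robust; on the other hand, they cannot use new words or connectives, so they lack flexibility, and the result sometimes differs from how a person would phrase it. Abstractive models instead generate the summary from an "abstracted" representation of the content and may use words that never appear in the input document. This lets them produce more fluent and coherent text, but they are harder to build, because the model must be able to generate coherent phrases and connectives on its own.

Although abstractive models are more powerful in theory, they often make mistakes in practice. Typical errors include incoherent, irrelevant, or repeated phrases in the generated summary, and these problems become more pronounced for longer outputs. Such summaries also tend to lack overall coherence, flow, and readability. Solving these problems requires a more robust and more coherent abstractive summarization model.

To explain our new abstractive model, we first define its basic building blocks and then describe the new training method.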

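Reading and generating text with encoder-decoder models

Recurrent neural networks (RNNs) are a class of deep learning models that can process variable-length sequences (such as text) and compute a useful representation (the hidden state) step by step. The network processes the elements of the sequence one at a time (here, one word at a time); for each new input it computes a new hidden state as a function of that input and the previous hidden state. As a result, the hidden state computed at each word is a function of all the words read up to that point.

Figure 2: A recurrent neural network reads an input sentence by applying the same function (green box) to each word.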
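
The reading loop can be sketched in a few lines of NumPy. This is only an illustration, not the code behind the post: the tanh update, the weight names `W_x`, `W_h`, `b`, and the random stand-in word vectors are all assumptions made for the example.

```python
import numpy as np

def rnn_read(embeddings, W_x, W_h, b):
    """Apply the same recurrent function to every word, left to right."""
    hidden = np.zeros(W_h.shape[0])
    states = []
    for x in embeddings:
        # The new state depends on the current word and the previous state.
        hidden = np.tanh(W_x @ x + W_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
embed_dim, hidden_dim, num_words = 4, 3, 5
sentence = rng.normal(size=(num_words, embed_dim))  # hypothetical word vectors
W_x = rng.normal(size=(hidden_dim, embed_dim))
W_h = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

states = rnn_read(sentence, W_x, W_h, b)
print(states.shape)  # (5, 3): one hidden state per word
```

The last row of `states` summarizes the whole sentence, because each state is computed from the one before it.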

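An RNN can be used in the same way to generate an output sequence. At each step, the hidden state is used to generate a new word, which is appended to the final output and fed back in as the next input.

Figure 3: A recurrent neural network can generate an output sequence, reusing each output word as the input to the next function.

An input (reading) RNN and an output (generating) RNN can be combined into one joint model, in which the final hidden state of the input RNN becomes the initial hidden state of the output RNN. Combined this way, the joint model can read any text and generate a different text from it. This framework is called an encoder-decoder RNN (also known as Seq2Seq) and is the basis of our summarization model. In addition, we replace the conventional encoder RNN with a bidirectional encoder, which reads the input sequence with two separate RNNs: one from left to right (as shown in Figure 4) and one from right to left. This helps the model build a better representation of the input in context.

Figure 4: An encoder-decoder RNN model can be used for sequence-to-sequence tasks in natural language, such as summarization.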
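
A toy version of this pipeline might look like the sketch below. It assumes untrained random weights, a plain tanh RNN cell, greedy word selection, and hypothetical token ids (`start_id`, `end_id`); none of these details come from the post, and the output is meaningless until the weights are trained.

```python
import numpy as np

def run_rnn(inputs, W_x, W_h):
    hidden = np.zeros(W_h.shape[0])
    states = []
    for x in inputs:
        hidden = np.tanh(W_x @ x + W_h @ hidden)
        states.append(hidden)
    return np.stack(states)

def encode_bidirectional(inputs, fwd, bwd):
    left_to_right = run_rnn(inputs, *fwd)
    # Read the reversed sequence, then flip back so rows align with the input.
    right_to_left = run_rnn(inputs[::-1], *bwd)[::-1]
    return np.concatenate([left_to_right, right_to_left], axis=1)

def decode_greedy(init_state, embed, W_x, W_h, W_out, start_id, end_id, max_len):
    hidden, token, output = init_state, start_id, []
    for _ in range(max_len):
        hidden = np.tanh(W_x @ embed[token] + W_h @ hidden)
        token = int(np.argmax(W_out @ hidden))  # pick the highest-scoring word
        if token == end_id:
            break
        output.append(token)  # the chosen word becomes the next input
    return output

rng = np.random.default_rng(0)
vocab, embed_dim, enc_dim = 6, 4, 3
dec_dim = 2 * enc_dim
embed = rng.normal(size=(vocab, embed_dim))
fwd = (rng.normal(size=(enc_dim, embed_dim)), rng.normal(size=(enc_dim, enc_dim)))
bwd = (rng.normal(size=(enc_dim, embed_dim)), rng.normal(size=(enc_dim, enc_dim)))
W_x_dec = rng.normal(size=(dec_dim, embed_dim))
W_h_dec = rng.normal(size=(dec_dim, dec_dim))
W_out = rng.normal(size=(vocab, dec_dim))

source_ids = [2, 4, 1, 3]  # hypothetical word ids of the input text
encoded = encode_bidirectional(embed[source_ids], fwd, bwd)
# Final state of each direction: last row (forward) and first row (backward).
init_state = np.concatenate([encoded[-1, :enc_dim], encoded[0, enc_dim:]])
summary_ids = decode_greedy(init_state, embed, W_x_dec, W_h_dec, W_out,
                            start_id=0, end_id=5, max_len=8)
print(encoded.shape, summary_ids)
```

Concatenating the two directions gives every input position a representation that reflects both the words before it and the words after it.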

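New attention and decoding mechanisms

To make the model's output more coherent, we use a technique called temporal attention, which lets the decoder look back at the input document whenever it generates a new word. Rather than relying only on its own hidden state, the decoder uses an attention function to pull in contextual information from different parts of the input text. The attention function is then modulated so that the model draws on different parts of the input as it generates the output, which improves the information coverage of the summary.

To keep the model from repeating itself, we also let it look back at the decoder's earlier hidden states. Here we define an intra-decoder attention function that looks back at the previous hidden states of the decoder RNN. Finally, the decoder combines the context vector from temporal attention with the context vector from intra-decoder attention to generate the next word of the output. Figure 5 shows how the two attention functions are combined at a given decoding step.

Figure 5: Two context vectors (marked "C") computed from the encoder hidden states and the decoder hidden states. Combining these two context vectors with the current decoder hidden state (marked "H") produces a new word (right), which is added to the output sequence.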
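
The two attention functions and their combination can be sketched as follows. The post does not give the scoring function or the exact modulation, so this example assumes a plain dot-product score and assumes that "modulating" attention means dividing each input position's score by the attention it received at earlier steps. The decoder states are random stand-ins, and all names are hypothetical.

```python
import numpy as np

def softmax(scores):
    shifted = np.exp(scores - np.max(scores))
    return shifted / shifted.sum()

def temporal_attention(dec_state, enc_states, past_exp_scores):
    """Attend over input positions, discounting positions attended to before."""
    exp_scores = np.exp(enc_states @ dec_state)
    if past_exp_scores:
        # Divide by the score mass each position collected at earlier steps.
        adjusted = exp_scores / np.sum(past_exp_scores, axis=0)
    else:
        adjusted = exp_scores
    past_exp_scores.append(exp_scores)
    weights = adjusted / adjusted.sum()
    return weights @ enc_states, weights

def intra_decoder_attention(dec_state, prev_dec_states):
    """Attend over the decoder's own earlier hidden states."""
    if not prev_dec_states:
        return np.zeros_like(dec_state)  # nothing has been generated yet
    previous = np.stack(prev_dec_states)
    return softmax(previous @ dec_state) @ previous

rng = np.random.default_rng(0)
dim, n_inputs, vocab = 4, 5, 7
enc_states = rng.normal(size=(n_inputs, dim))
W_out = rng.normal(size=(vocab, 3 * dim))

past_scores, prev_dec = [], []
for step in range(3):
    dec_state = rng.normal(size=dim)  # stand-in for the decoder state 'H'
    c_enc, enc_weights = temporal_attention(dec_state, enc_states, past_scores)
    c_dec = intra_decoder_attention(dec_state, prev_dec)
    # Both context vectors 'C' plus 'H' feed the next-word distribution.
    word_probs = softmax(W_out @ np.concatenate([dec_state, c_enc, c_dec]))
    prev_dec.append(dec_state)

print(enc_weights.round(3), int(np.argmax(word_probs)))
```

Under this assumption, an input position that has already drawn a lot of attention gets a smaller weight at later steps, which nudges the decoder toward parts of the input it has not yet covered.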

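How is the model trained? Supervised learning and reinforcement learning

To train this model on real data such as news articles, the most common approach is the teacher forcing algorithm: the model generates a new summary with the help of a reference summary and receives a word-by-word error signal each time it produces a word (also called "local supervision"; see Figure 6).

Figure 6: Model training with supervised learning. Each generated word receives a training supervision signal, computed by comparing it with the word at the same position in the reference summary.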
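
As a rough illustration of this word-by-word signal, the sketch below computes an average cross-entropy loss against the reference words. It assumes the usual negative log-likelihood form, which the post does not spell out. The `step_logits` array is a hypothetical stand-in for the scores a decoder would produce at each position when fed the reference prefix.

```python
import numpy as np

def teacher_forcing_loss(step_logits, reference_ids):
    """Average negative log-likelihood of the reference word at each position."""
    losses = []
    for logits, target in zip(step_logits, reference_ids):
        shifted = logits - np.max(logits)
        log_probs = shifted - np.log(np.exp(shifted).sum())  # log-softmax
        losses.append(-log_probs[target])
    return float(np.mean(losses))

reference_ids = [0, 1, 2]   # hypothetical word ids of the reference summary
uniform = np.zeros((3, 4))  # a model with no preference among 4 words
confident = np.full((3, 4), -5.0)
for position, word in enumerate(reference_ids):
    confident[position, word] = 5.0  # a model that favours the reference word

print(teacher_forcing_loss(uniform, reference_ids))
print(teacher_forcing_loss(confident, reference_ids))
```

The loss drops only when the model puts probability on the exact reference word at the exact position, which is why a correct but differently worded summary is penalized.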

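This approach can train any RNN-based sequence generation model and gives quite good results. For this particular task, however, a summary does not have to match the reference word for word to be correct. Two editors given the same news article may well write very different summaries, with different styles, word choices, and even sentence order, and both can be good summaries. The problem with teacher forcing is that once the first few words have been generated, training is misled: the model must stick to the one official summary and cannot adapt to an opening that is equally correct but phrased differently.

With this in mind, we need something better than teacher forcing alone. We turn to a completely different kind of training, reinforcement learning (RL). The RL algorithm first has the model generate a summary on its own, then uses an external scorer to compare the generated summary with the reference text. The score tells the model how good its summary was. If the score is high, the model updates itself so that such summaries become more likely in the future. If the score is low, the model adjusts its generation process to avoid producing similar summaries. This RL training is very effective at raising a score that evaluates the whole sequence, rather than judging the summary word by word.

Figure 7: In reinforcement learning training, the model receives no local supervision for each word; guidance comes from comparing the whole output with the reference.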
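
One common way to turn a sequence-level score into a training signal is a policy-gradient loss. The sketch below is an assumption-laden illustration: the post does not say which RL algorithm is used, and the baseline reward (a reference point that decides whether a score counts as "high" or "low") is introduced here only for the example.

```python
def policy_gradient_loss(sample_log_probs, sample_reward, baseline_reward):
    """Loss whose minimization raises the probability of above-baseline summaries."""
    advantage = sample_reward - baseline_reward
    return -advantage * sum(sample_log_probs)

# Hypothetical log-probabilities of the words in a summary the model sampled.
log_probs = [-0.5, -1.0, -0.7]

good = policy_gradient_loss(log_probs, sample_reward=0.42, baseline_reward=0.30)
bad = policy_gradient_loss(log_probs, sample_reward=0.20, baseline_reward=0.30)
print(good, bad)
```

With a positive advantage the loss falls as the summary's log-probability rises, so the model learns to produce it more often; with a negative advantage the sign flips and the summary is made less likely.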

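How do we evaluate summary quality?

So what exactly is the scorer mentioned above, and how does it judge the quality of a summary? Asking people to evaluate millions of summaries by hand is impractical, so we rely on a metric called ROUGE (Recall-Oriented Understudy for Gisting Evaluation). ROUGE evaluates a generated summary by comparing its sub-phrases with those of the reference, without requiring the two to match exactly. The ROUGE variants (including ROUGE-1, ROUGE-2, and ROUGE-L) all work on the same principle but differ in the length of the subsequences they compare.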
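
A simplified version of these scores fits in a few lines. This sketch assumes whitespace tokenization and reports precision, recall, and F1; the official ROUGE toolkit adds options such as stemming that are left out here, so the numbers will not match published results exactly.

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def f1_score(matches, candidate_total, reference_total):
    if matches == 0:
        return 0.0, 0.0, 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return precision, recall, 2 * precision * recall / (precision + recall)

def rouge_n(candidate, reference, n=1):
    """Overlap of n-grams (ROUGE-1 for n=1, ROUGE-2 for n=2)."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return f1_score(overlap, sum(cand.values()), sum(ref.values()))

def lcs_length(a, b):
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def rouge_l(candidate, reference):
    """Overlap measured by the longest common subsequence."""
    return f1_score(lcs_length(candidate, reference), len(candidate), len(reference))

reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()
print(round(rouge_n(candidate, reference, 1)[2], 3))  # 0.833
print(round(rouge_n(candidate, reference, 2)[2], 3))  # 0.6
print(round(rouge_l(candidate, reference)[2], 3))     # 0.833
```

The candidate differs from the reference by one word, yet it still scores well, which is the tolerance for inexact matches described above.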

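Although ROUGE scores correlate largely with human judgment, the summary with the highest ROUGE score is not necessarily the most readable or fluent. Training with reinforcement learning alone makes maximizing ROUGE a hard objective, and that creates a new problem. When we examined the summaries with the highest ROUGE scores, we found that some of them were almost unreadable.

To get the best of both, our model is trained with teacher forcing and reinforcement learning at the same time, using word-level supervision and sequence-level guidance together to improve the coherence and readability of the summaries. Specifically, we find that ROUGE-optimized RL markedly improves recall (making sure the important information is included), while word-level supervised learning improves language fluency, making the output more coherent and readable.

Figure 8: Supervised learning (red arrows) combined with reinforcement learning (purple arrows), showing how our model uses both local and global feedback to optimize readability and the overall ROUGE score.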
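
One simple way to combine the two signals is a weighted sum of the two losses. The post only says that both are used together, so the linear form and the weight `gamma` below are assumptions for illustration, and the loss values are made up.

```python
def mixed_loss(rl_loss, ml_loss, gamma):
    """Blend a sequence-level RL loss with a word-level supervised loss."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between 0 and 1")
    return gamma * rl_loss + (1.0 - gamma) * ml_loss

# Hypothetical loss values for one training example.
print(mixed_loss(rl_loss=0.264, ml_loss=1.386, gamma=0.9))
```

Setting `gamma` to 1 recovers RL-only training and setting it to 0 recovers pure teacher forcing, so the weight controls the trade-off between ROUGE score and fluency.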

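Until recently, the highest ROUGE-1 score for abstractive summarization on the CNN/Daily Mail dataset was 35.46. With combined supervised and RL training, our intra-decoder attention RNN model raises that score to 39.87, and with RL-only training to 41.16. Figure 9 lists the summarization scores of other existing models alongside ours. Although our RL-only model has the higher ROUGE score, the supervised + RL model still produces more readable summaries, because their content is more relevant. Note that See et al. use a different data format, so their results are not directly comparable with ours or with the other models and are included for reference only.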

Model                                          ROUGE-1   ROUGE-L
Nallapati et al. 2016 (abstractive)            35.46     32.65
Nallapati et al. 2017 (extractive baseline)    39.2      35.5
Nallapati et al. 2017 (extractive)             39.6      35.3
See et al. 2017 (abstractive)                  39.53*    36.38*
Our model (RL only)                            41.16     39.08
Our model (supervised + RL)                    39.87     36.90

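Figure 9: Summarization results on the CNN/Daily Mail dataset for our model and several existing extractive and abstractive approaches.

Sample outputs

What does such an improvement look like in actual summaries? Here we generated several multi-sentence summaries on a split of the dataset. The outputs below come from our model and its simpler baseline configurations after training on the CNN/Daily Mail dataset. As you can see, the summaries have improved markedly, but there is still a long way to go before they are perfect.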

Article

Google Wallet says it has changed its policy when storing users’ funds as they will now be federally-insured (file photo) For those who use Google Wallet, their money just became safer with federal-level insurance. Google confirmed to Yahoo Finance in a statement that its current policy changed – meaning the company will store the balances for users of the mobile transfer service (similar to PayPal and Venmo) in multiple federally-insured banking institutions. This is good news for people who place large amounts of money in their Wallet Balance because the Federal Deposit Insurance Corporation insures funds for banking institutions up to $250,000. Currently, Google’s user agreement says funds are not protected by the FDIC. However, a Google spokesperson told Yahoo Finance that the current policy has changed. (…)

Summary (reference)

Google spokesperson confirmed current policy changed meaning funds will be protected by the federal deposit insurance corporation. As a non-banking institution, Google Wallet, along with competitors PayPal and Venmo, is not legally required to be federally insured. With the new change to its policy, funds in wallet balance are protected if anything were to happen to the company like bankruptcy.

Summary (our model)

Google confirmed to Yahoo Finance in a statement that its current policy changed. The company will store the balances for users of the mobile transfer service (similar to PayPal and Venmo) in multiple federally-insured banking institutions. Google’s user agreement says funds are not protected by the federal deposit insurance corporation.

Article

Talk about a chain reaction! This is the moment a billiards player performs a complex trick shot by setting up a domino train to pot four balls. Video footage shows a white ball being rolled down a positioned cue. It then bounces off one side of the red-clothed table and hits the first in a long line of dominoes. One by one the small counters fall down, tapping balls into various pockets as they go. First a yellow, then a blue, then a red. Finally, the last domino gently hits an orange ball, causing it to roll down another positioned cue lying on the table. The orb then knocks a green ball into the center pocket. In less than 30 seconds the stunt comes to a close. (…)

Summary (reference)

The clip was uploaded by youtube user honda4ridered. In another upload the skilled billiards player shows viewers how to pocket four balls in a single shot-and for those who miss it there’s a slow motion version.

Summary (our model)

Video footage shows a white ball being rolled down a jumper. It then bounces off one side of the red-clothed table and hits the first in a long line of dominoes. One by one the small counters fall down, tapping balls into pockets as they go-first a yellow. It comes to a close. The clip was uploaded by youtube user honda4ridered.

Article

Kelly Osbourne didn’t always want to grow up to be like her famous mom – but in a letter published in the new book A Letter to My Mom, the TV personality admitted that she is now proud to be Sharon Osbourne’s daughter. For author Lisa Erspamer’s third collection of tributes, celebrities such as Melissa Rivers, Shania Twain, will.i.am, Christy Turlington Burns, and Kristin Chenoweth all composed messages of love and gratitude to the women who raised them. And the heartwarming epistolary book, which was published last week, has arrived just in time for Mother’s Day on May 10. ‘Like all teenage girls I had this ridiculous fear of growing up and becoming just like you,’ Kelly Osbourne wrote in her letter, republished on Yahoo Parenting. ‘I was so ignorant and adamant about creating my “own” identity.’ Scroll down for video Mini-me: In Lisa Erspamer’s new book A Letter to My Mom, Kelly Osbourne (R) wrote a letter to her mother Sharon (L) saying that she’s happy to have grown up to be just like her (…)

Summary (reference)

Author Lisa Erspamer invited celebrities and a number of other people to write heartfelt notes to their mothers for her new book a letter to my mom. Stars such as Melissa Rivers, will.i.am, and Christy Turlington participated in the moving project.

Summary (our model)

Kelly didn’t always want to grow up to be like her famous mom. Lisa Erspamer’s third collection of tributes, celebrities such as Melissa rivers, Shania Twain, will.i.am, Christy Turlington, and Kristin Chenoweth all composed messages of love and gratitude to the women who raised them. Kelly wrote a letter to her mom before Joan’s death last year. She has arrived just in time for Mother’s Day on May 10.

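Figure 10: More example summaries generated by our model, compared with human-written summaries of the same articles.

To illustrate how much our approach improves summarization, Figure 11 shows what the model generates once intra-attention and RL training are removed.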

Article

Tony Blair has said he does not want to retire until he is 91 – as he unveiled plans to set up a ‘cadre’ of ex-leaders to advise governments around the world. The defiant 61-year-old former Prime Minister said he had ‘decades’ still in him and joked that he would ‘turn to drink’ if he ever stepped down from his multitude of global roles. He told Newsweek magazine that his latest ambition was to recruit former heads of government to go round the world to advise presidents and prime ministers on how to run their countries. In an interview with the magazine Newsweek Mr Blair said he did not want to retire until he was 91 years old Mr Blair said his latest ambition is to recruit former heads of government to advise presidents and prime ministers on how to run their countries Mr Blair said he himself had been ‘mentored’ by US president Bill Clinton when he took office in 1997. And he said he wanted to build up his organisations, such as his Faith Foundation, so they are ‘capable of changing global policy’. Last night, Tory MPs expressed horror at the prospect of Mr Blair remaining in public life for another 30 years. Andrew Bridgen said: ‘We all know weak Ed Miliband’s called on Tony to give his flailing campaign a boost, but the attention’s clearly gone to his head.’ (…)

Summary (human-written reference)

The former Prime Minister claimed he has ‘decades’ of work left in him. Joked he would ‘turn to drink’ if he ever stepped down from global roles. Wants to recruit former government heads to advise current leaders. He was ‘mentored’ by US president Bill Clinton when he started in 1997.

Summary (our model)

Blair said he did not want to retire until he was 91 years old. 61-year-old former prime minister said he would ‘turn to drink’ if he ever stepped down from his own. He said he wanted to build up his charity to advise presidents and prime ministers on how to run their countries. Mr Blair says he is to recruit former heads of government to go round the world to advise ministers. He says he wants to emulate ex-Israeli president Shimon Peres.

Summary (without intra-attention and RL training)

61-year-old former prime minister said he did not want to retire until he was 91 years old. He said he wanted to build up his organisations, such as his Faith Foundation. He said he wanted to emulate ex-Israeli president Shimon Peres. Mr Blair said he wanted to emulate ex-Israeli President Shimon Peres. He said he wanted to be seeing someone when he took office in 1997. Mr Blair said he wanted to be seeing someone when he took office in 1997. Mr Blair said he wanted to

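Figure 11: Example summary generated by our model, compared with the output after the improvements are removed. New words that do not appear in the original document are marked in green. Repeated phrases in the summaries are marked in red.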
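
The two properties highlighted in Figure 11 can be checked mechanically. The sketch below is only an illustration and not the procedure used to color the figure: it assumes lowercased, punctuation-free, whitespace-split text, and counting repeated trigrams is just one possible definition of a "repeated phrase". The sample strings are shortened excerpts from the example above.

```python
from collections import Counter

def repeated_ngrams(tokens, n=3):
    """Return every n-gram that occurs more than once, with its count."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {gram: count for gram, count in counts.items() if count > 1}

def novel_words(summary_tokens, article_tokens):
    """Return summary words that never appear in the article."""
    seen = set(article_tokens)
    return [word for word in summary_tokens if word not in seen]

ablated = ("he said he wanted to emulate ex-israeli president shimon peres "
           "mr blair said he wanted to emulate ex-israeli president shimon peres").split()
full_model = "blair said he did not want to retire until he was 91 years old".split()

print(len(repeated_ngrams(ablated)))     # 7 repeated trigrams
print(len(repeated_ngrams(full_model)))  # 0

article = "he wanted to build up his organisations such as his faith foundation".split()
summary = "he wanted to build up his charity".split()
print(novel_words(summary, article))     # ['charity']
```

Counts like these give a quick, if crude, way to compare how often different model variants repeat themselves or introduce words of their own.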

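Conclusion

Our model significantly improves the state of the art in multi-sentence summary generation, outperforming existing abstractive models and extractive baselines. We believe our contributions, the intra-decoder attention module and the combined training objective, can also improve other sequence generation tasks, especially those with long outputs.

Our work also touches on the limitations of automatic evaluation metrics such as ROUGE; our results suggest that a better metric would evaluate and optimize summarization models more effectively. An ideal metric would agree closely with human judgment, including on the coherence and readability of a summary. Improving summarization models against such a metric should raise the quality of the results further.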

Citation

If you would like to cite this blog post in a publication, please use:

Romain Paulus, Caiming Xiong, and Richard Socher. 2017.

A Deep Reinforced Model for Abstractive Summarization

Acknowledgements

Special thanks to Melvin Gruesbeck for the images and statistics in this post.

Original link:

https://metamind.io/research/your-tldr-by-an-ai-a-deep-reinforced-model-for-abstractive-summarization

Source: 51CTO




