

数盟 · WeChat official account · Big Data · 2017-06-23 22:02


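Over the past few decades, the way we deal with information has changed fundamentally. Today, access to information is no longer the bottleneck; the real difficulty is digesting the sheer volume of it. Most of us know the feeling: we have to read more and more to keep up with work, the news, and social media. To address this challenge, we have been studying how AI can help people work better amid the flood of information, and one potential approach is to use algorithms that automatically summarize long texts.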








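A recurrent neural network (RNN) is a type of deep learning model that can process sequences of variable length (such as text) and compute useful representations (hidden states) for them step by step. The network processes the elements of a sequence one at a time (here, one word at a time); for each new input, it computes a new hidden state as a function of that input and the previous hidden state. As a result, the hidden state computed at each word is a function of all the words read up to that point.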
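The recurrence described above can be sketched in a few lines of plain Python. This is only an illustration, not the article's model: the tanh update rule, the function names `rnn_step` and `encode`, and the hand-picked weights are all assumptions made for the example.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h_new = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def encode(sequence, W_x, W_h, b):
    """Run the RNN over a sequence of any length and return every hidden state."""
    h = [0.0] * len(b)  # initial hidden state
    states = []
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
        states.append(h)
    return states

# Toy, hand-picked weights (hypothetical): 2-dim inputs, 2-dim hidden state.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [-0.4, 0.6]]
b = [0.0, 0.1]

seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seq_b = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]  # differs only in the first "word"
states_a = encode(seq_a, W_x, W_h, b)
states_b = encode(seq_b, W_x, W_h, b)
```

The two sequences differ only in their first element, yet their final hidden states differ as well, which is the sense in which each hidden state depends on everything read so far.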







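To make the model's output more coherent, we use a technique called temporal attention, which lets the decoder look back at parts of the input document whenever it generates a new word. Instead of relying entirely on its own hidden state, the decoder can use an attention function to bring in contextual information from different parts of the input text. This attention is then modulated so that the model draws on different parts of the input as it generates the output text, which improves the information coverage of the summary.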
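One way such a modulated attention could look is sketched below. The details are assumptions made for illustration and are not taken from this article: scores are plain dot products, and each input position's exponentiated score is divided by the total it received at earlier decoding steps. The function name `temporal_attention` and the toy vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def temporal_attention(decoder_state, encoder_states, past_exp_scores):
    """Return (weights, context, updated past_exp_scores) for one decoding step.

    past_exp_scores[i] is the sum of exp(score) that input position i received
    at earlier decoding steps (0.0 means it has not been attended to yet).
    """
    exp_scores = [math.exp(dot(decoder_state, h)) for h in encoder_states]
    # Penalize positions that already received a lot of attention.
    adjusted = [e / p if p > 0.0 else e
                for e, p in zip(exp_scores, past_exp_scores)]
    total = sum(adjusted)
    weights = [a / total for a in adjusted]
    dim = len(encoder_states[0])
    # Context vector: attention-weighted average of the encoder states.
    context = [sum(w * h[k] for w, h in zip(weights, encoder_states))
               for k in range(dim)]
    new_past = [p + e for p, e in zip(past_exp_scores, exp_scores)]
    return weights, context, new_past

# Toy encoder states for a 3-word input (hypothetical values).
enc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
dec = [2.0, 0.0]  # the same decoder state is reused to isolate the penalty

w1, c1, past = temporal_attention(dec, enc, [0.0] * len(enc))
w2, c2, past = temporal_attention(dec, enc, past)
```

At the first step the decoder concentrates on the first input position; at the second step, with an identical decoder state, the penalty spreads the weights evenly, so the decoder is pushed toward parts of the input it has not used yet.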




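To train this model on real-world data such as news articles, the most common approach is the teacher forcing algorithm: the model generates a summary while being guided by a reference summary, and it receives a word-by-word error signal each time it generates a new word (also called "local supervision", as shown in Figure 6).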
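The word-by-word supervision can be illustrated with a deliberately tiny stand-in for the model. Everything here is hypothetical: `BIGRAM_PROBS` is a made-up lookup table playing the role of the decoder, and the loss is an ordinary per-word negative log-likelihood, which is a common choice but is an assumption about the setup rather than something stated in the text.

```python
import math

# Hypothetical toy "model": a table giving P(next word | previous word).
BIGRAM_PROBS = {
    "<s>": {"google": 0.7, "funds": 0.3},
    "google": {"confirmed": 0.6, "funds": 0.4},
    "confirmed": {"change": 0.5, "</s>": 0.5},
    "change": {"</s>": 1.0},
    "funds": {"</s>": 1.0},
}

def teacher_forcing_loss(reference, probs=BIGRAM_PROBS, eps=1e-12):
    """Return (total loss, per-word losses) under teacher forcing.

    At every step the model is conditioned on the reference's previous word,
    never on its own earlier prediction, and it is penalized word by word.
    """
    prev = "<s>"
    per_word = []
    for target in reference + ["</s>"]:
        p = probs.get(prev, {}).get(target, 0.0)
        per_word.append(-math.log(max(p, eps)))  # eps avoids log(0)
        prev = target  # teacher forcing: feed the ground-truth word
    return sum(per_word), per_word

loss, per_word = teacher_forcing_loss(["google", "confirmed", "change"])
```

Each entry of `per_word` is the error for one generated word, and `loss` is their sum; a real system would backpropagate this quantity through a neural decoder instead of reading probabilities from a table.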










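Until recently, the highest ROUGE-1 score for abstractive summarization on the CNN/Daily Mail dataset was 35.46. With a training scheme that combines supervised learning and reinforcement learning, our RNN model with intra-decoder attention raises this score to 39.87, and training with reinforcement learning alone raises it to 41.16. Figure 9 lists the summarization scores of other existing models alongside ours. Although our reinforcement-learning-only model achieves higher ROUGE scores, the supervised + reinforcement learning model produces more readable summaries, because their content is more relevant. Note that See et al. used a different data format, so their results cannot be compared directly with ours or with those of the other models; they are included here for reference only.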

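Model                                              ROUGE-1   ROUGE-L
Nallapati et al. 2016 (abstractive)                35.46     32.65
Nallapati et al. 2017 (extractive baseline)        39.2      35.5
Nallapati et al. 2017 (extractive)                 39.6      35.3
See et al. 2017 (abstractive)                      39.53*    36.38*
Our model (reinforcement learning only)            41.16     39.08
Our model (supervised + reinforcement learning)    39.87     36.90

* See et al. used a different data format, so these scores are not directly comparable.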

Figure 9: Summarization results on the CNN/Daily Mail dataset for our model and several existing extractive and abstractive approaches.
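For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified version that assumes lowercased whitespace tokenization; the function name `rouge_1` is hypothetical, and published scores are normally computed with the official ROUGE toolkit, whose preprocessing differs.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Return (precision, recall, F1) of unigram overlap between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection: each word counts at most as often as in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f = rouge_1(
    "the clip was uploaded by a user",
    "the clip was uploaded by youtube user honda4ridered",
)
```

Here six of the seven candidate words also appear in the eight-word reference, so precision is 6/7, recall is 6/8, and F1 is 0.8. On the 0 to 100 scale used in the table, that would read as 80.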


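So what does such a large improvement look like in actual summaries? Here are a few multi-sentence summaries generated on a split of the dataset. Our model and its simpler baseline, trained on the CNN/Daily Mail dataset, produce the results below. As you can see, although the summaries have improved significantly, there is still a long way to go before they are perfect.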

Article: Google Wallet says it has changed its policy when storing users’ funds as they will now be federally-insured (file photo) For those who use Google Wallet, their money just became safer with federal-level insurance. Google confirmed to Yahoo Finance in a statement that its current policy changed – meaning the company will store the balances for users of the mobile transfer service (similar to PayPal and Venmo) in multiple federally-insured banking institutions. This is good news for people who place large amounts of money in their Wallet Balance because the Federal Deposit Insurance Corporation insures funds for banking institutions up to $250,000. Currently, Google’s user agreement says funds are not protected by the FDIC. However, a Google spokesperson told Yahoo Finance that the current policy has changed. (…)
Summary (reference): Google spokesperson confirmed current policy changed meaning funds will be protected by the federal deposit insurance corporation. As a non-banking institution, Google Wallet, along with competitors PayPal and Venmo, is not legally required to be federally insured. With the new change to its policy, funds in wallet balance are protected if anything were to happen to the company like bankruptcy.
Summary (our model): Google confirmed to Yahoo Finance in a statement that its current policy changed. The company will store the balances for users of the mobile transfer service (similar to PayPal and Venmo) in multiple federally-insured banking institutions. Google’s user agreement says funds are not protected by the federal deposit insurance corporation.

Article: Talk about a chain reaction! This is the moment a billiards player performs a complex trick shot by setting up a domino train to pot four balls. Video footage shows a white ball being rolled down a positioned cue. It then bounces off one side of the red-clothed table and hits the first in a long line of dominoes. One by one the small counters fall down, tapping balls into various pockets as they go. First a yellow, then a blue, then a red. Finally, the last domino gently hits an orange ball, causing it to roll down another positioned cue lying on the table. The orb then knocks a green ball into the center pocket. In less than 30 seconds the stunt comes to a close. (…)
Summary (reference): The clip was uploaded by youtube user honda4ridered. In another upload the skilled billiards player shows viewers how to pocket four balls in a single shot-and for those who miss it there’s a slow motion version.
Summary (our model): Video footage shows a white ball being rolled down a jumper. It then bounces off one side of the red-clothed table and hits the first in a long line of dominoes. One by one the small counters fall down, tapping balls into pockets as they go-first a yellow. It comes to a close. The clip was uploaded by youtube user honda4ridered.

Article: Kelly Osbourne didn’t always want to grow up to be like her famous mom – but in a letter published in the new book A Letter to My Mom, the TV personality admitted that she is now proud to be Sharon Osbourne’s daughter. For author Lisa Erspamer’s third collection of tributes, celebrities such as Melissa Rivers, Shania Twain, will.i.am, Christy Turlington Burns, and Kristin Chenoweth all composed messages of love and gratitude to the women who raised them. And the heartwarming epistolary book, which was published last week, has arrived just in time for Mother’s Day on May 10. ‘Like all teenage girls I had this ridiculous fear of growing up and becoming just like you,’ Kelly Osbourne wrote in her letter, republished on Yahoo Parenting. ‘I was so ignorant and adamant about creating my “own” identity.’ Scroll down for video Mini-me: In Lisa Erspamer’s new book A Letter to My Mom, Kelly Osbourne (R) wrote a letter to her mother Sharon (L) saying that she’s happy to have grown up to be just like her (…)
Summary (reference): Author Lisa Erspamer invited celebrities and a number of other people to write heartfelt notes to their mothers for her new book a letter to my mom. Stars such as Melissa Rivers, will.i.am, and Christy Turlington participated in the moving project.
Summary (our model): Kelly didn’t always want to grow up to be like her famous mom. Lisa Erspamer’s third collection of tributes, celebrities such as Melissa rivers, Shania Twain, will.i.am, Christy Turlington, and Kristin Chenoweth all composed messages of love and gratitude to the women who raised them. Kelly wrote a letter to her mom before Joan’s death last year. She has arrived just in time for Mother’s Day on May 10.




Article: Tony Blair has said he does not want to retire until he is 91 – as he unveiled plans to set up a ‘cadre’ of ex-leaders to advise governments around the world. The defiant 61-year-old former Prime Minister said he had ‘decades’ still in him and joked that he would ‘turn to drink’ if he ever stepped down from his multitude of global roles. He told Newsweek magazine that his latest ambition was to recruit former heads of government to go round the world to advise presidents and prime ministers on how to run their countries. In an interview with the magazine Newsweek Mr Blair said he did not want to retire until he was 91 years old Mr Blair said his latest ambition is to recruit former heads of government to advise presidents and prime ministers on how to run their countries Mr Blair said he himself had been ‘mentored’ by US president Bill Clinton when he took office in 1997. And he said he wanted to build up his organisations, such as his Faith Foundation, so they are ‘capable of changing global policy’. Last night, Tory MPs expressed horror at the prospect of Mr Blair remaining in public life for another 30 years. Andrew Bridgen said: ‘We all know weak Ed Miliband’s called on Tony to give his flailing campaign a boost, but the attention’s clearly gone to his head.’ (…)
