Column: Amazon Web Services
The official AWS account, bringing you the latest news on AWS services in China and worldwide. The AWS China (Beijing) Region is operated by Sinnet.

NeurIPS 2024 | Selected Papers: A Journey of AI Exploration

Amazon Web Services · Official account · 2024-12-28 11:00

Main text


The 2024 Conference on Neural Information Processing Systems (NeurIPS), a premier conference in artificial intelligence, recently got under way. The Amazon papers accepted at the conference demonstrate the breadth of the company's AI research.


In recent years, large language models (LLMs) and other foundation models have dominated the field, and Amazon's papers reflect that trend, covering topics such as retrieval-augmented generation, code generation with LLMs, commonsense reasoning, and multimodal models. Training methods are another area of strong interest, with papers on memory-efficient training, reinforcement learning from human feedback, learning to reject, and the convergence rates of transformer models.


Amazon's papers also show continued attention to topics such as the multi-armed bandit problem (long a staple of Amazon's NeurIPS submissions) and speech processing, along with newer areas such as applying machine learning to scientific computing and automated reasoning. One paper, titled "B’MOJO: Hybrid state space realizations of foundation models with eidetic and fading memory", proposes a new machine-learning paradigm based on the concept of transduction.




Automated reasoning



Paper title:

Neural model checking

Authors:

Mirco Giacobbe, Daniel Kroening, Abhinandan Pal, Michael Tautschnig

Paper link:

https://www.amazon.science/publications/neural-model-checking




Multi-armed bandits




Paper title:

Adaptive experimentation when you can’t experiment

Authors:

Yao Zhao, Kwang-Sung Jun, Tanner Fiez, Lalit Jain

Paper link:

https://www.amazon.science/publications/adaptive-experimentation-when-you-cant-experiment




Paper title:

Online posterior sampling with a diffusion prior

Authors:

Branislav Kveton, Boris Oreshkin, Youngsuk Park, Aniket Deshmukh, Rui Song

Paper link:

https://www.amazon.science/publications/online-posterior-sampling-with-a-diffusion-prior




Code generation




Paper title:

Training LLMs to better self-debug and explain code

Authors:

Nan Jiang, Xiaopeng Li, Shiqi Wang, Qiang Zhou, Baishakhi Ray, Varun Kumar, Xiaofei Ma, Anoop Deoras

Paper link:

https://www.amazon.science/publications/training-llms-to-better-self-debug-and-explain-code


(Figure omitted) The paper includes a figure illustrating its proposed data-collection and model-training framework.




Commonsense reasoning




Paper title:

Can language models learn to skip steps?

Authors:

Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Jiayang Cheng, Yue Zhang, Xipeng Qiu, Zheng Zhang

Paper link:

https://www.amazon.science/publications/can-language-models-learn-to-skip-steps




Computational fluid dynamics




Paper title:

WindsorML: High-fidelity computational fluid dynamics dataset for automotive aerodynamics

Authors:

Neil Ashton, Jordan B. Angel, Aditya S. Ghate, Gaetan K. W. Kenway, Man Long Wong, Cetin Kiris, Astrid Walle, Danielle Maddix Robinson, Gary Page

Paper link:

https://www.amazon.science/publications/windsorml-high-fidelity-computational-fluid-dynamics-dataset-for-automotive-aerodynamics




LLM evaluation




Paper title:

SetLexSem Challenge: Using set operations to evaluate the lexical and semantic robustness of language models

Authors:

Bardiya Akhbari, Manish Gawali, Nicholas Dronen

Paper link:

https://www.amazon.science/publications/setlexsem-challenge-using-set-operations-to-evaluate-the-lexical-and-semantic-robustness-of-language-models



To evaluate the robustness of LLMs to semantic variation in set membership, Amazon researchers and their colleagues created "deceptive" sets by sampling hypernym pairs (e.g., "mammals" and "vehicles") and drawing hyponyms from them under three conditions:


  • identical to the sampled hyponyms;

  • half of the members swapped between sets;

  • members sampled at random.


LLMs exhibit a distinctive failure mode under the second (swapped) condition, while under the first (unswapped) condition the mean and variance of their accuracy beat the random baseline. The corresponding figure can be found in the paper.




Memory management




论文标题:

Online weighted paging with unknown weights

论文作者:

Orin Levy、Aviv Rosenberg、Noam Touitou

论文地址:

https://www.amazon.science/publications/online-weighted-paging-with-unknown-weights




Model architecture




Paper title:

B’MOJO: Hybrid state space realizations of foundation models with eidetic and fading memory

Authors:

Luca Zancato, Arjun Seshadri, Yonatan Dukler, Aditya Golatkar, Yantao Shen, Ben Bowman, Matthew Trager, Alessandro Achille, Stefano Soatto

Paper link:

https://www.amazon.science/publications/bmojo-hybrid-state-space-realizations-of-foundation-models-with-eidetic-and-fading-memory




Privacy




论文标题:

Pre-training differentially private models with limited public data

论文作者:

Zhiqi Bu、Xinwei Zhang、Sheng Zha、Mingyi Hong

论文地址:

https://www.amazon.science/publications/pre-training-differentially-private-models-with-limited-public-data




Paper title:

Reconstruction attacks on machine unlearning: Simple models are vulnerable

Authors:

Martin Bertran Lopez, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Zhiwei Steven Wu

Paper link:

https://www.amazon.science/publications/reconstruction-attacks-on-machine-unlearning-simple-models-are-vulnerable




Retrieval-augmented generation (RAG)




Paper title:

RAGChecker: A fine-grained framework for diagnosing retrieval-augmented generation

Authors:

Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, Pengfei Liu, Yue Zhang, Zheng Zhang

Paper link:

https://www.amazon.science/publications/ragchecker-a-fine-grained-framework-for-diagnosing-retrieval-augmented-generation




Speech processing




Paper title:

CA-SSLR: Condition-aware self-supervised learning representation for generalized speech processing

Authors:

Yen-Ju Lu, Jing Liu, Thomas Thebaud, Laureano Moro-Velazquez, Ariya Rastrow, Najim Dehak, Jesus Villalba

Paper link:

https://www.amazon.science/publications/ca-sslr-condition-aware-self-supervised-learning-representation-for-generalized-speech-processing



As the paper describes, in the CA-SSLR scheme with its temporal-channel attention conditioner, only the decoder's conditioners and linear projections are trainable; all other parameters are frozen during adaptation. CA-SSLR improves the SSL features by integrating intermediate language-ID (LID) and speaker-verification (SV) conditions while keeping the pretrained parameters frozen (left of the paper's figure); the trainable temporal-channel attention conditioner integrates the language and speaker predictions (right).
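As a rough illustration of that idea, and not the paper's implementation, the sketch below gates frozen backbone features with a small trainable conditioner driven by a condition vector (standing in for an intermediate LID/SV prediction). Every function name, shape, and matrix here is an invented assumption; in the scheme described above, only the conditioner's weights would be updated.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def condition_features(feats, cond, W_proj, W_gate):
    """Apply a temporal-channel attention gate to frozen features.

    feats:  (T, D) features from a frozen, pretrained encoder
    cond:   (C,)   condition vector (e.g., a language-ID embedding)
    W_proj: (C, D) trainable projection of the condition into feature space
    W_gate: (D, D) trainable map producing per-channel gates
    """
    c = cond @ W_proj                     # (D,), broadcast across all T frames
    gate = sigmoid((feats + c) @ W_gate)  # (T, D), gate values in (0, 1)
    return feats * gate                   # conditioned features, same shape

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 8))          # 50 frames, 8 channels (toy sizes)
cond = rng.normal(size=(3,))
W_proj = rng.normal(size=(3, 8)) * 0.1    # only these two matrices would be
W_gate = rng.normal(size=(8, 8)) * 0.1    # trained; the feature extractor stays frozen
out = condition_features(feats, cond, W_proj, W_gate)
```

Because the gate is a sigmoid, the conditioner can only rescale the frozen features channel by channel over time, which keeps the adaptation lightweight relative to fine-tuning the whole encoder.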




Training methods




Paper title:

CoMERA: Computing- and memory-efficient training via rank-adaptive tensor optimization

Authors:

Zi Yang, Ziyue Liu, Samridhi Choudhary, Xinfeng Xie, Cao Gao, Siegfried Kunzmann, Zheng Zhang

Paper link:

https://www.amazon.science/publications/comera-computing-and-memory-efficient-training-via-rank-adaptive-tensor-optimization




Paper title:

Optimal design for human preference elicitation

Authors:

Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton

Paper link:

https://www.amazon.science/publications/optimal-design-for-human-preference-elicitation




Paper title:

Rejection via learning density ratios

Authors:

Alexander Soen, Hisham Husain, Philip Schulz, Vu Nguyen

Paper link:

https://www.amazon.science/publications/rejection-via-learning-density-ratios




Paper title:

Unraveling the gradient descent dynamics of transformers

Authors:

Bingqing Song, Boran Han, Shuai Zhang, Jie Ding, Mingyi Hong

Paper link:

https://www.amazon.science/publications/unraveling-the-gradient-descent-dynamics-of-transformers







