Applications of Large Language Models in Scientific Research

走天涯徐小洋地理数据科学 · WeChat official account · 2025-01-31 20:14

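Scientific research is a rigorous, systematic process of exploration. In the traditional model, researchers first gather background knowledge and propose hypotheses, then design and run experiments, collect and analyze data, and finally report their results in peer-reviewed manuscripts. Although this cycle has driven the progress of modern science and technology, it remains constrained by human researchers' creativity, expertise, and limited time and resources. For decades, the scientific community has worked to raise scientists' productivity by automating parts of the research process. Early computer-assisted research dates back to the 1970s, with systems such as the Automated Mathematician and BACON, which demonstrated the potential of machines in specific research tasks such as theorem generation and the identification of empirical laws. More recently, systems such as AlphaFold and OpenFold have pioneered the automation of specific research tasks and significantly accelerated progress in their fields. Yet only with the emergence of foundation models and the explosive growth of large language models (LLMs) has the vision of comprehensive AI assistance across many research fields become realistic. LLMs such as GPT-4 and LLaMA have set new benchmarks in understanding, generating, and interacting with human language. Their capabilities, built on massive datasets and innovative architectures, extend their reach beyond traditional natural language processing tasks to more complex, domain-specific challenges. Notably, the ability of LLMs to process massive amounts of data, generate human-like text, and assist complex decision-making has drawn wide attention from the scientific community, suggesting that LLMs could fundamentally change how scientific research is conducted, documented, and evaluated.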

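In recent years, the rapid development of large language models (LLMs) has profoundly reshaped the landscape of scientific research, providing unprecedented support across every stage of the research cycle. The paper linked below, presented as the first systematic survey devoted to how LLMs are transforming the scientific research process, analyzes the distinct roles LLMs play in four key stages of research: scientific hypothesis discovery, experiment planning and implementation, scientific writing, and peer review. The survey presents task-specific methodologies and evaluation benchmarks and, by identifying current challenges and proposing future research directions, both highlights the transformative potential of LLMs and aims to inspire and guide researchers and practitioners in using LLMs to advance scientific exploration.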

Paper link: https://arxiv.org/abs/2501.04306

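Below are links to papers, code, software tools, and other resources on large language models in four areas: scientific hypothesis discovery, experiment planning and implementation, scientific writing, and peer review.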

LLMs for Scientific Hypothesis Discovery

SciMON SciMON: Scientific Inspiration Machines Optimized for Novelty (May. 23, 2023; ACL 2024) ( https://arxiv.org/abs/2305.14259 )
MOOSE Large Language Models for Automated Open-domain Scientific Hypotheses Discovery (Sep. 6, 2023; ICML AI4Science Workshop Best Poster Award; ACL 2024) ( https://arxiv.org/abs/2309.02726 )
MCR Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design (Oct. 22, 2023; EMNLP 2023) ( https://arxiv.org/abs/2310.14420 )
Large language models are zero shot hypothesis proposers (Nov. 10, 2023; COLM 2024) ( https://arxiv.org/abs/2311.05965 )
FunSearch Mathematical discoveries from program search with large language models (Dec. 14, 2023; Nature) ( https://www.nature.com/articles/s41586-023-06924-6 )
ChemReasoner ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback (Feb. 15, 2024; ICML 2024) ( https://arxiv.org/abs/2402.10980 )
SGA LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery (May. 16, 2024; ICML 2024) ( https://arxiv.org/abs/2405.09783 )
AIScientist The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (Aug. 12, 2024) ( https://arxiv.org/abs/2408.06292 )
MLR-Copilot MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents (Aug. 26, 2024) ( https://arxiv.org/abs/2408.14033 )
IGA Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers (Sep. 6, 2024) ( https://arxiv.org/abs/2409.04109 )
SciAgents SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning (Sep. 9, 2024) ( https://arxiv.org/abs/2409.05556 )
Scideator Scideator: Human-LLM Scientific Idea Generation Grounded in Research-Paper Facet Recombination (Sep. 23, 2024) ( https://arxiv.org/abs/2409.14634 )
MOOSE-Chem MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses (Oct. 9, 2024) ( https://arxiv.org/abs/2410.07076 )
VirSci Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation (Oct. 12, 2024) ( https://arxiv.org/abs/2410.09403 )
CoI Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents (Oct. 17, 2024) ( https://arxiv.org/abs/2410.13185 )
Nova Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas (Oct. 18, 2024) ( https://arxiv.org/abs/2410.14255 )

LLMs for Experiment Planning and Implementation

Optimizing Experimental Design

Coscientist Autonomous chemical research with large language models (Dec. 20, 2023) ( https://doi.org/10.1038/s41586-023-06792-0 )
ChemCrow Augmenting large language models with chemistry tools (May. 08, 2024) ( https://doi.org/10.1038/S42256-024-00832-8 )
CRISPR-GPT CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments (Apr. 27, 2024) ( https://doi.org/10.48550/ARXIV.2404.18021 )
Navigating Complexity Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs (Jul. 10, 2024) ( https://arxiv.org/abs/2402.16713 )
HuggingGPT HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (Dec. 03, 2024) ( https://arxiv.org/abs/2303.17580 )
AutoGen AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (Oct. 03, 2023) ( https://doi.org/10.48550/ARXIV.2308.08155 )
LLM-RDF An automatic end-to-end chemical synthesis development platform powered by large language models (Nov. 23, 2024) ( https://www.nature.com/articles/s41467-024-54457-x )
Simulating Expert Discussions with Multi-agent for Enhanced Scientific Problem Solving (Jan. 23, 2024) ( https://aclanthology.org/2024.sdp-1.23/ )

Automating Experimental Processes

Data Preparation

Data-Juicer Data-Juicer: A One-Stop Data Processing System for Large Language Models (Dec. 20, 2023) ( https://arxiv.org/abs/2309.02033 )
Jellyfish Jellyfish: A Large Language Model for Data Preprocessing (Oct. 28, 2024) ( https://arxiv.org/abs/2312.01678 )
Can Large Language Models Transform Computational Social Science? (Feb. 26, 2024) ( https://arxiv.org/abs/2305.03514 )
CAAFE Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering (Sep. 28, 2023) ( https://arxiv.org/abs/2305.03403 )
Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks. (Jun. 19, 2023) ( https://arxiv.org/abs/2307.10337 )
Training Socially Aligned Language Models in Simulated Human Society (Oct. 28, 2023) ( https://arxiv.org/abs/2305.16960 )

Experiment Execution and Workflow Automation

ESM-1b Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences (Dec. 16, 2020) ( https://www.pnas.org/doi/full/10.1073/pnas.2016239118 )
ESM-2 Evolutionary-scale prediction of atomic-level protein structure with a language model (Mar. 16, 2023) ( https://www.science.org/doi/10.1126/science.ade2574 )
Controllable protein design with language models (Aug. 22, 2022) ( https://arxiv.org/abs/2201.07338 )
PALM-H3 De novo generation of SARS-CoV-2 antibody CDRH3 with a pre-trained generative large language model (Aug. 10, 2024) ( https://www.nature.com/articles/s41467-024-50903-y )
Coscientist Autonomous chemical research with large language models (Dec. 20, 2023) ( https://doi.org/10.1038/s41586-023-06792-0 )
ChemCrow Augmenting large language models with chemistry tools (May. 08, 2024) ( https://doi.org/10.1038/S42256-024-00832-8 )
Efficient Evolutionary Search Over Chemical Space with Large Language Models (Jul. 02, 2024) ( https://arxiv.org/abs/2406.16976 )
ChatDrug Conversational Drug Editing Using Retrieval and Domain Feedback (May. 29, 2023) ( https://arxiv.org/abs/2305.18090 )
DrugAssist DrugAssist: A Large Language Model for Molecule Optimization (Dec. 28, 2023) ( https://arxiv.org/abs/2401.10334 )
Bayesian Optimization of Catalysts With In-context Learning (Apr. 18, 2024) ( https://arxiv.org/abs/2304.05341 )
Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design (Oct. 22, 2023) ( https://arxiv.org/abs/2310.14420 )
ChemReasoner CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback (Dec. 09, 2024) ( https://arxiv.org/abs/2402.10980 )

Data Analysis and Interpretation

Automated Statistical Model Discovery with Language Models (Jun. 22, 2024) ( https://arxiv.org/abs/2402.17879 )
MentaLLaMA MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models (Feb. 04, 2024) ( https://arxiv.org/abs/2309.13567 )
Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis (Feb. 02, 2024) ( https://arxiv.org/abs/2402.01386 )
Opening a conversation on responsible environmental data science in the age of large language models (May. 09, 2024) ( https://www.cambridge.org/core/journals/environmental-data-science/article/opening-a-conversation-on-responsible-environmental-data-science-in-the-age-of-large-language-models/95FD09526541A19436F3A18ADE332953 )
DSBench DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? (Sep. 12, 2024) ( https://arxiv.org/abs/2409.07703 )
AutoGen AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (Oct. 03, 2023) ( https://doi.org/10.48550/ARXIV.2308.08155 )
LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis (Oct. 23, 2023) ( https://aclanthology.org/2023.findings-emnlp.669/ )

Benchmarks

SUPER SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories (Sep. 11, 2024) ( https://arxiv.org/abs/2409.07440 )
MLE-bench MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering (Dec. 20, 2023) ( https://arxiv.org/abs/2410.07095 )
ScienceAgentBench ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery (Oct. 07, 2024) ( https://arxiv.org/abs/2410.05080 )
Spider2-V Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? (Jul. 15, 2024) ( https://arxiv.org/abs/2407.10956 )
MLAgentBench MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation (Oct. 05, 2023) ( https://arxiv.org/abs/2310.03302 )
DiscoveryWorld DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents (Jun. 10, 2024) ( https://arxiv.org/abs/2406.06769 )
DSBench DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? (Sep. 12, 2024) ( https://arxiv.org/abs/2409.07703 )
DS-1000 DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation (Nov. 18, 2022) ( https://arxiv.org/abs/2211.11501 )
LAB-Bench LAB-Bench: Measuring Capabilities of Language Models for Biology Research (Jul. 14, 2024) ( https://arxiv.org/abs/2407.10362 )
AgentBench AgentBench: Evaluating LLMs as Agents (Aug. 07, 2023) ( https://arxiv.org/abs/2308.03688 )
TaskBench TaskBench: Benchmarking Large Language Models for Task Automation (Nov. 30, 2023) ( https://arxiv.org/abs/2311.18760 )
CORE-Bench CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark (Sep. 17, 2024) ( https://arxiv.org/abs/2409.11363 )

LLMs for Scientific Paper Writing

Citation Text Generation

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study (Jul. 30, 2020) ( https://aclanthology.org/2020.acl-main.550/ )
Explaining Relationships Among Research Papers (Feb. 20, 2024) ( https://arxiv.org/abs/2402.13426 )

AutoCite AutoCite: Multi-Modal Representation Fusion for Contextual Citation Generation (Mar. 08, 2021) ( https://dl.acm.org/doi/10.1145/3437963.3441739 )
BACO BACO: A Background Knowledge- and Content-Based Framework for Citing Sentence Generation (Aug. 1, 2021) ( https://aclanthology.org/2021.acl-long.116/ )
Controllable Citation Sentence Generation with Language Models (Nov. 14, 2022) ( https://arxiv.org/abs/2211.07066 )
Intent-Controllable Citation Text Generation (May. 21, 2022) ( https://www.mdpi.com/2227-7390/10/10/1763 )

Related Work Generation

Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition (Feb. 19, 2024) ( https://arxiv.org/abs/2402.12255 )
Leveraging Large Language Models for Literature Review Tasks - A Case Study Using ChatGPT (Dec. 20, 2023) ( https://link.springer.com/chapter/10.1007/978-3-031-48858-0_25 )
LitLLM LitLLM: A Toolkit for Scientific Literature Review (Feb. 02, 2024) ( https://arxiv.org/abs/2402.01788 )
HiReview HiReview: Hierarchical Taxonomy-Driven Automatic Literature Review Generation (Oct. 02, 2024) ( https://arxiv.org/abs/2410.03761 )
Towards a Unified Framework for Reference Retrieval and Related Work Generation (Dec. 06, 2023) ( https://aclanthology.org/2023.findings-emnlp.385 )
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning (Apr. 08, 2024) ( https://arxiv.org/abs/2404.08680 )
Reinforced Subject-Aware Graph Neural Network for Related Work Generation (Jul. 26, 2024) ( https://link.springer.com/chapter/10.1007/978-981-97-5492-2_16 )
Toward Structured Related Work Generation with Novelty Statements (Jul. 26, 2024) ( https://aclanthology.org/2024.sdp-1.5 )

Drafting and Writing

Generating Scientific Definitions with Controllable Complexity (May. 22, 2022) ( https://aclanthology.org/2022.acl-long.569 )
SciCap SciCap: Generating Captions for Scientific Figures (Nov. 07, 2021) ( https://aclanthology.org/2021.findings-emnlp.277 )
CoAuthor CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities (Apr. 29, 2022) ( https://arxiv.org/abs/2407.02352 )
Autonomous LLM-driven research from data to human-verifiable research papers (Apr. 24, 2024) ( https://arxiv.org/abs/2404.17605 )
PaperRobot PaperRobot: Incremental Draft Generation of Scientific Ideas (Jun. 28, 2019) ( https://aclanthology.org/P19-1191 )
AutoSurvey AutoSurvey: Large Language Models Can Automatically Write Surveys (Jun. 10, 2024) ( https://arxiv.org/abs/2406.10252 )
AI Scientist The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (Aug. 12, 2024) ( https://arxiv.org/abs/2408.06292 )
CycleResearcher CycleResearcher: Improving Automated Research via Automated Review (Oct. 28, 2024) ( https://arxiv.org/abs/2411.00816 )

Benchmarks

Enabling Large Language Models to Generate Text with Citations
