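Scientific research is a rigorous, systematic process of exploration. In the traditional model, researchers first gather background knowledge and propose hypotheses, then design and run experiments, collect and analyze data, and finally report their findings in peer-reviewed manuscripts. Although this cycle has driven the progress of modern science and technology, it remains constrained by human researchers' creativity, expertise, and limited time and resources. For decades, the scientific community has sought to raise scientists' productivity by automating aspects of research. Early computer-assisted research dates back to the 1970s, with systems such as the Automated Mathematician and BACON demonstrating the potential of machines in specific research tasks such as theorem generation and the identification of empirical laws. More recently, systems such as AlphaFold and OpenFold have pioneered the automation of specific research tasks and markedly accelerated progress in their fields. Yet only with the emergence of foundation models and the explosive growth of large language models (LLMs) has the vision of comprehensive AI assistance across multiple research domains become realistic. LLMs such as GPT-4 and LLaMA have set new benchmarks in understanding, generating, and interacting with human language. Their capabilities, which stem from massive datasets and innovative architectures, extend their use beyond traditional natural language processing tasks to more complex, domain-specific challenges. In particular, the ability of LLMs to process massive amounts of data, generate human-like text, and assist with complex decision-making has drawn wide attention in the scientific community, suggesting that LLMs have the potential to fundamentally change how scientific research is conducted, documented, and evaluated.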
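In recent years, the rapid development of large language models (LLMs) has profoundly reshaped the landscape of scientific research, offering unprecedented support at every stage of the research cycle. As the first systematic survey devoted to how LLMs are transforming the scientific research process, the paper analyzes the distinct roles LLMs play in four key stages of research: scientific hypothesis discovery, experiment planning and implementation, scientific writing, and peer review. The survey presents task-specific methodologies and evaluation benchmarks, and by identifying current challenges and proposing future research directions, it both highlights the transformative potential of LLMs and aims to inspire and guide researchers and practitioners in using LLMs to advance scientific inquiry.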
Paper link: https://arxiv.org/abs/2501.04306
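Below are links to papers, code, software tools, and other resources on LLMs in four areas: scientific hypothesis discovery, experiment planning and implementation, scientific writing, and peer review.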
LLMs for Scientific Hypothesis Discovery
-
SciMON
SciMON: Scientific Inspiration Machines Optimized for Novelty
(May. 23, 2023; ACL 2024) (
https://arxiv.org/abs/2305.14259
)
-
MOOSE
Large Language Models for Automated Open-domain Scientific Hypotheses Discovery
(Sep. 6, 2023; ICML AI4Science Workshop Best Poster Award; ACL 2024) (
https://arxiv.org/abs/2309.02726
)
-
MCR
Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design
(Oct. 22, 2023; EMNLP 2023) (
https://arxiv.org/abs/2310.14420
)
-
Large language models are zero shot hypothesis proposers
(Nov. 10, 2023; COLM 2024) (
https://arxiv.org/abs/2311.05965
)
-
FunSearch
Mathematical discoveries from program search with large language models
(Dec. 14, 2023; Nature) (
https://www.nature.com/articles/s41586-023-06924-6
)
-
ChemReasoner
ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback
(Feb. 15, 2024; ICML 2024) (
https://arxiv.org/abs/2402.10980
)
-
SGA
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
(May. 16, 2024; ICML 2024) (
https://arxiv.org/abs/2405.09783
)
-
AIScientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
(Aug. 12, 2024) (
https://arxiv.org/abs/2408.06292
)
-
MLR-Copilot
MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents
(Aug. 26, 2024) (
https://arxiv.org/abs/2408.14033
)
-
IGA
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
(Sep. 6, 2024) (
https://arxiv.org/abs/2409.04109
)
-
SciAgents
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning
(Sep. 9, 2024) (
https://arxiv.org/abs/2409.05556
)
-
Scideator
Scideator: Human-LLM Scientific Idea Generation Grounded in Research-Paper Facet Recombination
(Sep. 23, 2024) (
https://arxiv.org/abs/2409.14634
)
-
MOOSE-Chem
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
(Oct. 9, 2024) (
https://arxiv.org/abs/2410.07076
)
-
VirSci
Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation
(Oct. 12, 2024) (
https://arxiv.org/abs/2410.09403
)
-
CoI
Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
(Oct. 17, 2024) (
https://arxiv.org/abs/2410.13185
)
-
Nova
Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas
(Oct. 18, 2024) (
https://arxiv.org/abs/2410.14255
)
LLMs for Experiment Planning and Implementation
Optimizing Experimental Design
-
Coscientist
Autonomous chemical research with large language models
(Dec. 20, 2023) (
https://doi.org/10.1038/s41586-023-06792-0
)
-
ChemCrow
Augmenting large language models with chemistry tools
(May. 08, 2024) (
https://doi.org/10.1038/S42256-024-00832-8
)
-
CRISPR-GPT
CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments
(Apr. 27, 2024) (
https://doi.org/10.48550/ARXIV.2404.18021
)
-
Navigating Complexity
Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs
(Jul. 10, 2024) (
https://arxiv.org/abs/2402.16713
)
-
HuggingGPT
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
(Dec. 03, 2024) (
https://arxiv.org/abs/2303.17580
)
-
AutoGen
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
(Oct. 03, 2023) (
https://doi.org/10.48550/ARXIV.2308.08155
)
-
LLM-RDF
An automatic end-to-end chemical synthesis development platform powered by large language models
(Nov. 23, 2024) (
https://www.nature.com/articles/s41467-024-54457-x
)
-
Simulating Expert Discussions with Multi-agent for Enhanced Scientific Problem Solving
(Jan. 23, 2024) (
https://aclanthology.org/2024.sdp-1.23/
)
Automating Experimental Processes
Data Preparation
-
Data-Juicer
Data-Juicer: A One-Stop Data Processing System for Large Language Models
(Dec. 20, 2023) (
https://arxiv.org/abs/2309.02033
)
-
Jellyfish
Jellyfish: A Large Language Model for Data Preprocessing
(Oct. 28, 2024) (
https://arxiv.org/abs/2312.01678
)
-
Can Large Language Models Transform Computational Social Science?
(Feb. 26, 2024) (https://arxiv.org/abs/2305.03514)
-
CAAFE
Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering
(Sep. 28, 2023) (
https://arxiv.org/abs/2305.03403
)
-
Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks.
(Jun. 19, 2023) (
https://arxiv.org/abs/2307.10337
)
-
Training Socially Aligned Language Models in Simulated Human Society
(Oct. 28, 2023) (
https://arxiv.org/abs/2305.16960
)
Experiment Execution and Workflow Automation
-
ESM-1b
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
(Dec. 16, 2020) (
https://www.pnas.org/doi/full/10.1073/pnas.2016239118
)
-
ESM-2
Evolutionary-scale prediction of atomic-level protein structure with a language model
(Mar. 16, 2023) (
https://www.science.org/doi/10.1126/science.ade2574
)
-
Controllable protein design with language models
(Aug. 22, 2022) (
https://arxiv.org/abs/2201.07338
)
-
PALM-H3
De novo generation of SARS-CoV-2 antibody CDRH3 with a pre-trained generative large language model
(Aug. 10, 2024) (
https://www.nature.com/articles/s41467-024-50903-y
)
-
Coscientist
Autonomous chemical research with large language models
(Dec. 20, 2023) (
https://doi.org/10.1038/s41586-023-06792-0
)
-
ChemCrow
Augmenting large language models with chemistry tools
(May. 08, 2024) (
https://doi.org/10.1038/S42256-024-00832-8
)
-
Efficient Evolutionary Search Over Chemical Space with Large Language Models
(Jul. 02, 2024) (
https://arxiv.org/abs/2406.16976
)
-
ChatDrug
Conversational Drug Editing Using Retrieval and Domain Feedback
(May. 29, 2023) (
https://arxiv.org/abs/2305.18090
)
-
DrugAssist
DrugAssist: A Large Language Model for Molecule Optimization
(Dec. 28, 2023) (
https://arxiv.org/abs/2401.10334
)
-
Bayesian Optimization of Catalysts With In-context Learning
(Apr. 18, 2024) (
https://arxiv.org/abs/2304.05341
)
-
Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design
(Oct. 22, 2023) (
https://arxiv.org/abs/2310.14420
)
-
ChemReasoner
ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback
(Dec. 09, 2024) (
https://arxiv.org/abs/2402.10980
)
Data Analysis and Interpretation
-
Automated Statistical Model Discovery with Language Models
(Jun. 22, 2024) (
https://arxiv.org/abs/2402.17879
)
-
MentaLLaMA
MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models
(Feb. 04, 2024) (
https://arxiv.org/abs/2309.13567
)
-
Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis
(Feb. 02, 2024) (
https://arxiv.org/abs/2402.01386
)
-
Opening a conversation on responsible environmental data science in the age of large language models
(May. 09, 2024) (
https://www.cambridge.org/core/journals/environmental-data-science/article/opening-a-conversation-on-responsible-environmental-data-science-in-the-age-of-large-language-models/95FD09526541A19436F3A18ADE332953
)
-
DSBench
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
(Sep. 12, 2024) (
https://arxiv.org/abs/2409.07703
)
-
AutoGen
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
(Oct. 03, 2023) (
https://doi.org/10.48550/ARXIV.2308.08155
)
-
LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis
(Oct. 23, 2023) (
https://aclanthology.org/2023.findings-emnlp.669/
)
Benchmarks
-
SUPER
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
(Sep. 11, 2024) (
https://arxiv.org/abs/2409.07440
)
-
MLE-bench
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
(Dec. 20, 2023) (
https://arxiv.org/abs/2410.07095
)
-
ScienceAgentBench
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
(Oct. 07, 2024) (
https://arxiv.org/abs/2410.05080
)
-
Spider2-V
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
(Jul. 15, 2024) (
https://arxiv.org/abs/2407.10956
)
-
MLAgentBench
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
(Oct. 05, 2023) (
https://arxiv.org/abs/2310.03302
)
-
DiscoveryWorld
DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
(Jun. 10, 2024) (
https://arxiv.org/abs/2406.06769
)
-
DSBench
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
(Sep. 12, 2024) (
https://arxiv.org/abs/2409.07703
)
-
DS-1000
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
(Nov. 18, 2022) (
https://arxiv.org/abs/2211.11501
)
-
LAB-Bench
LAB-Bench: Measuring Capabilities of Language Models for Biology Research
(Jul. 14, 2024) (
https://arxiv.org/abs/2407.10362
)
-
AgentBench
AgentBench: Evaluating LLMs as Agents
(Aug. 07, 2023) (
https://arxiv.org/abs/2308.03688
)
-
TaskBench
TaskBench: Benchmarking Large Language Models for Task Automation
(Nov. 30, 2023) (
https://arxiv.org/abs/2311.18760
)
-
CORE-Bench
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
(Sep. 17, 2024) (
https://arxiv.org/abs/2409.11363
)
LLMs for Scientific Paper Writing
Citation Text Generation
-
Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study
(Jul. 30, 2020) (
https://aclanthology.org/2020.acl-main.550/
)
-
Explaining Relationships Among Research Papers
(Feb. 20, 2024) (
https://arxiv.org/abs/2402.13426
)
-
AutoCite
AutoCite: Multi-Modal Representation Fusion for Contextual Citation Generation
(Mar. 08, 2021) (
https://dl.acm.org/doi/10.1145/3437963.3441739
)
-
BACO
BACO: A Background Knowledge- and Content-Based Framework for Citing Sentence Generation
(Aug. 1, 2021) (
https://aclanthology.org/2021.acl-long.116/
)
-
Controllable Citation Sentence Generation with Language Models
(Nov. 14, 2022) (
https://arxiv.org/abs/2211.07066
)
-
Intent-Controllable Citation Text Generation
(May. 21, 2022) (
https://www.mdpi.com/2227-7390/10/10/1763
)
Related Work Generation
-
Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition
(Feb. 19, 2024) (
https://arxiv.org/abs/2402.12255
)
-
Leveraging Large Language Models for Literature Review Tasks - A Case Study Using ChatGPT
(Dec. 20, 2023) (
https://link.springer.com/chapter/10.1007/978-3-031-48858-0_25
)
-
LitLLM
LitLLM: A Toolkit for Scientific Literature Review
(Feb. 02, 2024) (
https://arxiv.org/abs/2402.01788
)
-
HiReview
HiReview: Hierarchical Taxonomy-Driven Automatic Literature Review Generation
(Oct. 02, 2024) (
https://arxiv.org/abs/2410.03761
)
-
Towards a Unified Framework for Reference Retrieval and Related Work Generation
(Dec. 06, 2023) (
https://aclanthology.org/2023.findings-emnlp.385
)
-
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
(Apr. 08, 2024) (
https://arxiv.org/abs/2404.08680
)
-
Reinforced Subject-Aware Graph Neural Network for Related Work Generation
(Jul. 26, 2024) (
https://link.springer.com/chapter/10.1007/978-981-97-5492-2_16
)
-
Toward Structured Related Work Generation with Novelty Statements
(Jul. 26, 2024) (
https://aclanthology.org/2024.sdp-1.5
)
Drafting and Writing
-
Generating Scientific Definitions with Controllable Complexity
(May. 22, 2022) (
https://aclanthology.org/2022.acl-long.569
)
-
SciCap
SciCap: Generating Captions for Scientific Figures
(Nov. 07, 2021) (
https://aclanthology.org/2021.findings-emnlp.277
)
-
CoAuthor
CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities
(Apr. 29, 2022) (
https://arxiv.org/abs/2407.02352
)
-
Autonomous LLM-driven research from data to human-verifiable research papers
(Apr. 24, 2024) (
https://arxiv.org/abs/2404.17605
)
-
PaperRobot
PaperRobot: Incremental Draft Generation of Scientific Ideas
(Jun. 28, 2019) (
https://aclanthology.org/P19-1191
)
-
AutoSurvey
AutoSurvey: Large Language Models Can Automatically Write Surveys
(Jun. 10, 2024) (
https://arxiv.org/abs/2406.10252
)
-
AI Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
(Aug. 12, 2024) (
https://arxiv.org/abs/2408.06292
)
-
CycleResearcher
CycleResearcher: Improving Automated Research via Automated Review
(Oct. 28, 2024) (
https://arxiv.org/abs/2411.00816
)
Benchmarks