[LG] "Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models" J Singh, T Chakraborty, A Nambi [Microsoft Research & IIT Delhi] (2025)
#Machine Learning#
#Artificial Intelligence#
#Papers#
#AI Creation Camp#