Column: 统计之都 (Capital of Statistics)
A professional, people-oriented, and upright Chinese statistics portal

Cloud Lecture Preview | Dong Xia: Online Policy Learning and Inference by Matrix Completion

统计之都 · WeChat official account · 2024-05-08 12:00



Talk Information

Topic: Online Policy Learning and Inference by Matrix Completion

Speaker: Dong Xia (夏冬)

Location: Tencent Meeting, ID 920-361-274 (or click "Read the original")

Time: Friday, May 10, 2024, 20:00

Abstract


Making online decisions can be challenging when features are sparse and orthogonal to historical ones, especially when the optimal policy is learned through collaborative filtering. We formulate the problem as a matrix completion bandit (MCB), where the expected reward under each arm is characterized by an unknown low-rank matrix. The ε-greedy bandit and the online gradient descent algorithm are explored. Policy learning and regret performance are studied under a specific schedule for exploration probabilities and step sizes. A faster decaying exploration probability yields smaller regret but learns the optimal policy less accurately. We investigate an online debiasing method based on inverse propensity weighting (IPW) and a general framework for online policy inference. The IPW-based estimators are asymptotically normal under mild arm-optimality conditions. Numerical simulations corroborate our theoretical findings. Our methods are applied to the San Francisco parking pricing project data, revealing intriguing discoveries and outperforming the benchmark policy.
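To make the setting in the abstract concrete, here is a minimal toy sketch of an ε-greedy matrix completion bandit with an online gradient descent update. It is not the speaker's algorithm or schedule: the function name `simulate_mcb`, the two-arm setup, the uniformly sampled single-entry context, the factorized estimate, the step size `eta`, and the exploration schedule `eps_t = t**(-gamma)` are all assumptions chosen for illustration. The IPW debiasing and inference steps are not included.

```python
import numpy as np

def simulate_mcb(d1=20, d2=20, rank=2, T=20000, eta=0.05,
                 gamma=0.3, noise=0.1, seed=0):
    """Toy two-arm matrix completion bandit (illustrative assumptions only).

    Returns the cumulative regret and the relative Frobenius error of the
    estimated reward matrix for each arm.
    """
    rng = np.random.default_rng(seed)
    # Unknown low-rank expected-reward matrix for each arm.
    M = [rng.normal(size=(d1, rank)) @ rng.normal(size=(rank, d2)) / np.sqrt(rank)
         for _ in range(2)]
    # Factorized estimates U[a] @ V[a].T with a small random initialization.
    U = [0.1 * rng.normal(size=(d1, rank)) for _ in range(2)]
    V = [0.1 * rng.normal(size=(d2, rank)) for _ in range(2)]

    regret = 0.0
    for t in range(1, T + 1):
        # Assumed context: one uniformly sampled entry (i, j), i.e. a sparse
        # feature that is orthogonal to every other entry.
        i, j = rng.integers(d1), rng.integers(d2)
        est = [U[a][i] @ V[a][j] for a in range(2)]
        greedy = int(est[1] > est[0])

        # epsilon-greedy: explore uniformly with a decaying probability.
        eps = min(1.0, t ** (-gamma))
        arm = rng.integers(2) if rng.random() < eps else greedy

        reward = M[arm][i, j] + noise * rng.normal()
        regret += max(M[0][i, j], M[1][i, j]) - M[arm][i, j]

        # One stochastic gradient step on the squared loss, chosen arm only.
        resid = est[arm] - reward
        u_old = U[arm][i].copy()
        U[arm][i] -= eta * resid * V[arm][j]
        V[arm][j] -= eta * resid * u_old

    err = [float(np.linalg.norm(U[a] @ V[a].T - M[a]) / np.linalg.norm(M[a]))
           for a in range(2)]
    return regret, err
```

Running `simulate_mcb` with a small and a large `gamma` and comparing the returned regret and estimation errors is one way to explore the trade-off the abstract describes between faster-decaying exploration and the accuracy of the learned policy.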


Speaker Bio

Dr. Dong XIA is an Assistant Professor in the Department of Mathematics at the Hong Kong University of Science and Technology. He was a Postdoctoral Research Scientist in the Department of Statistics at Columbia University and a Visiting Assistant Professor in the Department of Statistics at the University of Wisconsin-Madison. He obtained his Ph.D. in Computational Science and Engineering and Mathematics from the Georgia Institute of Technology in 2016. His research interests lie in high-dimensional statistics, machine learning theory, and optimization. He is currently an Associate Editor of the Journal of Statistical Planning and Inference.






