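Column: 机器学习研究会 (Machine Learning Research Society)
The Machine Learning Research Society is a student organization under the Center for Big Data and Machine Learning Innovation at Peking University. It aims to build a platform where machine learning practitioners can exchange ideas. Besides sharing news from the field promptly, the society hosts talks by industry leaders and leading academics, academic salon sessions, real-data innovation competitions, and other events.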

[Learning] Two Powerful Tools for Kaggle Data Mining Competitions: xgboost and Model Ensembling

机器学习研究会 (Machine Learning Research Society)  · WeChat official account  · AI  · 2017-04-15 18:59




Summary
 

Reposted from: KevinRush

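Two powerful tools for Kaggle data mining competitions are xgboost and model ensembling. Combining several single models can reduce bias and variance, control overfitting, and improve accuracy. The article below explains why ensembling has these effects and introduces several common ensembling methods: (weighted) voting, averaging, stacking, and blending.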


Model ensembling is a very powerful technique to increase accuracy on a variety of ML tasks. In this article I will share my ensembling approaches for Kaggle Competitions.


For the first part we look at creating ensembles from submission files. The second part will look at creating ensembles through stacked generalization/blending.


I answer why ensembling reduces the generalization error. Finally I show different methods of ensembling, together with their results and code to try it out for yourself.

“This is how you win ML competitions: you take other people’s work and ensemble them together.” Vitaly Kuznetsov, NIPS 2014



Creating ensembles from submission files

The most basic and convenient way to ensemble is to ensemble Kaggle submission CSV files. You only need the predictions on the test set for these methods — no need to retrain a model. This makes it a quick way to ensemble already existing model predictions, ideal when teaming up.
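As a rough illustration of working only from submission files, the sketch below averages the predicted value per row across several CSVs. Averaging is one of the ensembling methods this article lists; the column names `id` and `prediction` and the helper name `average_submissions` are assumptions made for the example, not the format of any particular competition or the author's own script.

```python
import csv
import io
from collections import defaultdict


def average_submissions(streams, id_col="id", pred_col="prediction"):
    """Average the predicted value per id across several submission CSVs.

    `streams` is a list of open text files (or any iterables of CSV lines).
    The column names are hypothetical defaults; adjust them to the
    competition's actual submission format.
    """
    streams = list(streams)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stream in streams:
        for row in csv.DictReader(stream):
            totals[row[id_col]] += float(row[pred_col])
            counts[row[id_col]] += 1

    # Every id must appear exactly once in every file, otherwise the
    # average would silently mix different numbers of models per row.
    for row_id, seen in counts.items():
        if seen != len(streams):
            raise ValueError(f"id {row_id!r} appears in {seen} of {len(streams)} files")

    return {row_id: totals[row_id] / len(streams) for row_id in totals}


if __name__ == "__main__":
    # Two tiny in-memory "submission files" stand in for real CSVs on disk.
    model_a = io.StringIO("id,prediction\n1,0.2\n2,0.8\n")
    model_b = io.StringIO("id,prediction\n1,0.4\n2,0.6\n")
    for row_id, value in average_submissions([model_a, model_b]).items():
        print(f"{row_id},{value:.3f}")
```

With real files, the same function can be called on handles from `open(path, newline="")`. No model is retrained at any point; only the test-set predictions are combined.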


Voting ensembles.

We first take a look at a simple majority vote ensemble. Let’s see why model ensembling reduces error rate and why it works better to ensemble low-correlated model predictions.
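Both points can be checked with a small sketch. The first function computes how often a majority of models is right under the simplifying assumption that the models err independently, which real models rarely do. The second part applies a majority vote to toy prediction strings (ground truth is all 1s); these strings are illustrative inputs chosen for the example, not results from any competition.

```python
from collections import Counter
from math import comb


def majority_vote_accuracy(n_models, p):
    """Probability that a strict majority of n independent models,
    each correct with probability p, gives the right answer."""
    needed = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p**k * (1 - p) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


def majority_vote(predictions):
    """Per-sample majority label across equally long prediction sequences.

    With an odd number of binary models there are no ties; otherwise
    Counter breaks ties in favor of the label it saw first.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def accuracy(predicted, truth):
    """Fraction of positions where the prediction matches the truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    # Three independent 70%-accurate models: 0.7^3 + 3 * 0.7^2 * 0.3
    print(f"{majority_vote_accuracy(3, 0.7):.3f}")  # 0.784

    truth = "1111111111"
    # Highly correlated models: the vote cannot fix their shared mistakes.
    correlated = ["1111111100", "1111111100", "1011111100"]
    # Less correlated (and individually weaker) models: errors rarely overlap.
    uncorrelated = ["1111111100", "0111011101", "1000101111"]
    print(accuracy(majority_vote(correlated), truth))    # 0.8
    print(accuracy(majority_vote(uncorrelated), truth))  # 0.9
```

In the toy data the less correlated trio contains weaker individual models (80%, 70%, 60%) yet its vote scores higher, because the models seldom make the same mistake on the same sample.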


Link:

https://mlwave.com/kaggle-ensembling-guide/


Code link:

https://github.com/MLWave/Kaggle-Ensemble-Guide


Original post link:

http://weibo.com/3983872447/EEqibcLw7?type=comment#_rnd1492246619714
