Column: 机器学习研究会 (Machine Learning Research Society)

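机器学习研究会 is a student organization under Peking University's Center for Big Data and Machine Learning Innovation (北京大学大数据与机器学习创新中心). It aims to build a platform where machine learning practitioners can exchange ideas. Besides sharing timely news from the field, the society hosts talks by industry leaders and prominent academics, salon-style sharing sessions with leading researchers, real-data innovation competitions, and other events.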

[Recommended] Practical pandas tips: reduce memory usage by 90%

机器学习研究会 · WeChat official account · AI · 2017-08-11 22:28


Reposted from: 爱可可-爱生活

When working with small data in pandas (under 100 megabytes), performance is rarely a problem. With larger data (100 megabytes to multiple gigabytes), performance issues can make run times much longer and can cause code to fail entirely because of insufficient memory.

While tools like Spark can handle large data sets (100 gigabytes to multiple terabytes), taking full advantage of their capabilities usually requires more expensive hardware. And unlike pandas, they lack rich feature sets for high-quality data cleaning, exploration, and analysis. For medium-sized data, we’re better off trying to get more out of pandas rather than switching to a different tool.
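As a rough illustration of what "getting more out of pandas" can look like for memory, here is a minimal sketch. It is not taken from the excerpt above: the data, the column names, and the `mem_bytes` helper are all hypothetical. It assumes that much of a DataFrame's footprint comes from default 64-bit numeric types and from repeated strings, and compares memory before and after choosing smaller types.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented purely for illustration.
n = 100_000
df = pd.DataFrame({
    "count": np.arange(n, dtype="int64") % 100,   # small integers stored as int64
    "score": np.linspace(0.0, 1.0, n),            # float64 values
    "team": np.array(["red", "green", "blue"])[np.arange(n) % 3],  # repeated strings
})


def mem_bytes(frame):
    """Total memory in bytes, counting the contents of string columns too."""
    return int(frame.memory_usage(deep=True).sum())


before = mem_bytes(df)

slim = df.copy()
# Ask pandas for the smallest unsigned integer type that holds the values.
slim["count"] = pd.to_numeric(slim["count"], downcast="unsigned")
# Ask pandas for a smaller float type where the values allow it.
slim["score"] = pd.to_numeric(slim["score"], downcast="float")
# Store each distinct string once and keep small integer codes per row.
slim["team"] = slim["team"].astype("category")

after = mem_bytes(slim)
print(f"before: {before / 1024**2:.2f} MB, after: {after / 1024**2:.2f} MB")
```

How much is saved depends entirely on the data. Downcasting floats trades away precision, and a categorical column only helps when the number of distinct values is small relative to the number of rows.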
