Column: 曾鸣书院 (Zeng Ming Academy)
A knowledge co-creation community founded by Zeng Ming, Alibaba's "chief of staff" (总参谋长).

What Is a “Weapon of Math Destruction”? | Book Excerpt

曾鸣书院 · WeChat official account · 2017-07-04 17:49



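1. [Unhealthy data models] define their own “reality” and then use that reality to justify themselves. Such models are self-reinforcing and do great harm, and they are extremely common.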

They define their own reality and use it to justify their results. This type of model is self-perpetuating, highly destructive—and very common. 


2. A data model's blind spots also reflect its builders' value judgments and their sense of what matters most.

A model’s blind spots reflect the judgments and priorities of its creators.


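3. Our values and our desires shape our decisions: which data to collect and which questions to ask. A model, in this sense, is an opinion embedded in mathematics.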

Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics. 



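4. At “Big Data companies” such as Google, data scientists run experiments every day and monitor thousands upon thousands of metrics. They can change the text of a single ad from blue to red at will, serve the two versions to tens of millions of people, and record which version draws more clicks. This feedback is used to adjust the algorithms and fine-tune the product experience. I have many reservations about how Google handles data, but at least this kind of experiment is quite effective and statistically sound.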

At Big Data companies like Google, by contrast, researchers run constant tests and monitor thousands of variables. They can change the font on a single advertisement from blue to red, serve each version to ten million people, and keep track of which one gets more clicks. They use this feedback to hone their algorithms and fine-tune their operation. While I have plenty of issues with Google, which we’ll get to, this type of testing is an effective use of statistics.
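The blue-versus-red test described above can be sketched as a comparison of two click-through rates. This is only an illustration of one common way to check such a result (a two-proportion z-test), not the method Google uses; the function name and all click counts are hypothetical and invented for the example.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both versions perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: version A is the blue ad, version B is the red ad.
z, p = two_proportion_z_test(
    clicks_a=20_000, views_a=10_000_000,
    clicks_b=20_600, views_b=10_000_000,
)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With ten million viewers per version, even a small gap in click rate can be told apart from noise, which is why this kind of feedback is usable for tuning.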


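5. If Amazon's recommendation model picked up a faulty correlation and, say, began recommending lawn care manuals to girls born after 2000, clicks on the site would plummet, and engineers would take that as feedback and keep adjusting the model until the matches were right. Without such feedback, a statistical engine can keep producing analysis that is biased or even harmful; more importantly, the model never learns that it was wrong and so has no way to improve.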

If Amazon.com, through a faulty correlation, started recommending lawn care books to teenage girls, the clicks would plummet, and the algorithm would be tweaked until it got it right. Without feedback, however, a statistical engine can continue spinning out faulty and damaging analysis while never learning from its mistakes.


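6. [Compared with police departments predicting recidivism,] Amazon does not settle for far-fetched correlations. The company's business is itself a data laboratory, and if it needs to understand shopping “recidivism,” it does the research. That research does not stop at zip-code areas or people's education levels; the company also analyzes people's experience inside the Amazon ecosystem. It might start with people who bought once or twice and never came back. Did something go wrong at checkout? Did the package arrive on time? Do these people post negative reviews at a higher rate? The questions go on and on, because working through them is the company's lifeblood: its future depends on data models that keep learning and data systems that keep gaining insight into how consumers behave.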

Amazon does not settle for such glib correlations. The company runs a data laboratory. And if it wants to find out what drives shopping recidivism, it carries out research. Its data scientists don’t just study zip codes and education levels. They also inspect people’s experience within the Amazon ecosystem. They might start by looking at the patterns of all the people who shopped once or twice at Amazon and never returned. Did they have trouble at checkout? Did their packages arrive on time? Did a higher percentage of them post a bad review? The questions go on and on, because the future of the company hinges upon a system that learns continually, one that figures out what makes customers tick. 



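7. The same holds even for models that track individual behavior: they derive their “understanding” and assess risk by comparing people with one another and assembling them into groups that behave alike. The underlying predictive logic is that people who behave alike carry the same risk.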

Even the models that track our personal behavior gain many of their insights, and assess risk, by comparing us to others… they assemble groups of us who act in similar ways. The prediction is that those who act alike will take on similar levels of risk.
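The logic in this passage, where each person inherits the risk of the group they resemble, can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the `group_risk` function, the behavior labels, and the records are hypothetical and are not taken from the book or from any real scoring system.

```python
from collections import defaultdict


def group_risk(records):
    """Map each person to the incident rate of their behavior group.

    records: list of (person, behavior_key, had_incident) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # behavior_key -> [incidents, members]
    for _, key, had_incident in records:
        totals[key][0] += int(had_incident)
        totals[key][1] += 1
    # Every member of a group receives the same score, whatever their own record.
    return {person: totals[key][0] / totals[key][1] for person, key, _ in records}


# Hypothetical records, invented for illustration only.
records = [
    ("ann", "night_driver", True),
    ("bob", "night_driver", False),
    ("cat", "day_driver", False),
    ("dan", "day_driver", False),
]
scores = group_risk(records)
print(scores)
```

In this sketch "bob" has no incident of his own, yet he is scored the same as "ann" because they share a group, which is the sense in which such a model is about groups rather than individuals.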


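8. Insurers are already mining data to split us into ever finer tribes and push different products and services at different prices. Some call this “personalization,” but the subject of that “personalization” is not the individual. The models sort us into groups whose behavioral profiles resemble ours but whose boundaries we cannot see. Whether or not the classification is accurate, the fact that its logic is not disclosed makes it easy for companies to engage in predatory pricing.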

Already, insurers are using data to divide us into smaller tribes, to offer us different products and services at varying prices. Some might call this customized service. The trouble is, it’s not individual. The models place us into groups we cannot see, whose behavior appears to resemble ours. Regardless of the quality of the analysis, its opacity can lead to gouging. 



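9. Looking ahead, the vast ocean of behavioral data will surge and flow straight into artificial intelligence systems, systems whose inner workings the naked eye cannot make out. Through such computation, we will rarely get the chance to learn which tribe we “belong” to, let alone why we were placed in it.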

My point is that oceans of behavioral data, in coming years, will feed straight into artificial intelligence systems. And these will remain, to human eyes, black boxes. Throughout this process, we will rarely learn about the tribes we “belong” to or why we belong there. 

