Replicating a Paper with the AITurk Platform

2025-01-07


Author: Qiu Yiqi (邱一崎), Sichuan University
Email: [email protected]

Editor's note: This post is based mainly on the following papers, with thanks to their authors.

  • Qin X, Huang M, Ding J. AITurk: Using ChatGPT for Social Science Research[J]. Available at SSRN 4922861, 2024.
  • Dang C T, Volpone S D, Umphress E E. The ethics of diversity ideology: Consequences of leader diversity ideology on ethical leadership perception and organizational citizenship behavior[J]. Journal of Applied Psychology, 2023, 108(2): 307.

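1. Overview

Professor Qin Xin of Sun Yat-sen University and collaborators recently studied how well artificial intelligence (AI) can stand in for real human participants in research. The study built the AITurk platform (www.aiturk.cc), which uses ChatGPT to mimic human responses and collects data with AI acting as the participants. The team replicated 22 studies published in top psychology journals between January and June 2023, laying a foundation for applying AI to simulate human behavior.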

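The results show that ChatGPT reproduced about 93.2% of the studies' conclusions. Compared with the online crowdsourcing platform MTurk, AITurk offers the following advantages in data collection (taking 300 participants as an example):

  • High accuracy: AITurk reproduced 93.2% of the findings in the replications;
  • Low cost: AITurk costs only 6 USD, versus 180 USD on MTurk;
  • Fast: AITurk takes only 0.5 min, versus 720 min on MTurk;
  • Efficient: a small sample of only 30 participants per group is enough to replicate the results.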
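To put the cost and time figures above in proportion, here is a short R sketch that computes the ratios. It uses only the numbers quoted in the list; the variable names are illustrative.

```r
# Figures quoted above for collecting 300 participants
cost_usd <- c(AITurk = 6, MTurk = 180)
time_min <- c(AITurk = 0.5, MTurk = 720)

# How many times cheaper and faster AITurk is than MTurk
cost_ratio <- cost_usd[["MTurk"]] / cost_usd[["AITurk"]]
time_ratio <- time_min[["MTurk"]] / time_min[["AITurk"]]

# Cost per participant on each platform
cost_per_participant <- cost_usd / 300

cat("Cost ratio:", cost_ratio, "\n")   # 30
cat("Time ratio:", time_ratio, "\n")   # 1440
print(cost_per_participant)            # AITurk 0.02, MTurk 0.60
```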

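Note, however, that AITurk focuses on replicating main effects rather than moderation or mediation effects. This post uses AITurk to replicate one paper and walks through how the platform is used.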

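2. Data Background

Professor Qin's paper replicated 22 studies; this post replicates one of them, "The ethics of diversity ideology: Consequences of leader diversity ideology on ethical leadership perception and organizational citizenship behavior". Its hypotheses and experimental procedure are given below. Because the focus here is on how to use AITurk to reproduce the data of a top-journal paper and test its basic hypotheses, the paper's research background and core concepts are not covered.

  • H1: Followers perceive leaders who hold an identity-conscious ideology (multicultural) as more ethical than leaders who hold an identity-blind ideology (colorblind, meritocracy, assimilation).
  • H2: Followers' institutional discrimination awareness moderates the effect of leader diversity ideology (multicultural vs. colorblind, meritocracy, assimilation) on ethical leadership perception.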

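Before collecting data on AITurk, we need the basic information about the experiment: participant demographics, experimental design, and variable measures.

Participant demographics:

  • Total sample size: 283 (to save cost, this post uses only 30 samples to test the hypotheses)
  • Share of women: 43%
  • Mean age: 31.42
  • Age variance: 13.12
  • Location: United States of America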

Experimental design (to keep the input to AITurk as close as possible to the original paper, the content below is shown in English):

  • Experiment type: scenario-based between-subjects experiment

  • Procedure: We used an adapted experimental design from previous leadership research (Dang et al., 2017). Participants were told that they were completing a multipart study about leadership. In Part 1, participants were instructed to write a short essay about a time when they were a leader, and what they were asked/task to do. We did this to set up the leader–follower simulation in Part 2, where participants were told that they would be randomly presented with another participant’s essay to review. In Part 3, participants were told that they would be participating in a team task led by the participant whose leadership essay they read in Part 2. To enhance realism, when participants submitted their personal essay in Part 1, the computer initiated a wait screen with a continuous circle until the supposedly randomly selected essay appeared for participants’ assessment. After completing Parts 1–3, participants were directed to an ostensibly different study where institutional discrimination awareness, controls, and demographics were measured. We manipulated leader diversity ideology in Part 2 by varying the statements the leader made in their essay about managing race/ethnic diversity in the workplace.

  • Conditions: 4 groups

    • multicultural group: As someone who has had many work experiences, I can talk about a time when I was asked to deliver a speech to my work unit about how to manage diversity in the workplace. Personally, I believe that multiculturalism is the best way to approach diversity. We are all human. But it is important to acknowledge and value people’s racial/ethnic differences. So as a leader, employees’ racial/ethnic differences are important to me and should be valued. In fact, I think it’s important for leaders of companies to expose their followers to the different racial and ethnic groups within the company. As such, I like to create environments where our differences—both racial and ethnic—are recognized. I achieve this by acknowledging people’s heritages.
    • colorblind group: As someone who has had many work experiences, I can talk about a time when I was asked to deliver a speech to my work unit about how to manage diversity in the workplace. Personally, I believe that being colorblind is the best way to approach diversity. We are all human. We should acknowledge and value our similarities rather than our racial/ethnic differences. So as a leader, employees’ racial/ethnic differences aren’t that important to me and shouldn’t be overly valued. In fact, I think it’s important for leaders of companies to not expose their followers to the different racial and ethnic groups within the company. As such, I like to create environments where our differences—both racial and ethnic—are minimized. I achieve this by not acknowledging people’s heritages.
    • meritocracy group: As someone who has had many work experiences, I can talk about a time when I was asked to deliver a speech to my work unit about how to manage diversity in the workplace. Personally, I believe that meritocracy is the best way to approach diversity. We are all human. If we judge people by their merits (meaning we judge them by their qualifications like their job-relevant knowledge, skills and abilities), we shouldn’t acknowledge and value our racial/ethnic differences. So as a leader, so long as employees have been treated in a meritocratic way and judged based on their qualifications, employees’ racial/ethnic differences aren’t that important to me and shouldn’t be overly valued. In fact, I think it’s important for leaders of companies to not expose their followers to the different racial and ethnic groups within the company so long as the company operates under meritocratic ideologies. I achieve this by making sure that policies and procedures are meritocratic.
    • assimilation group: As someone who has had many work experiences, I can talk about a time when I was asked to deliver a speech to my work unit about how to manage diversity in the workplace. Personally, I believe that assimilation is the best way to approach diversity. We are all human. We should put aside our racial/ethnic differences and assimilate to the mainstream group and the mainstream group’s norms. So as a leader, employees’ racial/ethnic differences aren’t that important to me and shouldn’t be overly valued. In fact, I think it’s important for leaders of companies to not expose their followers to the different racial and ethnic groups within the company. I achieve this by having people blend in with their environment.
  • Variable measures: the replication mainly involves two variables, Ethical Leadership Perception (the dependent variable) and Institutional Discrimination Awareness (the moderator).

  • Ethical Leadership Perception is measured with the following items:

  1. The leader seems to conduct his/her personal life in an ethical manner.
  2. The leader seems to define success not just by results but also by the way that the results were obtained.
  3. The leader seems to listen to what employees have to say.
  4. It seems likely that the leader would discipline employees who violate ethical standards.
  5. It seems like the leader would make fair and balanced decisions.
  6. I think the leader can be trusted.
  7. It seems like the leader would discuss business ethics or values with employees.
  8. The leader seems to be able to set an example of how to do things the right way in terms of ethics.
  9. The leader seems to have the best interests of employees in mind.
  10. When making decisions, it seems like the leader would ask “what is the right thing to do?”
  • Institutional Discrimination Awareness is measured with the following items:

    1. Social policies, such as affirmative action, discriminate unfairly against white people.
    2. White people in the U.S. are discriminated against because of the color of their skin.
    3. English should be the only official language in the U.S.
    4. Due to racial discrimination, programs such as affirmative action are necessary to help create equality.
    5. Racial and ethnic minorities in the U.S. have certain advantages because of the color of their skin.
    6. It is important that people begin to think of themselves as American and not African American, Mexican American, or Italian American.
    7. Immigrants should try to fit into the culture and values of the U.S.

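Once the participant, experiment, and measurement information has been collected, copy and paste it into the AITurk platform (www.aiturk.cc). Some screenshots are shown below:

After all fields are filled in, click Train to obtain a sample made up of AI participants. This post collected only 30 samples, and training took 4.2 seconds. The hypotheses are then tested in R.

3. Hypothesis Testing

3.1 Data Preparation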

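    # Load packages (readxl is an assumption; the original reader call was lost)
    library(readxl)

    # Read the data exported from AITurk
    data <- read_excel("AITurk_data (study_title=replicate learning, n=30).xlsx")

    # Item columns for each scale
    var_institutional_dis <- c("institutional discrimination_1","institutional discrimination_2","institutional discrimination_3","institutional discrimination_4","institutional discrimination_5","institutional discrimination_6","institutional discrimination_7")
    var_ethical_leadership_per <- c("Ethical Leadership perception_1","Ethical Leadership perception_2","Ethical Leadership perception_3","Ethical Leadership perception_4","Ethical Leadership perception_5","Ethical Leadership perception_6","Ethical Leadership perception_7","Ethical Leadership perception_8","Ethical Leadership perception_9","Ethical Leadership perception_10")

    # Reverse-score item 4 (8 - x implies a 1-7 response scale)
    data$`institutional discrimination_4` <- 8 - data$`institutional discrimination_4`

    # Combine the items into summary variables via the row mean
    # (the names institutional_dis and ethical_leadership_per are assumed)
    data$institutional_dis <- rowMeans(data[, var_institutional_dis])
    data$ethical_leadership_per <- rowMeans(data[, var_ethical_leadership_per])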
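The `8 - x` step reverse-scores an item that is worded in the opposite direction from the rest of its scale, so that all items point the same way before they are averaged. The sketch below illustrates this on toy data. The helper `reverse_score`, the item names, and the 1 to 7 response range are assumptions made for illustration (the range is inferred from the `8 - x` form).

```r
# Hypothetical helper: reverse-score a Likert item on a scale_min..scale_max scale
reverse_score <- function(x, scale_min = 1, scale_max = 7) {
  scale_min + scale_max - x
}

# Toy responses from three participants (not AITurk data)
toy <- data.frame(
  item_1 = c(1, 4, 7),
  item_2 = c(2, 4, 6),
  item_4 = c(7, 4, 1)   # worded in the opposite direction
)

# After reversal, 7 becomes 1, 4 stays 4, and 1 becomes 7
toy$item_4 <- reverse_score(toy$item_4)

# Summary variable: mean of the aligned items for each participant
toy$scale_mean <- rowMeans(toy[, c("item_1", "item_2", "item_4")])
print(toy)
```

With the defaults, `reverse_score(x)` equals `8 - x`, which matches the step applied to item 4 of the institutional discrimination scale.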


3.2 Main Effect Test

With the data prepared, we can test the main effect, that is, H1.


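The results show that ethical leadership perception in the multicultural group is significantly higher than in each of the other three groups (colorblind, meritocracy, assimilation), so H1 is supported.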
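A main-effect test of this kind can be sketched in R as a one-way ANOVA followed by pairwise comparisons. This is a minimal illustration under stated assumptions, not necessarily the analysis run for this post: the data are simulated, and the column names `group` and `ethical_leadership_per` are hypothetical.

```r
set.seed(123)

# Simulated stand-in for an AITurk export (hypothetical values, not real results)
groups <- c("multicultural", "colorblind", "meritocracy", "assimilation")
sim <- data.frame(
  group = factor(rep(groups, each = 30), levels = groups),
  ethical_leadership_per = c(
    rnorm(30, mean = 5.5, sd = 0.8),  # multicultural
    rnorm(30, mean = 4.5, sd = 0.8),  # colorblind
    rnorm(30, mean = 4.6, sd = 0.8),  # meritocracy
    rnorm(30, mean = 4.4, sd = 0.8)   # assimilation
  )
)

# Omnibus test: does mean perception differ across the four groups?
fit <- aov(ethical_leadership_per ~ group, data = sim)
print(summary(fit))

# Pairwise comparisons with Tukey's HSD; rows whose name contains
# "multicultural" compare that group with each identity-blind group
tukey <- TukeyHSD(fit)
print(tukey$group)
```

With real data, `sim` would be replaced by the prepared data frame and its group column.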

3.3 Moderation Test



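The interaction term in the ANOVA is significant: the interaction between institutional discrimination awareness and leader diversity ideology significantly affects ethical leadership perception. The original hypothesis states that, at high levels of institutional discrimination awareness, ethical leadership perception in the multicultural group is higher than in all three other groups (colorblind, meritocracy, assimilation). Our results, however, show the multicultural group higher than only two of them (colorblind, meritocracy). The moderation effect in H2 is therefore only partially supported.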
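One common way to run such a moderation test is to fit a linear model with a group by moderator interaction, then compare groups at a high level of the moderator. The sketch below assumes simulated data and hypothetical column names. "High" is taken as one standard deviation above the mean, which is a convention chosen here and not something the source specifies.

```r
set.seed(42)

# Simulated stand-in data (hypothetical, not AITurk output)
groups <- c("multicultural", "colorblind", "meritocracy", "assimilation")
sim <- data.frame(
  group = factor(rep(groups, each = 30), levels = groups),
  institutional_dis = runif(120, min = 1, max = 7)
)
# Assumed data-generating process: the moderator matters only for multicultural leaders
is_multi <- as.numeric(sim$group == "multicultural")
sim$ethical_leadership_per <- 4.5 +
  0.4 * is_multi * (sim$institutional_dis - 4) + rnorm(120, sd = 0.7)

# Mean-center the moderator so group coefficients refer to its average level
sim$dis_c <- sim$institutional_dis - mean(sim$institutional_dis)

# The "group:dis_c" row of the ANOVA table is the moderation (interaction) test
fit <- lm(ethical_leadership_per ~ group * dis_c, data = sim)
print(anova(fit))

# Group differences at high awareness (+1 SD): shift the moderator so that
# zero corresponds to +1 SD, then read the group coefficients, which compare
# each group with the reference level (multicultural)
sim$dis_high <- sim$dis_c - sd(sim$dis_c)
fit_high <- lm(ethical_leadership_per ~ group * dis_high, data = sim)
print(coef(summary(fit_high))[2:4, ])
```

A negative group coefficient in `fit_high` means that group scores below the multicultural group when awareness is high.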

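3.4 Robustness Check

To check the robustness of H1 and H2, we collected data on AITurk three separate times (see the screenshots below) and ran the same main-effect and moderation analyses on each sample. The main effect is consistently significant, but the moderation effect changes from one sample to the next.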
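Repeating the same tests over several collection rounds can be sketched as a small loop that gathers the p-values side by side. The function `simulate_round` is a hypothetical stand-in for reading one AITurk export, and all values are simulated, so the output says nothing about the real data.

```r
set.seed(2025)
groups <- c("multicultural", "colorblind", "meritocracy", "assimilation")

# Hypothetical generator standing in for one AITurk collection round
simulate_round <- function(n_per_group = 30) {
  group <- factor(rep(groups, each = n_per_group), levels = groups)
  dis <- runif(length(group), min = 1, max = 7)
  y <- 4.5 + 1.0 * (group == "multicultural") + rnorm(length(group), sd = 0.8)
  data.frame(group = group, institutional_dis = dis, ethical_leadership_per = y)
}

# Run the same two tests on one sample and return their p-values
test_round <- function(d) {
  main <- anova(lm(ethical_leadership_per ~ group, data = d))
  mod <- anova(lm(ethical_leadership_per ~ group * institutional_dis, data = d))
  c(main_effect_p = main["group", "Pr(>F)"],
    interaction_p = mod["group:institutional_dis", "Pr(>F)"])
}

# Three independent rounds, one row of p-values per round
rounds <- replicate(3, simulate_round(), simplify = FALSE)
p_values <- t(sapply(rounds, test_round))
rownames(p_values) <- paste0("round_", 1:3)
print(round(p_values, 4))
```

Laying the rounds out in one table makes it easy to see which effects hold in every sample and which vary.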

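4. Conclusion


  • AITurk is robust and produces high-quality results when testing main effects, and it can replicate articles from top journals.
  • AITurk does not reliably reproduce moderation effects (the moderation result changed with every sample), consistent with the platform's focus on main effects rather than moderation or mediation effects.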

