Editor's note: Parts of this article are translated from the source below; our thanks to the authors.
Source: Cattaneo M D, Idrobo N, Titiunik R. A Practical Introduction to Regression Discontinuity Designs: Foundations [M]. Cambridge University Press, 2019.
* Initial data preparation
use "https://file.lianxh.cn/data/RDD_Cattaneo_regdata0.dta", clear
* save RDD_Cattaneo_regdata0.dta, replace  // save a local copy
* use RDD_Cattaneo_regdata0.dta, clear
rename hischshr1520f Y
rename i94 T
gen X = vi1*100
replace Y = Y*100
cls
* Corresponding Stata code
reg Y X if X < 0 & X >= -20
matrix coef_left = e(b)
local intercept_left = coef_left[1, 2]
reg Y X if X >= 0 & X <= 20
matrix coef_right = e(b)
local intercept_right = coef_right[1, 2]
local difference = `intercept_right' - `intercept_left'
dis "The RD estimator is `difference'"
# Corresponding R code
out = lm(Y[X < 0 & X >= -20] ~ X[X < 0 & X >= -20])
left_intercept = out$coefficients[1]
out = lm(Y[X >= 0 & X <= 20] ~ X[X >= 0 & X <= 20])
right_intercept = out$coefficients[1]
difference = right_intercept - left_intercept
print(paste("The RD estimator is", difference, sep = " "))
. reg Y X if X < 0 & X >= -20
      Source |       SS           df       MS      Number of obs   =       610
-------------+----------------------------------   F(1, 608)       =     12.65
       Model |  1114.58439         1  1114.58439   Prob > F        =    0.0004
    Residual |  53577.4144       608  88.1207473   R-squared       =    0.0204
-------------+----------------------------------   Adj R-squared   =    0.0188
       Total |  54691.9988       609  89.8062377   Root MSE        =    9.3873

------------------------------------------------------------------------------
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           X |  -.2398405   .0674381    -3.56   0.000    -.3722804   -.1074007
       _cons |   12.73323   .7757093    16.41   0.000     11.20983    14.25662
------------------------------------------------------------------------------
. reg Y X if X >= 0 & X <= 20
      Source |       SS           df       MS      Number of obs   =       282
-------------+----------------------------------   F(1, 280)       =      0.93
       Model |  78.2903403         1  78.2903403   Prob > F        =    0.3361
    Residual |  23607.8344       280  84.3136941   R-squared       =    0.0033
-------------+----------------------------------   Adj R-squared   =   -0.0003
       Total |  23686.1247       281  84.2922587   Root MSE        =    9.1822

------------------------------------------------------------------------------
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           X |  -.0989301   .1026652    -0.96   0.336    -.3010237    .1031636
       _cons |   15.29072   .9384934    16.29   0.000     13.44332    17.13812
------------------------------------------------------------------------------
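To make the logic of the split-sample estimator concrete outside of Stata/R, here is a minimal Python sketch: fit a separate line on each side of the cutoff within the bandwidth and take the difference of the two intercepts. The data-generating process and the jump of 3 are invented for illustration; this is not the dataset used above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-100, 100, n)                 # running variable, cutoff at 0
jump = 3.0                                    # true discontinuity (invented)
y = 12 - 0.2 * x + jump * (x >= 0) + rng.normal(0, 2, n)

h = 20                                        # bandwidth, as in the text
left = (x < 0) & (x >= -h)
right = (x >= 0) & (x <= h)

# Fit a line on each side; the intercept is the fitted value at x = 0
b_left = np.polyfit(x[left], y[left], 1)      # returns [slope, intercept]
b_right = np.polyfit(x[right], y[right], 1)

rd_estimate = b_right[1] - b_left[1]          # difference of intercepts
print(f"The RD estimator is {rd_estimate:.3f}")   # close to the true jump of 3
```

This mirrors the `reg Y X if ...` / `lm(...)` pairs above: each side's intercept estimates the conditional mean of Y at the cutoff, and their difference is the RD estimate.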
* Corresponding Stata code
gen weights = .
replace weights = (1 - abs(X / 20)) if X < 0 & X >= -20
replace weights = (1 - abs(X / 20)) if X >= 0 & X <= 20
reg Y X [aw = weights] if X < 0 & X >= -20
matrix coef_left = e(b)
local intercept_left = coef_left[1, 2]
reg Y X [aw = weights] if X >= 0 & X <= 20
matrix coef_right = e(b)
local intercept_right = coef_right[1, 2]
local difference = `intercept_right' - `intercept_left'
dis "The RD estimator is `difference'"
# Corresponding R code
w = rep(NA, length(X))
w[X < 0 & X >= -20] = 1 - abs(X[X < 0 & X >= -20] / 20)
w[X >= 0 & X <= 20] = 1 - abs(X[X >= 0 & X <= 20] / 20)
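The triangular kernel downweights observations linearly, from weight 1 at the cutoff to weight 0 at the edge of the bandwidth. A minimal Python sketch of the same weighted-least-squares step, again on invented synthetic data (not the data used in the text):

```python
import numpy as np

rng = np.random.default_rng(1)
h = 20.0
n = 2000
x = rng.uniform(-h, h, n)                     # running variable inside the bandwidth
y = 12 - 0.2 * x + 3.0 * (x >= 0) + rng.normal(0, 2, n)  # invented DGP, jump = 3

# Triangular kernel weights: 1 at the cutoff, declining to 0 at +/- h
w = 1 - np.abs(x / h)

def wls_intercept(xs, ys, ws):
    """Weighted least squares of ys on [1, xs]; returns the intercept."""
    X = np.column_stack([np.ones_like(xs), xs])
    sw = np.sqrt(ws)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], ys * sw, rcond=None)
    return beta[0]

left, right = x < 0, x >= 0
rd_estimate = (wls_intercept(x[right], y[right], w[right])
               - wls_intercept(x[left], y[left], w[left]))
print(f"Triangular-kernel RD estimate: {rd_estimate:.3f}")
```

This is what `[aw = weights]` in Stata and `weights =` in R's `lm()` do internally: residuals near the cutoff count more, so the fitted intercepts lean on the most relevant observations.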
* Corresponding Stata code
rdrobust Y X, kernel(uniform) p(1) h(20)
# Corresponding R code
out = rdrobust(Y, X, kernel = "uniform", p = 1, h = 20)
. rdrobust Y X, kernel(uniform) p(1) h(20)
Sharp RD estimates using local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           BW type       =     Manual
     Number of obs |      2317         317           Kernel        =    Uniform
Eff. Number of obs |       610         282           VCE method    =         NN
    Order est. (p) |         1           1
    Order bias (q) |         2           2
       BW est. (h) |    20.000      20.000
       BW bias (b) |    20.000      20.000
         rho (h/b) |     1.000       1.000
When first using the rdrobust command, the kernel() option sets the kernel function, p() sets the polynomial order, and h() sets the bandwidth. The output is rich in detail. The upper-right panel shows that the regression uses 2634 observations, the kernel is uniform, the bandwidth was set manually, and the variance is estimated with the nearest-neighbor (NN) method. The left panel resembles the rdplot output, reporting the number of observations on each side of the cutoff, the effective number used in estimation, the bandwidth, and the polynomial order. The point estimate is reported last, and it is very close to the split-sample regression results above.
With rdrobust, switching to a triangular kernel is easy: it only requires changing the kernel() option.
* Corresponding Stata code
rdrobust Y X, kernel(triangular) p(1) h(20)
# Corresponding R code
out = rdrobust(Y, X, kernel = "triangular", p = 1, h = 20)
. rdrobust Y X, kernel(triangular) p(1) h(20)
Sharp RD estimates using local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           BW type       =     Manual
     Number of obs |      2317         317           Kernel        = Triangular
Eff. Number of obs |       610         282           VCE method    =         NN
    Order est. (p) |         1           1
    Order bias (q) |         2           2
       BW est. (h) |    20.000      20.000
       BW bias (b) |    20.000      20.000
         rho (h/b) |     1.000       1.000
* Corresponding Stata code
rdrobust Y X, kernel(triangular) p(2) h(20)
# Corresponding R code
out = rdrobust(Y, X, kernel = "triangular", p = 2, h = 20)
. rdrobust Y X, kernel(triangular) p(2) h(20)
Sharp RD estimates using local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           BW type       =     Manual
     Number of obs |      2317         317           Kernel        = Triangular
Eff. Number of obs |       610         282           VCE method    =         NN
    Order est. (p) |         2           2
    Order bias (q) |         3           3
       BW est. (h) |    20.000      20.000
       BW bias (b) |    20.000      20.000
         rho (h/b) |     1.000       1.000
* Corresponding Stata code
rdbwselect Y X, kernel(triangular) p(1) bwselect(mserd)
# Corresponding R code
out = rdbwselect(Y, X, kernel = "triangular", p = 1, bwselect = "mserd")
. rdbwselect Y X, kernel(triangular) p(1) bwselect(mserd)
Bandwidth estimators for sharp RD local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           Kernel        = Triangular
     Number of obs |      2317         317           VCE method    =         NN
          Min of X |  -100.000       0.000
          Max of X |    -0.098      99.051
    Order est. (p) |         1           1
    Order bias (q) |         2           2
Outcome: Y. Running variable: X.
--------------------------------------------------------------------------------
                   |       BW est. (h)        |       BW bias (b)
            Method |   Left of c   Right of c |   Left of c   Right of c
-------------------+--------------------------+---------------------------------
             mserd |     18.034       18.034  |     30.330       30.330
--------------------------------------------------------------------------------
* Corresponding Stata code
rdbwselect Y X, kernel(triangular) p(1) bwselect(msetwo)
# Corresponding R code
out = rdbwselect(Y, X, kernel = "triangular", p = 1, bwselect = "msetwo")
. rdbwselect Y X, kernel(triangular) p(1) bwselect(msetwo)
Bandwidth estimators for sharp RD local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           Kernel        = Triangular
     Number of obs |      2317         317           VCE method    =         NN
          Min of X |  -100.000       0.000
          Max of X |    -0.098      99.051
    Order est. (p) |         1           1
    Order bias (q) |         2           2
Outcome: Y. Running variable: X.
--------------------------------------------------------------------------------
                   |       BW est. (h)        |       BW bias (b)
            Method |   Left of c   Right of c |   Left of c   Right of c
-------------------+--------------------------+---------------------------------
            msetwo |     19.371       15.830  |     31.017       26.432
--------------------------------------------------------------------------------
* Corresponding Stata code
rdrobust Y X, kernel(triangular) p(1) bwselect(mserd)
# Corresponding R code
out = rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd")
. rdrobust Y X, kernel(triangular) p(1) bwselect(mserd)
Sharp RD estimates using local polynomial regression.
     Cutoff c = 0 | Left of c  Right of c            Number of obs =       2634
-------------------+----------------------           BW type       =      mserd
     Number of obs |      2317         317           Kernel        = Triangular
Eff. Number of obs |       551         272           VCE method    =         NN
    Order est. (p) |         1           1
    Order bias (q) |         2           2
       BW est. (h) |    18.034      18.034
       BW bias (b) |    30.330      30.330
         rho (h/b) |     0.595       0.595
* Corresponding Stata code
rdrobust Y X, p(1) kernel(triangular) bwselect(mserd)
local bandwidth = e(h_l)
rdplot Y X if abs(X) <= `bandwidth', p(1) h(`bandwidth') kernel(triangular)
# Corresponding R code
bandwidth = rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd")$bws[1, 1]
out = rdplot(Y[abs(X) <= bandwidth], X[abs(X) <= bandwidth], p = 1,
             kernel = "triangular", cex.axis = 1.5, cex.lab = 1.5)