Column: 连享会

Shared by 连玉君's team; homepage: lianxh.cn. Econometrics in plain language, with hands-on code.

Stata: Interpreting Common Factors and Using the xtdcce2 Command

连享会 · WeChat official account · 2024-12-20 22:00


Author: 王胜文 (Shandong University of Finance and Economics)

Editor's note: this post is adapted and translated from "Illustrating the consequences of modeling common factors with Stata (Updated with xtdcce2)", with thanks to the original author.

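1. Introduction

This post grew out of stimulating discussions with Hiro Ito, Eric Clower, and Kamila Kuziemska-Pawlak.

In 2015, Jamel, the author of the original post, wrote the paper Global Imbalances: Should We Use Fundamental Equilibrium Exchange Rates?, which asks whether fundamental equilibrium exchange rates should be used to reduce global imbalances.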

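2. Main Text

That paper responds to Zhou (1993), Fundamental Equilibrium Exchange Rates and Exchange Rate Dynamics. Zhou argues that exchange rates do not revert to their equilibrium values, whether those values are estimated from purchasing power parity or from an expenditure-based model.

After all, if these equilibrium exchange rates had no predictive power for observed exchange rates, why would we call them equilibrium values? One could reply that equilibrium exchange rates are only desirable counterfactual values. This post does not pursue those theoretical questions. Instead, it illustrates the consequences of not modeling common factors and how much this can matter for a model's predictions.

The dataset and code used here are available on GitHub. The following Stata commands run a Pooled Mean Group (PMG) estimation.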

. ssc install xtpmg, replace
. lxhuse xrdynamics, clear 
. xtpmg d.logreer d.logfeer, lr(l.logreer logfeer) ec(ec) replace pmg full

Panel Variable (i): cn                          Number of obs      =       728
Time Variable (t): datayear                     Number of groups   =        26
                                                Obs per group: min =        28
                                                               avg =      28.0
                                                               max =        28
                                                Log Likelihood     =   1169.86
------------------------------------------------------------------------------
   D.logreer | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ec           |
     logfeer |      0.643      0.062    10.32   0.000        0.521       0.765
-------------+----------------------------------------------------------------
cn_1         |
          ec |     -0.236      0.126    -1.88   0.060       -0.482       0.010
             |
     logfeer |
         D1. |      0.284      0.153     1.85   0.064       -0.016       0.584
             |
       _cons |      0.421      0.235     1.79   0.073       -0.039       0.881
-------------+----------------------------------------------------------------
cn_2         |
          ec |     -0.119      0.068    -1.75   0.080       -0.253       0.014
             |
     logfeer |
         D1. |     -0.015      0.060    -0.26   0.797       -0.132       0.101
             |
       _cons |      0.192      0.121     1.59   0.112       -0.045       0.429
-------------+----------------------------------------------------------------
cn_3         |
          ec |     -0.198      0.091    -2.18   0.029       -0.376      -0.020
             |
     logfeer |
         D1. |      0.057      0.182     0.31   0.755       -0.300       0.413
             |
       _cons |      0.329      0.171     1.93   0.054       -0.006       0.663
-------------+----------------------------------------------------------------
……

Next, fitted values and residuals are generated, and the fitted values are plotted against the observed changes in the exchange rate:

. xtline d.logreer yhat
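
The step that creates yhat and residuals before this plot is not shown in the post. A minimal sketch, assuming it mirrors the prediction loop given later for the augmented model and that the 26 countries are coded cn = 1, ..., 26:

```stata
* Sketch only: build fitted values and residuals after the first xtpmg run.
* Assumes xtpmg was run with the -full- option, so each country has its own
* equation named cn_1, cn_2, ..., cn_26 (as in the output above).
capture drop yhat
gen yhat = .
forvalues cn = 1/26 {
    predict temp if cn == `cn', eq(cn_`cn')   // fitted values from country `cn''s equation
    replace yhat = temp if cn == `cn'
    drop temp
}
capture drop residuals
gen residuals = d.logreer - yhat               // observed change minus fitted change
```

The variable name residuals matches the one passed to xtcdf in the next step.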

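The predictions (the red line) look as expected. Still, we want to test the residuals for cross-sectional dependence (the sample includes euro-area countries), using the following commands: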

. ssc install xtcdf, replace
. xtcdf residuals

xtcd test on variables residuals
Panelvar: cn
Timevar: datayear
------------------------------------------------------------------------------+
    Variable    |  CD-test   p-value   average joint T | mean ρ   mean abs(ρ) |
----------------+--------------------------------------+----------------------|
   residuals    +  1.256      0.209         28.00      +  0.01       0.22     | 
------------------------------------------------------------------------------+
 Notes: Under the null hypothesis of cross-section independence, CD ~ N(0,1)
        P-values close to zero indicate data are correlated across panel groups.

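The fit looks good, but the p-value of the CD test is only 20.9% (CD = 1.256): the null of cross-sectional independence is not rejected at conventional levels, yet the p-value is arguably on the low side. The dataset is a balanced panel with T = 28 (years).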
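
For reference, the numbers in the table can be tied together with the standard form of the Pesaran CD statistic for a balanced panel. This is a sketch of the textbook formula, not something stated in the post; N = 26 and T = 28 are taken from the output above, and the pairwise residual correlations are denoted by rho-hat.

```latex
CD = \sqrt{\frac{2T}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \hat{\rho}_{ij}
   = \sqrt{\frac{T\,N(N-1)}{2}} \; \bar{\rho},
\qquad
\bar{\rho} = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \hat{\rho}_{ij}.

% With N = 26 and T = 28:
\sqrt{\frac{28 \cdot 26 \cdot 25}{2}} = \sqrt{9100} \approx 95.4,
\qquad
\bar{\rho} \approx \frac{1.256}{95.4} \approx 0.013.
```

The implied mean correlation of about 0.013 is consistent with the "mean ρ" of 0.01 reported in the table.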

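The same relationship between observed and equilibrium exchange rates is now re-estimated with the cross-sectional averages of the dependent and explanatory variables added to both the short-run and the long-run parts of the model. The code is: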

. xtpmg d.logreer d.logfeer d.logreer_cs d.logfeer_cs,  ///
>         lr(l.logreer logfeer l.logreer_cs logfeer_cs) ///
>         ec(ec) replace pmg full


Panel Variable (i): cn                          Number of obs      =       728
Time Variable (t): datayear                     Number of groups   =        26
                                                Obs per group: min =        28
                                                               avg =      28.0
                                                               max =        28
                                                Log Likelihood     =  1253.734
------------------------------------------------------------------------------
   D.logreer | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ec           |
     logfeer |      0.738      0.057    12.85   0.000        0.625       0.850
             |
  logreer_cs |
         L1. |     -0.028      0.161    -0.18   0.860       -0.343       0.286
             |
  logfeer_cs |     -0.187      0.117    -1.60   0.109       -0.416       0.042
-------------+----------------------------------------------------------------
cn_1         |
          ec |     -0.189      0.113    -1.68   0.093       -0.411       0.032
             |
     logfeer |
         D1. |      0.351      0.145     2.41   0.016        0.066       0.636
             |
  logreer_cs |
         D1. |      3.830      1.354     2.83   0.005        1.175       6.484
             |
  logfeer_cs |
         D1. |     -2.945      1.145    -2.57   0.010       -5.190      -0.701
             |
       _cons |      0.430      0.283     1.52   0.129       -0.126       0.986
-------------+----------------------------------------------------------------
cn_2         |
          ec |     -0.112      0.057    -1.98   0.047       -0.224      -0.001
             |
     logfeer |
         D1. |     -0.021      0.057    -0.36   0.717       -0.132       0.091
             |
  logreer_cs |
         D1. |     -0.139      0.126    -1.10   0.271       -0.386       0.108
             |
  logfeer_cs |
         D1. |     -0.136      0.102    -1.33   0.182       -0.336       0.064
             |
       _cons |      0.246      0.144     1.71   0.088       -0.037       0.528
-------------+----------------------------------------------------------------
……
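
The post does not show how logreer_cs and logfeer_cs were built, and they may simply ship with the dataset. A minimal sketch of how such cross-sectional averages can be constructed, assuming the panel is already declared with xtset cn datayear; the names logreer_csa and logfeer_csa are hypothetical and chosen so that nothing in the original data is overwritten:

```stata
* Sketch only: a cross-sectional average is the mean across countries
* within each year. Variable names ending in _csa are illustrative.
bysort datayear: egen double logreer_csa = mean(logreer)
bysort datayear: egen double logfeer_csa = mean(logfeer)

* Restore panel order before using D. and L. operators again.
sort cn datayear
```

Because every country receives the same value in a given year, these averages act as observable proxies for unobserved common factors.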

Next, the following code regenerates the fitted values and plots them against the observed changes in the exchange rate:

. capture drop yhat
. gen yhat = .
. forval cn = 1/26 {
  2.     predict temp if cn ==`cn', eq(cn_`cn')
  3.     replace yhat = temp if cn == `cn'
  4.     drop temp
  5. }
. capture drop residuals_cs
. gen residuals_cs = d.logreer - yhat
. xtline d.logreer yhat

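The model now seems to fit better for some countries, such as Argentina and Colombia.

This is in line with the paper Identifying Exchange Rate Common Factors, which argues that certain common factors in exchange rates have become increasingly important as financial integration has deepened.

The residuals are therefore tested for cross-sectional dependence once more: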

. xtcdf residuals_cs

xtcd test on variables residuals_cs
Panelvar: cn
Timevar: datayear
------------------------------------------------------------------------------+
    Variable    |  CD-test   p-value   average joint T | mean ρ   mean abs(ρ) |
----------------+--------------------------------------+----------------------|
  residuals_cs  +  .162       0.871         28.00      +  0.00       0.21     | 
------------------------------------------------------------------------------+
 Notes: Under the null hypothesis of cross-section independence, CD ~ N(0,1)
        P-values close to zero indicate data are correlated across panel groups.

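The p-value of the CD test rises from 20.9% to 87.1%. In this example, the problem caused by cross-sectional dependence in the residuals has therefore been resolved.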
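
As a quick check on the two tables, the p-values can be recovered from the CD statistics themselves, since CD is standard normal under the null. A small sketch using only the reported statistics (1.256 and 0.162) and Stata's built-in normal() function:

```stata
* Two-sided p-values implied by CD ~ N(0,1) under cross-sectional independence
display %5.3f 2*(1 - normal(abs(1.256)))   // model without cross-sectional averages
display %5.3f 2*(1 - normal(abs(0.162)))   // model with cross-sectional averages
```

The two lines should display 0.209 and 0.871, matching the p-value column of the xtcdf output.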

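After a stimulating discussion with Rodolphe Desbordes and Jan Ditzen, I also wanted to add a few lags of the cross-sectional averages and check how this affects cross-sectional dependence. Rodolphe stressed, however, that doing so may affect the final estimates.

Recently I had the chance to meet Jan again at the office, and we discussed his Stata Journal article on this issue:

Ditzen, J. (2018). Estimating dynamic common-correlated effects in Stata. The Stata Journal, 18(3), 585-617.

We run the code below to estimate the exponent of cross-sectional dependence (alpha) for each variable: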

. xtcse2 logreer logfeer

Panel Variable (i): cn
Time Variable (t): datayear
Estimation of Cross-Sectional Exponent (alpha)
----------------------------------------------------------------
       variable|     alpha   Std. Err.    [95% Conf. Interval]
---------------+------------------------------------------------
        logreer|  .4638829   .0566748     .3528023    .5749635
        logfeer|  .6589229   .0404036     .5797333    .7381124
----------------------------------------------------------------
0.5 <= alpha < 1 implies strong cross-sectional dependence.

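The variables therefore appear to be subject to some cross-sectional dependence: the estimated alpha for logfeer (0.66) lies in the 0.5 to 1 range, while that for logreer (0.46) is just below 0.5, with a confidence interval that includes 0.5.

The PMG model is now estimated with the xtdcce2 command.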

. xtdcce2 d.logreer d.logfeer, lr(L.logreer logfeer) ///
>         p(L.logreer logfeer) cross(_all) cr_lags(0)  exponent

Panel Variable (i): cn                           Number of obs     =        728
Time Variable (t): datayear                      Number of groups  =         26
Degrees of freedom per group:                    Obs per group:    
 without cross-sectional avg. min   = 24                       min =         28
                              max   = 24                       avg =         28
 with cross-sectional avg.    min   = 22                       max =         28
                              max   = 22
Number of                                        F(106, 622)       =       4.72
 cross-sectional lags               0 to 0       Prob > F          =       0.00
 variables in mean group regression = 28         R-squared         =       0.55
 variables partialled out           = 78         Adj. R-squared    =       0.48
                                                 Root MSE          =       0.07
-------------------------------------------------------------------------------
      D.logreer|     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+---------------------------------------------------------------
 Short Run Est.|
---------------+---------------------------------------------------------------
   Mean Group: |
      D.logfeer|  .1980608   .0541099    3.66    0.000      .0920074   .3041142
---------------+---------------------------------------------------------------
 Adjust. Term  |
---------------+---------------------------------------------------------------
   Pooled:     |
      L.logreer|  -.318621   .0932028   -3.42    0.001     -.5012951  -.1359469
---------------+---------------------------------------------------------------
 Long Run Est. |
---------------+---------------------------------------------------------------
   Pooled:     |
        logfeer|   .579443   .2636656    2.20    0.028      .0626679   1.096218
-------------------------------------------------------------------------------

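Because the two commands use slightly different estimation algorithms, the results differ somewhat across models. See Ditzen (2021) for more details and examples.

In this regression, 78 variables are partialled out: 26 constants plus the cross-sectional averages logreer_cs and logfeer_cs at time t with country-specific coefficients (26×2×1 = 52).

The mean group regression contains 28 variables: 26 short-run coefficients on D.logfeer and the 2 pooled coefficients on L.logreer and logfeer. The p-value of the CD test is 75%.

For more on partialling out variables, see Testing for slope heterogeneity in Stata and Simple Alternatives to the Common Correlated Effects Model.

Now I add two lags of the cross-sectional averages: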

. xtdcce2 d.logreer d.logfeer, lr(L.logreer logfeer) ///
>         p(L.logreer logfeer) cross(_all) cr_lags(2) exponent

Panel Variable (i): cn                           Number of obs     =        676
Time Variable (t): datayear                      Number of groups  =         26
Degrees of freedom per group:                    Obs per group:    
 without cross-sectional avg. min   = 22                       min =         26
                              max   = 22                       avg =         26
 with cross-sectional avg.    min   = 16                       max =         26
                              max   = 16
Number of                                        F(210, 466)       =       1.70
 cross-sectional lags               2 to 2       Prob > F          =       0.00
 variables in mean group regression = 28         R-squared         =       0.57
 variables partialled out           = 182        Adj. R-squared    =       0.37
                                                 Root MSE          =       0.07
-------------------------------------------------------------------------------
      D.logreer|     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+---------------------------------------------------------------
 Short Run Est.|
---------------+---------------------------------------------------------------
   Mean Group: |
      D.logfeer|  .2203299   .0586742    3.76    0.000      .1053305   .3353293
---------------+---------------------------------------------------------------
 Adjust. Term  |
---------------+---------------------------------------------------------------
   Pooled:     |
      L.logreer| -.2919711   .0964745   -3.03    0.002     -.4810577  -.1028846
---------------+---------------------------------------------------------------
 Long Run Est. |
---------------+---------------------------------------------------------------
   Pooled:     |
        logfeer|  .6604173   .2884739    2.29    0.022      .0950189   1.225816
-------------------------------------------------------------------------------

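Here, 182 variables are partialled out: 26 constants plus the cross-sectional averages logreer_cs and logfeer_cs at times t, t-1, and t-2 with country-specific coefficients (26×2×3 = 156). The mean group regression again contains 28 variables: 26 short-run coefficients on D.logfeer and the 2 pooled coefficients on L.logreer and logfeer. The p-value of the CD test is 48.4%.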
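
The header counts in the two xtdcce2 outputs follow from simple bookkeeping. A small sketch that reproduces them from numbers in the output alone; the local macro names are illustrative:

```stata
* Sketch only: reproduce the counts in the xtdcce2 headers.
local N   = 26      // number of groups (countries)
local T   = 28      // observations per group before any lags are lost
local Kcs = 2       // variables entering as cross-sectional averages
foreach p in 0 2 {                              // cr_lags(0) and cr_lags(2)
    local partial = `N' + `N'*`Kcs'*(`p' + 1)   // constants + averages at t, ..., t-p
    local obs     = `N'*(`T' - `p')             // each lag costs one period per group
    display "cr_lags(`p'): partialled out = `partial', observations = `obs'"
}
```

This gives 78 partialled-out variables and 728 observations for cr_lags(0), and 182 and 676 for cr_lags(2). Adding the 28 mean group variables gives the numerator degrees of freedom of the two F statistics (106 and 210).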

After this estimation, the strength of the remaining cross-sectional dependence is tested:

. xtcd2, pesaran cdw contour(abs) reps(50)

Testing for weak cross-sectional dependence (CSD)
   H0: weak cross-section dependence
   H1: strong cross-section dependence
--------------------------------------------
               |    CD            CDw
---------------+----------------------------
residuals      |    -0.70       -0.47
               |  (0.484)     (0.636)
--------------------------------------------
p-values in parenthesis.
References
  CD:        Pesaran (2015, 2021)
  CDw:       Juodis, Reese (2021)

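Neither the CD test (p = 0.484) nor the CDw test (p = 0.636) rejects the null of weak cross-sectional dependence. These conclusions are qualitatively similar to those obtained with fewer cross-sectional averages. For more on common factors in empirical work, see Panel Data Econometrics: Common Factor Analysis for Empirical Researchers.

3. About Jamel Saadaoui

Jamel Saadaoui, the author of the original post, is a social science researcher working on several topics in international economics. He occasionally writes from a broad perspective on probability theory, philosophy, economics, and other subjects, and has recently become particularly interested in the interplay between geopolitics and economic development.

Since 2024, Jamel has been a full professor of economics at University Paris 8, where he conducts research at the LED laboratory and the Institute of European Studies. He was also elected a member of the National Council of Universities for 2016 to 2024.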
