Econometrics packages in R
Base R ships with a lot of functionality useful for computational econometrics, in particular in the stats package. This functionality is complemented by many packages on CRAN; a brief overview is given below.
There is also considerable overlap between the tools for econometrics in this view and those in the task views on
Finance
,
SocialSciences
, and
TimeSeries
. Furthermore, the
Finance SIG
is a suitable mailing list for obtaining help and discussing questions about both computational finance and econometrics.
The packages in this view can be roughly structured into the following topics. If you think that some package is missing from the list, please contact the maintainer.
Basic linear regression
(
基础的线性回归
)
·
Estimation and standard inference
(
估计和标准推断
)
: Ordinary least squares (OLS) estimation for linear models is provided by
lm()
(from stats) and standard tests for model comparisons are available in various methods such as
summary()
and
anova()
.
·
Further inference and nested model comparisons
(
进一步推断和嵌套模型比较
)
: Functions analogous to the basic
summary()
and
anova()
methods that also support asymptotic tests (
z
instead of
t
tests, and Chi-squared instead of
F
tests) and plug-in of other covariance matrices are
coeftest()
and
waldtest()
in
lmtest
. Tests of more general linear hypotheses are implemented in
linearHypothesis()
and for nonlinear hypotheses in
deltaMethod()
in
car
.
·
Robust standard errors
(
稳健标准误
)
: HC and HAC covariance matrices are available in
sandwich
and can be plugged into the inference functions mentioned above.
·
Nonnested model comparisons
(
非嵌套模型比较
)
: Various tests for comparing non-nested linear models are available in
lmtest
(encompassing test, J test, Cox test). The Vuong test for comparing other non-nested models is provided by
nonnest2
(and specifically for count data regression in
pscl
).
·
Diagnostic checking
: The packages
car
and
lmtest
provide a large collection of regression diagnostics and diagnostic tests.
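As a minimal sketch of the estimation and nested model comparison workflow described in this section, the following uses only functions from stats. The built-in mtcars data set and the chosen regressors are assumptions made purely for illustration.

```r
# Illustrative data: the built-in mtcars data set (an assumption for this sketch)
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Coefficient table with the usual t tests
print(summary(fit_large))

# F test comparing the two nested models
cmp <- anova(fit_small, fit_large)
print(cmp)
```

The second row of the anova table reports the F statistic for the joint significance of the two added regressors.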
Microeconometrics
(
微观计量经济学
)
·
Generalized linear models (GLMs)
(
广义线性模型
)
: Many standard microeconometric models belong to the family of generalized linear models and can be fitted by
glm()
from package stats. This includes in particular logit and probit models for modeling choice data and Poisson models for count data. Effects for typical values of regressors in these models can be obtained and visualized using
effects
. Marginal effects tables for certain GLMs can be obtained using the
mfx
and
margins
packages. Interactive visualizations of both effects and marginal effects are possible in
LinRegInteractive
.
·
Binary responses
(
二值响应
)
: The standard logit and probit models (among many others) for binary responses are GLMs that can be estimated by
glm()
with
family = binomial
. Bias-reduced GLMs that are robust to complete and quasi-complete separation are provided by
brglm
. Discrete choice models estimated by simulated maximum likelihood are implemented in
Rchoice
. Heteroscedastic probit models (and other heteroscedastic GLMs) are implemented in
glmx
along with parametric link functions and goodness-of-link tests for GLMs.
·
Count responses
(
数值响应
)
: The basic Poisson regression is a GLM that can be estimated by
glm()
with
family = poisson
as explained above. Negative binomial GLMs are available via
glm.nb()
in package
MASS
. Another implementation of negative binomial models is provided by
aod
, which also contains other models for overdispersed data. Zero-inflated and hurdle count models are provided in package
pscl
. A reimplementation by the same authors is currently under development in
countreg
on R-Forge, which also encompasses separate functions for zero-truncated regression, finite mixture models, etc.
·
Multinomial responses
(
多值响应
)
: Multinomial models with individual-specific covariates only are available in
multinom()
from package
nnet
. Implementations with both individual- and choice-specific variables are
mlogit
and
mnlogit
. Generalized multinomial logit models (e.g., with random effects) are in
gmnl
. Generalized additive models (GAMs) for multinomial responses can be fitted with the
VGAM
package. A Bayesian approach to multinomial probit models is provided by
MNP
. Various Bayesian multinomial models (including logit and probit) are available in
bayesm
. Furthermore, the package
RSGHB
fits various hierarchical Bayesian specifications based on direct specification of the likelihood function.
·
Ordered responses
(
排序响应
)
: Proportional-odds regression for ordered responses is implemented in
polr()
from package
MASS
. The package
ordinal
provides cumulative link models for ordered data, which encompass proportional odds models but also include more general specifications. Bayesian ordered probit models are provided by
bayesm
.
·
Censored responses
(
删失响应
)
: Basic censored regression models (e.g., tobit models) can be fitted by
survreg()
in
survival
; a convenience interface
tobit()
is in package
AER
. Further censored regression models, including models for panel data, are provided in
censReg
. Interval regression models are in
intReg
. Censored regression models with conditional heteroscedasticity are in
crch
. Furthermore, hurdle models for left-censored data at zero can be estimated with
mhurdle
. Models for sample selection are available in
sampleSelection
and semiparametric extensions of these are provided by
SemiParSampleSel
. Package
matchingMarkets
corrects for selection bias when the sample is the result of a stable matching process (e.g., a group formation or college admissions problem).
·
Truncated responses
(
截断响应
)
:
crch
for truncated (and potentially heteroscedastic) Gaussian, logistic, and t responses. Homoscedastic Gaussian responses are also available in
truncreg
.
·
Fraction and proportion responses
: Fractional response models are in
frm
. Beta regression for responses in (0, 1) is in
betareg
and
gamlss
.
·
Miscellaneous
(
其他
)
: Further, more refined tools for microeconometrics are provided in the
micEcon
family of packages: Analysis with Cobb-Douglas, translog, and quadratic functions is in
micEcon
; the constant elasticity of substitution (CES) function is in
micEconCES
; the symmetric normalized quadratic profit (SNQP) function is in
micEconSNQP
. The almost ideal demand system (AIDS) is in
micEconAids
. Stochastic frontier analysis (SFA) is in
frontier
and certain special cases also in
sfa
. Semiparametric SFA is available in
semsfa
and spatial SFA in
spfrontier
and
ssfa
. The package
bayesm
implements a Bayesian approach to microeconometrics and marketing. Estimation and marginal effect computations for multivariate probit models can be carried out with
mvProbit
. Inference for relative distributions is contained in package
reldist
.
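As a minimal sketch of the GLM-based models discussed in this section (logit and probit for binary responses, Poisson for counts), the following uses only glm() from stats. The built-in data sets mtcars and InsectSprays and the model formulas are assumptions chosen for illustration, not part of the overview itself.

```r
# Binary response: logit and probit via glm() with family = binomial
fit_logit  <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
fit_probit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))
print(coef(summary(fit_logit)))
print(coef(summary(fit_probit)))

# Count response: Poisson regression via glm() with family = poisson
fit_pois <- glm(count ~ spray, data = InsectSprays, family = poisson)
print(coef(summary(fit_pois)))

# Rough overdispersion check: Pearson chi-squared over residual degrees of freedom.
# Values well above 1 suggest considering, e.g., a negative binomial model.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
print(dispersion)
```

The logit and probit coefficients are on different scales, so they should be compared through fitted probabilities or marginal effects rather than directly.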
Instrumental variables
(
工具变量
)
·
Basic instrumental variables (IV) regression
(
基础工具变量回归
)
: Two-stage least squares (2SLS) is provided by
ivreg()
in
AER
. Other implementations are in
tsls()
in package
sem
, in
ivpack
, and
lfe
(with particular focus on multiple group fixed effects).
·
Binary responses
(
二值响应
)
: An IV probit model via GLS estimation is available in
ivprobit
. The
LARF
package estimates local average response functions for binary treatments and binary instruments.
·
Panel data
(
面板数据
)
: Certain basic IV models for panel data can also be estimated with standard 2SLS functions (see above). Dedicated IV panel data models are provided by
ivfixed
(fixed effects) and
ivpanel
(between and random effects).
·
Miscellaneous
(
其他
)
:
REndo
fits linear models with endogenous regressors using various latent instrumental variable approaches.
ivbma
estimates Bayesian IV models with conditional Bayes factors.
ivlewbel
implements the Lewbel approach based on GMM estimation of triangular systems using heteroscedasticity-based IVs.
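To illustrate what two-stage least squares computes, the following sketch carries out both stages by hand with lm() on simulated data. The data-generating process and all variable names are hypothetical. The manual second stage reproduces the 2SLS point estimate but not valid standard errors, which is one reason to prefer a dedicated function such as ivreg() in practice.

```r
set.seed(1)
n <- 5000
z <- rnorm(n)                       # instrument
u <- rnorm(n)                       # structural error
x <- 0.8 * z + 0.5 * u + rnorm(n)   # endogenous regressor (correlated with u)
y <- 1 + 2 * x + u                  # true slope is 2

# OLS is biased upwards here because x and u are positively correlated
b_ols <- coef(lm(y ~ x))[["x"]]

# Stage 1: project the endogenous regressor on the instrument
x_hat <- fitted(lm(x ~ z))

# Stage 2: regress the outcome on the first-stage fitted values
# (point estimate only; the reported standard errors of this fit are not valid)
b_2sls <- coef(lm(y ~ x_hat))[["x_hat"]]

print(c(OLS = b_ols, TSLS = b_2sls))
```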
Panel data models
(
面板数据模型
)
·
Panel-corrected standard errors
(
面板修正的标准误
)
: A simple approach for panel data is to fit the pooling (or independence) model (e.g., via
lm()
or
glm()
) and only correct the standard errors. Different types of panel-corrected standard errors are available in
multiwayvcov
,
clusterSEs
,
pcse
,
clubSandwich
,
plm
, and
geepack
. The latter two require estimation of the pooling/independence models via
plm()
and
geeglm()
from the respective packages (which also provide other types of models, see below).
·
Linear panel models
(
线性面板模型
)
:
plm
provides a wide range of within, between, and random-effect methods (among others) along with corrected standard errors, tests, etc. Another implementation of several of these models is in
Paneldata
. Various dynamic panel models are available in
plm
and dynamic panel models with fixed effects in
OrthoPanels
.
·
Generalized estimating equations and GLMs
(
广义估计方程和广义线性模型
)
: GEE models for panel data (or longitudinal data in statistical jargon) are in
geepack
. The
pglm
package provides estimation of GLM-like models for panel data.
·
Mixed effects models
(
混合效应模型
)
: Linear and nonlinear models for panel data (and more general multi-level data) are available in
lme4
and
nlme
.
·
Instrumental variables
(
工具变量
)
:
ivfixed
and
ivpanel
, see also above.
·
Heterogeneous time trends
(
差异时间趋势
)
:
phtt
offers the possibility of analyzing panel data with large dimensions n and T and can be considered when the unobserved heterogeneity effects are time-varying.
·
Miscellaneous
(
其他
)
: Multiple group fixed effects are in
lfe
. Autocorrelation and heteroscedasticity corrections are available in
wahc
and
panelAR
. PANIC tests of nonstationarity are in
PANICr
. Threshold regression and unit root tests are in
pdR
. The panel data approach method for program evaluation is available in
pampe
.
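The within (fixed effects) estimator mentioned in this section can be sketched in base R by demeaning each variable within individuals. The example below uses simulated data with hypothetical names and checks that the result matches the least-squares dummy variable fit. It is an illustration of the idea, not a replacement for the dedicated panel packages listed above.

```r
set.seed(42)
n_id <- 200                                # number of individuals
n_t  <- 6                                  # time periods per individual
id    <- rep(seq_len(n_id), each = n_t)
alpha <- rnorm(n_id)[id]                   # individual effect, correlated with x
x     <- alpha + rnorm(n_id * n_t)
y     <- 1.5 * x + alpha + rnorm(n_id * n_t)   # true slope is 1.5

# Pooling model: ignores the individual effects and is biased here
b_pool <- coef(lm(y ~ x))[["x"]]

# Within transformation: subtract individual-specific means
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
b_within <- coef(lm(y_w ~ x_w - 1))[["x_w"]]

# Equivalent least-squares dummy variable (LSDV) regression
b_lsdv <- coef(lm(y ~ x + factor(id)))[["x"]]

print(c(pooling = b_pool, within = b_within, LSDV = b_lsdv))
```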
Further regression models
(
进一步回归模型
)
·
Nonlinear least squares modeling
(
非线性最小二乘模型
)
:
nls()
in package stats.
·
Quantile regression
(
分位数回归
)
:
quantreg
(including linear, nonlinear, censored, locally polynomial and additive quantile regressions).
·
Generalized method of moments (GMM) and generalized empirical likelihood (GEL)
(
广义矩估计方法和广义经验似然估计
)
:
gmm
.
·
Spatial econometric models
(
空间计量模型
)
: The
Spatial
view gives details about handling spatial data, along with information about (regression) modeling. In particular, spatial regression models can be fitted using
spdep
and
sphet
(the latter using a GMM approach).
splm
is a package for spatial panel models. Spatial probit models are available in
spatialprobit
.
·
Bayesian model averaging (BMA)
(
贝叶斯模型平均
)
: A comprehensive toolbox for BMA is provided by
BMS
including flexible prior selection, sampling, etc. A different implementation is in
BMA
for linear models, generalized linear models and survival models (Cox regression).
·
Linear structural equation models
(
线性解构方程模型
)
:
lavaan
and
sem
. See also the
Psychometrics
task view for more details.
·
Simultaneous equation estimation
(
联立方程估计
)
:
systemfit
.
·
Nonparametric kernel methods
(
非参数核方法
)
:
np
.
·
Linear and nonlinear mixed-effect models
(
线性核非线性混合效应模型
)
:
nlme
and
lme4
.
·
Generalized additive models (GAMs)
(
广义加性模型
)
:
mgcv
,
gam
,
gamlss
and
VGAM
.
·
Extreme bounds analysis
(
极值边界分析
)
:
ExtremeBounds
.
·
Miscellaneous
(
其他
)
: The packages
VGAM
,
rms
and
Hmisc
provide several tools for extended handling of (generalized) linear regression models.
Zelig
is a unified easy-to-use interface to a wide range of regression models.
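As a small illustration of nonlinear least squares with nls() from stats, the following sketch fits an exponential decay curve to simulated data. The functional form, parameter values, and starting values are assumptions chosen for the example.

```r
set.seed(123)
x <- seq(0, 10, length.out = 200)
y <- 5 * exp(-0.3 * x) + rnorm(length(x), sd = 0.1)   # true a = 5, b = 0.3

# nls() needs a model formula with named parameters and starting values
fit_nls <- nls(y ~ a * exp(-b * x), start = list(a = 4, b = 0.2))

print(coef(fit_nls))
print(summary(fit_nls))
```

Convergence of nls() depends on the starting values, so in applied work it is worth trying several plausible choices.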
Time series data and models
(
时间序列数据和模型
)
·
The
TimeSeries
task view provides much more detailed information about both basic time series infrastructure and time series models. Here, only the most important aspects relating to econometrics are briefly mentioned. Time series models for financial econometrics (e.g., GARCH, stochastic volatility models, or stochastic differential equations) are described in the
Finance
task view.
·
Infrastructure for regularly spaced time series
(规则间隔时间序列的基础设施
)
: The class
"ts"
in package stats is R's standard class for regularly spaced time series (especially annual, quarterly, and monthly data). It can be coerced back and forth without loss of information to
"zooreg"
from package
zoo
.
·
Infrastructure for irregularly spaced time series
(不规则间隔时间序列的基础设施
)
:
zoo
provides infrastructure for both regularly and irregularly spaced time series (the latter via the class
"zoo"
) where the time information can be of arbitrary class. This includes daily series (typically with
"Date"
time index) or intra-day series (e.g., with
"POSIXct"
time index). An extension based on
zoo
geared towards time series with different kinds of time index is
xts
. Further packages aimed particularly at finance applications are discussed in the
Finance
task view.
·
Classical time series models
(
经典时间序列模型
)
: Simple autoregressive models can be estimated with
ar()
and ARIMA modeling and Box-Jenkins-type analysis can be carried out with
arima()
(both in the stats package). An enhanced version of
arima()
is in
forecast
.
·
Linear regression models
(
线性回归模型
)
: A convenience interface to
lm()
for estimating OLS and 2SLS models based on time series data is
dynlm
. Estimation of linear regression models with AR error terms via GLS is possible using
gls()
from
nlme
.
·
Structural time series models
(
结构时间序列模型
)
: Standard models can be fitted with
StructTS()
in stats. Further packages are discussed in the
TimeSeries
task view.
·
Filtering and decomposition
(
筛选和分解
)
:
decompose()
and
HoltWinters()
in stats. The basic function for computing filters (both rolling and autoregressive) is
filter()
in stats. Many extensions to these methods, in particular for forecasting and model selection, are provided in the
forecast
package.
·
Vector autoregression
(
向量自回归
)
: Simple models can be fitted by
ar()
in stats; more elaborate models are provided in package
vars
along with suitable diagnostics, visualizations etc. A Bayesian approach is available in
MSBVAR
.
·
Unit root and cointegration tests
(
单位根和协整检验
)
:
urca
,
tseries
,
CADFtest
. See also
pco
for panel cointegration tests.
·
Miscellaneous
(
其他
)
:
o
tsDyn
- Threshold and smooth transition models.
o
midasr
-
MIDAS regression
and other econometric methods for mixed frequency time series data analysis.
o
gets
- GEneral-To-Specific (GETS) model selection for either ARX models with log-ARCH-X errors, or a log-ARCH-X model of the log variance.
o
tsfa
- Time series factor analysis.
o
dlsem
- Distributed-lag linear structural equation models.
o
apt
- Asymmetric price transmission models.
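The classical time series tools in stats mentioned in this section can be sketched as follows. The example simulates an AR(1) series (the coefficient 0.6 and the series length are assumptions for illustration) and fits it with both ar() and arima().

```r
set.seed(2024)
# arima.sim() returns an object of class "ts"
y <- arima.sim(model = list(ar = 0.6), n = 500)
print(class(y))

# Autoregressive fit with the order fixed at 1 (Yule-Walker by default)
fit_ar <- ar(y, order.max = 1, aic = FALSE)

# The same model as an ARIMA(1, 0, 0), estimated by maximum likelihood
fit_arima <- arima(y, order = c(1, 0, 0))

print(fit_ar$ar)
print(coef(fit_arima))

# Forecasts for the next four periods
fc <- predict(fit_arima, n.ahead = 4)
print(fc$pred)
```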
Data sets
(
数据集
)
·
Textbooks and journals
(
教科书和期刊
)
: Packages
AER
,
Ecdat
, and
wooldridge
contain comprehensive collections of data sets from various standard econometric textbooks as well as several data sets from the Journal of Applied Econometrics and the Journal of Business & Economic Statistics data archives.
AER
and
wooldridge
additionally provide extensive sets of examples reproducing analyses from the textbooks/papers, illustrating various econometric methods.
·
Canadian monetary aggregates
(
加拿大货币总计
)
:
CDNmoney
.
·
Penn World Table
(佩恩表)
:
pwt
provides versions 5.6, 6.x, 7.x. Version 8.x and 9.x data are available in
pwt8
and
pwt9
, respectively.
·
Time series and forecasting data
(
时间序列和预测数据
)
: The packages
expsmooth
,
fma
, and
Mcomp
are data packages with time series data from the books 'Forecasting with Exponential Smoothing: The State Space Approach' (Hyndman, Koehler, Ord, Snyder, 2008, Springer) and 'Forecasting: Methods and Applications' (Makridakis, Wheelwright, Hyndman, 3rd ed., 1998, Wiley) and the M-competitions, respectively.
·
Empirical Research in Economics
(
经济学实证研究
)
: Package
erer
contains functions and datasets for the book 'Empirical Research in Economics: Growing up with R' (Sun, forthcoming).
·
Panel Study of Income Dynamics (PSID)
(
收入动态追踪面板数据
)
:
psidR
can build panel data sets from the Panel Study of Income Dynamics (PSID).
·
US state- and county-level panel data
(美国州和县级面板数据)
:
rUnemploymentData
.
·
World Bank data and statistics
(世界银行数据和统计)
: The
wbstats
package provides programmatic access to the World Bank API.
Miscellaneous
(
其他
)
·
Matrix manipulations
(
矩阵操作
)
: As a vector- and matrix-based language, base R ships with many powerful tools for doing matrix manipulations, which are complemented by the packages
Matrix
and
SparseM
.
·
Optimization and mathematical programming
(
优化和数学编程
)
: R and many of its contributed packages provide many specialized functions for solving particular optimization problems, e.g., in regression as discussed above. Further functionality for solving more general optimization problems, e.g., likelihood maximization, is discussed in the
Optimization
task view.
·
Bootstrap
(
自助法
)
: In addition to the recommended
boot
package, there are some other general bootstrapping techniques available in
bootstrap
or
simpleboot
as well as some bootstrap techniques designed for time-series data, such as the maximum entropy bootstrap in
meboot
or the
tsbootstrap()
from
tseries
.
·
Inequality
(
不平等
)
: For measuring inequality, concentration and poverty the package
ineq
provides some basic tools such as Lorenz curves, Pen's parade, the Gini coefficient and many more.
·
Structural change
(
结构突变
)
: R is particularly strong when dealing with structural changes and changepoints in parametric models, see
strucchange
and
segmented
.
·
Exchange rate regimes
(
汇率制度
)
: Methods for inference about exchange rate regimes, in particular in a structural change setting, are provided by
fxregime
.
·
Global value chains
(
全球价值链
)
: Tools and decompositions for global value chains are in
gvc
and
decompr
.
·
Regression discontinuity design
(
断点回归设计
)
: A variety of methods are provided in the
rdd
,
rddtools
,
rdrobust
, and
rdlocrand
packages.
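To illustrate the matrix manipulation tools in base R mentioned in this section, the following sketch computes the OLS estimator and its conventional covariance matrix directly from the normal equations and compares them with lm(). The mtcars data and the chosen regressors are assumptions made for the example. lm() itself uses a QR decomposition, which is numerically preferable to this textbook formula.

```r
# Design matrix with an intercept column (illustrative regressors from mtcars)
X <- cbind(Intercept = 1, wt = mtcars$wt, hp = mtcars$hp)
y <- mtcars$mpg

# OLS coefficients: solve (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Residual variance and conventional covariance matrix sigma^2 (X'X)^{-1}
res      <- y - X %*% beta_hat
sigma2   <- sum(res^2) / (nrow(X) - ncol(X))
vcov_hat <- sigma2 * solve(crossprod(X))

# Reference fit for comparison
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(cbind(matrix_algebra = as.vector(beta_hat), lm = coef(fit)))
```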
CRAN packages:
·
AER
(core)
·
aod
·
apt
·
bayesm
·
betareg
·
BMA
·
BMS
·
boot
·
bootstrap
·
brglm
·
CADFtest
·
car
(core)
·
CDNmoney
·
censReg
·
clubSandwich
·
clusterSEs
·
crch
·
decompr
·
dlsem
·
dynlm
·
Ecdat
·
effects
·
erer
·
expsmooth
·
ExtremeBounds
·
fma
·
forecast
(core)
·
frm
·
frontier