ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Research Articles

A robust homogeneity pursuit algorithm for varying coefficient models with longitudinal data

Cite this:
https://doi.org/10.52396/JUST-2021-0054
  • Received Date: 22 February 2021
  • Revision Received Date: 20 March 2021
  • Publish Date: 31 December 2021
  • This article explores the homogeneity of coefficient functions in varying coefficient models, where, for each covariate, individuals can be classified into subgroups such that the varying coefficients are homogeneous within the same subgroup. With repeated measurements, we use B-spline function approximations and a change point detection algorithm to identify the homogeneity. To account for potential outliers or heavy-tailedness of the observed distribution, we propose estimating the coefficient functions under the framework of M-estimation, using the least absolute deviation (LAD) loss as an example. Numerical results show that our estimators outperform the commonly used least squares (LS) estimators in the presence of outliers and heavy-tailed observed distributions.
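The contrast between LS and LAD can be seen in the simplest special case, where each individual's coefficient is a constant rather than a function: the LS estimate is then the sample mean and the LAD estimate is the sample median. The sketch below is only a toy illustration under that assumption, not the algorithm of the article. It uses no B-splines, and a plain gap threshold on the sorted estimates stands in for the change point detection step. The data, the function names (`ls_estimate`, `lad_estimate`, `group_by_gaps`), and the threshold value are all hypothetical.

```python
from statistics import mean, median


def ls_estimate(ys):
    """LS estimate of a constant coefficient: the sample mean."""
    return mean(ys)


def lad_estimate(ys):
    """LAD estimate of a constant coefficient: the sample median."""
    return median(ys)


def group_by_gaps(estimates, threshold):
    """Sort the estimates and open a new subgroup wherever the gap between
    consecutive sorted values exceeds `threshold`. This is a crude stand-in
    for a change point detection rule (an assumption of this sketch)."""
    order = sorted(range(len(estimates)), key=lambda i: estimates[i])
    labels = [0] * len(estimates)
    group = 0
    for prev, cur in zip(order, order[1:]):
        if estimates[cur] - estimates[prev] > threshold:
            group += 1
        labels[cur] = group
    return labels


# Hypothetical repeated measurements: individuals 0-2 share the coefficient 1.0,
# individuals 3-5 share the coefficient 3.0; two series contain a gross outlier.
data = [
    [0.9, 1.0, 1.1, 1.0, 0.95],
    [1.05, 0.9, 1.0, 1.1, 25.0],   # one gross outlier
    [1.0, 1.1, 0.9, 1.0, 1.05],
    [3.0, 2.9, 3.1, 3.0, 2.95],
    [3.1, 3.0, -20.0, 2.9, 3.0],   # one gross outlier
    [2.9, 3.0, 3.1, 3.05, 3.0],
]

ls = [ls_estimate(ys) for ys in data]
lad = [lad_estimate(ys) for ys in data]

ls_labels = group_by_gaps(ls, threshold=1.0)
lad_labels = group_by_gaps(lad, threshold=1.0)

print("LAD subgroups:", lad_labels)  # [0, 0, 0, 1, 1, 1]
print("LS subgroups: ", ls_labels)   # outliers split the individuals into 4 groups
```

In this toy data the median-based estimates recover the two true subgroups, while the mean-based estimates are pulled away by the two outliers and produce spurious extra subgroups.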
