ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

A multi-view based semi-supervised classifier with co-regularization for imbalanced data

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2020.05.007
  • Received Date: 14 April 2019
  • Accepted Date: 17 May 2019
  • Rev Recd Date: 17 May 2019
  • Publish Date: 31 May 2020
  • A method for constructing a multi-view semi-supervised learning classifier is presented for manifold learning and multi-puncture processing. Multi-view, semi-supervised learning of the data is achieved through recursive optimization combined with appropriate labeling and equalization (balancing) processing, repeated until the learning efficiency becomes stable. Properties of this multi-view classifier are given, including an upper bound on the generalization error, which indicates good generalization capacity. Simulation and empirical analysis show that the new method performs well with small samples.