ISSN 0253-2778

CN 34-1054/N

Open Access · JUSTC Original Paper

Simulated annealing based semi-supervised support vector machine for credit prediction

Cite this: https://doi.org/10.3969/j.issn.0253-2778.2018.06.003
  • Received Date: 09 September 2017
  • Revised Date: 10 April 2018
  • Accepted Date: 10 April 2018
  • Published Date: 30 June 2018
Abstract

In the mid-1990s, financial institutions began to combine consumer and business information to create credit scores for businesses. Enterprises in China, especially small and micro enterprises, have little recorded credit information, so only a small number of enterprises carry credit labels while the vast majority carry none. Semi-supervised support vector machines (S3VM) can learn from both labeled and unlabeled data, alleviating the problems of imbalanced credit-data categories and insufficient sample information. The parameters of an S3VM strongly influence its performance, yet in practice they are often chosen by experience. This paper proposes SAS3VM, which uses simulated annealing to optimize the parameters of the deterministic annealing based semi-supervised support vector machine (DAS3VM). Starting from a small amount of labeled credit data, the algorithm exploits unlabeled credit data to assist learning and applies simulated annealing to find optimal parameters. Experiments were conducted on two enterprise credit datasets and three personal credit datasets. The results show that the semi-supervised methods (DAS3VM and SAS3VM) outperform supervised learning, and the maximum accuracy of SAS3VM is 13.108% higher than that of DAS3VM.
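The core of SAS3VM, as described above, is an outer simulated-annealing search wrapped around a DAS3VM base learner. As a rough, self-contained illustration of that outer loop only, the Python sketch below anneals two hypothetical S3VM regularization weights (C_l for labeled points, C_u for unlabeled points) against a stand-in objective. Note that toy_objective, the parameter names, and the cooling constants are illustrative assumptions, not the paper's actual formulation; in a real run the objective would be the validation accuracy of a DAS3VM trained with the candidate parameters.

    import math
    import random

    def toy_objective(params):
        # Stand-in for the real objective: in SAS3VM this would be the
        # validation accuracy of a DAS3VM trained with hyperparameters
        # `params`. A smooth synthetic function is used here so that the
        # sketch runs end to end without any SVM code.
        c_l, c_u = params
        return -((math.log10(c_l) - 0.5) ** 2 + (math.log10(c_u) + 1.0) ** 2)

    def neighbor(params, scale=0.3):
        # Propose nearby hyperparameters with a multiplicative (log-scale)
        # random walk, which keeps the weights positive.
        return tuple(p * 10 ** random.uniform(-scale, scale) for p in params)

    def simulated_annealing(objective, init, t0=1.0, t_min=1e-3,
                            alpha=0.9, steps_per_temp=20):
        # Maximize `objective` under a geometric cooling schedule.
        current = best = init
        f_cur = f_best = objective(init)
        t = t0
        while t > t_min:
            for _ in range(steps_per_temp):
                cand = neighbor(current)
                f_cand = objective(cand)
                # Metropolis rule: always accept improvements; accept worse
                # candidates with probability exp((f_cand - f_cur) / t),
                # which shrinks toward zero as the temperature cools.
                if f_cand >= f_cur or random.random() < math.exp((f_cand - f_cur) / t):
                    current, f_cur = cand, f_cand
                    if f_cur > f_best:
                        best, f_best = current, f_cur
            t *= alpha  # geometric cooling
        return best, f_best

    if __name__ == "__main__":
        best_params, best_score = simulated_annealing(toy_objective, init=(1.0, 1.0))
        print("best (C_l, C_u):", best_params, "score:", best_score)

The two standard simulated-annealing ingredients are visible here: early on, the high temperature lets the search accept accuracy-reducing moves and explore broadly; as the temperature falls, acceptance of worse candidates becomes rare and the search settles into the best region found.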


