
Volume 1, Issue 4

Fall 2025

20 June 2025

Credit Default Probability Prediction Based on the LightGBM Model

Lele Huang (黄乐乐)1, Lin Chen (陈林)1
1 School of Management, Jinan University, China
ASDS 2025, 1(4), 73–75; https://doi.org/10.61369/ASDS.2025040019
© 2025 by the Author. Licensee Art and Design, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

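Credit rating lies at the core of the credit business, and a variety of statistical modeling methods have been developed for it. With the arrival of the big-data era, the scope of data collection has expanded markedly, and the number of features available for credit rating has grown with it. This brings a risk of feature redundancy, so feature selection is a critical step in the modeling process. This paper proposes a two-stage credit scoring modeling method. First, all features are preliminarily screened with an independence test based on Mean Variance; then a LightGBM-based classification model is fitted to obtain the final default probability prediction model. In addition, we construct a dummy feature to detect whether redundant features remain in the model. Finally, the method is applied to real online credit business data to evaluate its effectiveness.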
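The screening stage described above can be illustrated with a minimal sketch. It assumes the Mean Variance statistic is the MV index of Cui, Li and Zhong (2015), whose sample version is (1/n) Σ_r Σ_j p_r [F_r(x_j) − F(x_j)]², where F is the empirical distribution function of a feature, F_r is the same within class r, and p_r is the share of class r. The function names, the toy data, and the use of a random "dummy" column as a noise reference at the screening stage are all hypothetical; the abstract does not say how the dummy feature enters the fitted LightGBM model, and the LightGBM stage itself is not shown here.

```python
import bisect
import random


def mv_index(x, y):
    """Sample MV index between a numeric feature x and a class label y.

    Larger values indicate stronger dependence; values near zero are
    consistent with x being independent of y.
    """
    n = len(x)
    all_sorted = sorted(x)
    by_class = {}
    for xi, yi in zip(x, y):
        by_class.setdefault(yi, []).append(xi)

    total = 0.0
    for values in by_class.values():
        values.sort()
        p_r = len(values) / n  # share of observations in this class
        for xj in x:
            f_all = bisect.bisect_right(all_sorted, xj) / n       # F(x_j)
            f_r = bisect.bisect_right(values, xj) / len(values)   # F_r(x_j)
            total += p_r * (f_r - f_all) ** 2
    return total / n


def screen_features(columns, y, keep):
    """Rank features by MV index; return the top `keep` names and all scores."""
    scores = {name: mv_index(col, y) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


def make_toy_data(n=400, seed=0):
    """Hypothetical data: one informative feature, one noise feature, one dummy."""
    rng = random.Random(seed)
    y = [rng.randint(0, 1) for _ in range(n)]  # 1 = default, 0 = no default
    columns = {
        "signal": [rng.gauss(2.0 * yi, 1.0) for yi in y],  # shifts with the label
        "noise": [rng.gauss(0.0, 1.0) for _ in y],         # unrelated to the label
        "dummy": [rng.random() for _ in y],                # artificial reference column
    }
    return columns, y


if __name__ == "__main__":
    columns, y = make_toy_data()
    kept, scores = screen_features(columns, y, keep=1)
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.4f}")
    print("kept:", kept)
```

In this toy setting a real feature whose score is no higher than that of the dummy column would be a candidate for removal before the classifier is fitted. A production version would use a formal test threshold rather than a simple ranking.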

Keywords
Credit rating
Feature redundancy
Independence test
LightGBM
