Personal Credit Evaluation Based on Stacking Feature-Enhanced Multi-Grained Cascade Logistic
Abstract:
This paper addresses credit evaluation in P2P online lending, using machine learning to improve the accuracy of predicting loan default by applicants. It proposes a Stacking feature-enhanced multi-grained cascade Logistic method and studies its application. The proposed classifier is a hybrid model that combines the ideas of Stacking ensemble learning and cascade Logistic learning. First, XGBoost, CatBoost, LightGBM, AdaBoost and Gradient Boosting models are built using grid search, and suitable base evaluators are selected as the primary learners of the Stacking ensemble, with a logistic model as the secondary learner. Together these form a Stacking-based multi-grained scanner, whose predictions serve as meta-features and are concatenated into new feature data. Second, a cascade Logistic regression model is built from the new feature data, with the meta-features providing feature enhancement at each Logistic level. The model is compared with existing single ensemble learners and with each base evaluator on three different P2P online lending credit evaluation data sets. Evaluated by AUC, accuracy and other metrics, the experimental results show that the Stacking-enhanced multi-grained cascade Logistic model achieves higher accuracy and better predictive performance than the base evaluators and the other single ensemble classifiers.
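The pipeline described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the method, not the authors' implementation. It assumes scikit-learn only, so GradientBoostingClassifier and AdaBoostClassifier stand in for the full set of five boosting models. The data are synthetic, and the helper names (`tuned`, `oof_and_test`, `cascade_logistic`), the grid values, the fold count and the number of cascade levels are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, cross_val_predict, train_test_split


def tuned(estimator, grid, X, y):
    """Grid-search one base evaluator and return the best refit estimator."""
    search = GridSearchCV(estimator, grid, cv=3, scoring="roc_auc")
    search.fit(X, y)
    return search.best_estimator_


def oof_and_test(model, X_train, y_train, X_test):
    """Out-of-fold probabilities for training rows, refit-model probabilities for test rows."""
    oof = cross_val_predict(model, X_train, y_train, cv=3, method="predict_proba")[:, 1]
    test = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    return oof, test


def cascade_logistic(X_train, y_train, X_test, levels=3):
    """Each level fits a logistic regression and appends its output as an extra feature."""
    aug_train, aug_test = X_train, X_test
    proba_test = None
    for _ in range(levels):
        clf = LogisticRegression(max_iter=1000)
        oof, proba_test = oof_and_test(clf, aug_train, y_train, aug_test)
        aug_train = np.column_stack([aug_train, oof])
        aug_test = np.column_stack([aug_test, proba_test])
    return proba_test


# Synthetic stand-in for a credit data set (assumption: binary default label).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

# Primary learners: boosting models tuned by grid search.
base = [
    tuned(GradientBoostingClassifier(random_state=0), {"n_estimators": [50, 100]}, X_tr, y_tr),
    tuned(AdaBoostClassifier(random_state=0), {"n_estimators": [50, 100]}, X_tr, y_tr),
]
pairs = [oof_and_test(m, X_tr, y_tr, X_te) for m in base]
meta_tr = np.column_stack([p[0] for p in pairs])
meta_te = np.column_stack([p[1] for p in pairs])

# Secondary learner: a logistic model over the base predictions yields the meta-feature.
stack_tr, stack_te = oof_and_test(LogisticRegression(max_iter=1000), meta_tr, y_tr, meta_te)

# New feature data: original features concatenated with the Stacking meta-feature.
new_tr = np.column_stack([X_tr, stack_tr])
new_te = np.column_stack([X_te, stack_te])

proba = cascade_logistic(new_tr, y_tr, new_te)
auc = roc_auc_score(y_te, proba)
print(f"AUC of the cascade sketch on synthetic data: {auc:.3f}")
```

Out-of-fold predictions are used for the training rows so that each later stage does not see probabilities fitted on its own labels. The sketch does not reproduce the paper's multi-grained scanning step or its data sets, so its AUC says nothing about the reported results.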
Authors:
Hou Tianbao (侯天宝); Wang Aiyin (王爱银) (School of Statistics and Data Science, Xinjiang University of Finance & Economics, Urumqi 830012, China)
Affiliation:
School of Statistics and Data Science, Xinjiang University of Finance & Economics
Source:
Journal of Henan Normal University (Natural Science Edition), CAS, PKU Core, 2023, No. 3, pp. 111-122 (12 pages)
Funding: National Social Science Fund of China (18BJL072).
Keywords:
personal credit; feature enhancement; Stacking ensemble; multi-grained scanning; cascade Logistic model
Classification code:
TP393 [Automation and Computer Technology: Computer Application Technology]