Multilabel Feature Selection Based on ReliefF and Maximum Relevance Minimum Redundancy
Abstract:
Existing multilabel feature selection models do not adequately account for the correlation between features and the label set, which leads to low classification accuracy. To address this issue, this paper proposes a multilabel feature selection method based on ReliefF and maximum relevance minimum redundancy (mRMR). First, mutual information is used to compute the correlation degree between each label and the label set. A label weight is then defined as the ratio of each label's correlation degree to the sum of the correlation degrees of all labels. From these weights, the correlation degree between a feature and the label set is constructed, and the features that are highly correlated with the label set are preselected. Second, the distance between samples on each feature is computed, and a new feature weight update formula is built, which improves the multilabel ReliefF model by means of the label weights. Third, the maximum relevance is constructed from mutual information and the label weights, and the minimum redundancy and a new maximum relevance minimum redundancy evaluation criterion are designed. The criterion is applied to multilabel feature selection to further eliminate redundant features. Finally, a multilabel feature selection algorithm based on ReliefF and mRMR is designed to improve multilabel classification performance. The proposed algorithm is evaluated on eight multilabel datasets in terms of average precision, coverage, Hamming loss, one-error, and ranking loss, and the experimental results demonstrate its effectiveness.
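The abstract does not give the paper's formulas, so the following is only a minimal sketch of the label-weighting and mRMR ideas it describes, under stated assumptions. It assumes discrete features and labels, takes the relevance of a label to the label set to be the sum of its mutual information with every other label, and scores a candidate feature as weighted relevance minus mean redundancy. The ReliefF stage is omitted, and all function names (`mutual_info`, `label_weights`, `feature_relevance`, `mrmr_select`) are hypothetical.

```python
import numpy as np

def mutual_info(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def label_weights(Y):
    """Each label's share of the total label-to-label-set relevance.

    Assumption: the relevance of label j to the label set is the sum of
    its mutual information with every other label.
    """
    q = Y.shape[1]
    rel = np.array([
        sum(mutual_info(Y[:, j], Y[:, k]) for k in range(q) if k != j)
        for j in range(q)
    ])
    total = rel.sum()
    # Fall back to uniform weights when the labels share no information.
    return rel / total if total > 0 else np.full(q, 1.0 / q)

def feature_relevance(X, Y, w):
    """Label-weighted mutual information between each feature and the labels."""
    return np.array([
        sum(w[j] * mutual_info(X[:, i], Y[:, j]) for j in range(Y.shape[1]))
        for i in range(X.shape[1])
    ])

def mrmr_select(X, Y, k):
    """Greedily pick k features by relevance minus mean redundancy."""
    rel = feature_relevance(X, Y, label_weights(Y))
    selected = [int(np.argmax(rel))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for i in range(X.shape[1]):
            if i in selected:
                continue
            red = np.mean([mutual_info(X[:, i], X[:, s]) for s in selected])
            if rel[i] - red > best_score:
                best, best_score = i, rel[i] - red
        selected.append(best)
    return selected

# Toy data: two independent binary labels and four features.
l0 = np.array([0, 0, 0, 0, 1, 1, 1, 1])
l1 = np.array([0, 0, 1, 1, 0, 0, 1, 1])
Y = np.column_stack([l0, l1])
# f0 copies l0, f1 duplicates f0, f2 copies l1, f3 is constant.
X = np.column_stack([l0, l0, l1, np.zeros(8, dtype=int)])

selected = mrmr_select(X, Y, 2)
print(selected)  # [0, 2]
```

In this toy run the duplicated feature `f1` is as relevant as `f0` but is penalised for redundancy, so the second pick is `f2`, which carries information about the other label.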
Authors:
Sun Lin; Xu Feng; Li Shuo; Wang Zhen (College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China)
Affiliation:
College of Computer and Information Engineering, Henan Normal University
Source:
Journal of Henan Normal University (Natural Science Edition), CAS, Peking University Core Journals, 2023, No. 6, pp. 21-29, F0002 (10 pages in total)
Funding:
National Natural Science Foundation of China (62076089, 61976082).
Keywords:
multilabel learning; feature selection; label weighting; ReliefF; mRMR
Classification code:
TP181 [Automation and Computer Technology: Control Theory and Control Engineering]