Bootstrapping soft shrinkage variable selection method based on the combination of frequency and regression coefficient*

CLC number: TH741    Document code: A    National standard discipline classification code: 15025

*Fund project: supported by the National Key Research and Development Program of China (2016YFF0102805)

Abstract: A Fourier transform infrared spectrometer yields a very large number of spectral lines, and using all of them directly in multiple linear regression readily leads to overfitting, poor stability, and long analysis times. To address these problems, a bootstrapping soft shrinkage variable selection method based on the combination of frequency and regression coefficient is proposed. The algorithm selects variables according to their weights: in each iteration, the weight of every variable is recalculated from its regression coefficient and its frequency, and weighted bootstrap sampling then shrinks the variable set softly. The method was validated on corn infrared spectral datasets. On the corn oil dataset, the root mean square error of prediction (RMSEP) and the correlation coefficient (Rp) are 0.0202 and 0.9765, respectively, and the number of variables is reduced from the original 700 to 13. On the corn protein dataset, the RMSEP and Rp are 0.0279 and 0.9968, respectively, and the number of variables is reduced from the original 700 to 16. The results show that the proposed algorithm selects a small set of informative variables and has practical application value.
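The weight update and weighted bootstrap sampling described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the combination used here (frequency share multiplied by the mean normalized absolute coefficient) is an assumption. "Frequency" is read as how often a variable appears in the sub-models of one iteration, which is also an assumption. The function names `update_weights` and `weighted_bootstrap`, the `n_draws` parameter, and the toy coefficients are hypothetical. Fitting the regression sub-models is left out.

```python
import random
from collections import Counter


def update_weights(sub_models, n_vars):
    """Combine frequency and regression coefficients into variable weights.

    sub_models: list of dicts {variable_index: regression_coefficient},
    one dict per sub-model fitted in the current iteration.
    ASSUMPTION: weight = (share of sub-models containing the variable)
    * (mean normalized |coefficient|); the source does not state the formula.
    """
    freq = Counter()
    coef_sum = [0.0] * n_vars
    for model in sub_models:
        total = sum(abs(b) for b in model.values()) or 1.0
        for j, b in model.items():
            freq[j] += 1
            coef_sum[j] += abs(b) / total  # normalize within each sub-model
    k = len(sub_models)
    raw = [(freq[j] / k) * (coef_sum[j] / k) for j in range(n_vars)]
    s = sum(raw)
    if s == 0:
        # No information yet: fall back to uniform weights.
        return [1.0 / n_vars] * n_vars
    return [w / s for w in raw]


def weighted_bootstrap(weights, n_draws, rng):
    """Draw n_draws variables with replacement, keep the unique indices."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))


# Toy coefficients for three hypothetical sub-models over 6 variables.
sub_models = [
    {0: 0.9, 2: 0.1},
    {0: 0.8, 3: 0.2},
    {0: 0.7, 2: 0.2, 5: 0.1},
]
weights = update_weights(sub_models, n_vars=6)
subset = weighted_bootstrap(weights, n_draws=4, rng=random.Random(0))
print(weights)
print(subset)
```

Because sampling is with replacement and duplicates are discarded, each drawn subset is usually smaller than the number of draws. Variables with low weight are drawn less often but are not eliminated outright until their weight reaches zero, which is what makes the shrinkage soft rather than hard.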

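Cite this article

Zhang Feng, Tang Xiaojun, Tong Angxin, Wang Bin, Wang Jingwei. Bootstrapping soft shrinkage variable selection method based on the combination of frequency and regression coefficient*[J]. Chinese Journal of Scientific Instrument (仪器仪表学报), 2020, 41(1): 64-70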

History
  • Published online: 2022-01-11