Article Abstract
JIAN Chuan-xia, YE Rong, LIN Hao, HE Xin, DU Mei-jian. Printing Registration Recognition Method Based on Oversampling Imbalanced Training Dataset[J]. Packaging Engineering, 2020, 41(21): 251-260.
Printing Registration Recognition Method Based on Oversampling Imbalanced Training Dataset
Submitted: 2020-05-08
DOI:10.19554/j.cnki.1001-3563.2020.21.037
Keywords: imbalanced data; printing registration; gray-level run-length matrix (GLRLM); oversampling
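Funding: Guangdong Provincial Key Laboratory of Cyber-Physical Systems Project (2016B030301008); Guangdong University of Technology Youth Fund Key Project (17QNZD001); 2019-2020 College Students' Innovation and Entrepreneurship Training Program (xj201911845014, 201911845008, xj202011845015, xj202011845016)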
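Author  Affiliation
JIAN Chuan-xia  School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006
YE Rong  School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006
LIN Hao  School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006
HE Xin  School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006
DU Mei-jian  School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006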
Abstract:
      Objective: Imbalanced training datasets of printing mark images lead to low accuracy in recognizing the registration state of minority-class data. An improved SMOTE oversampling method for the training set is proposed to raise the recognition accuracy for the minority class. Methods: Texture features were extracted from the gray-level run-length matrix (GLRLM) of the printing mark images to form multi-dimensional feature data as the model input. Oversampling parameters for the minority-class samples were computed from the neighborhood information of those samples. Different oversampling strategies were then applied to the minority-class samples to balance the training set. A support vector machine (SVM) model was built on the balanced training set to recognize the printing registration state. Results: On different imbalanced printing datasets, the proposed method obtained a geometric mean of average classification accuracy (Gmean) of 0.8507, a recall (Re) of 0.7192 and an area under the ROC curve (A) of 0.8549. Conclusion: On different imbalanced printing registration datasets, the proposed method outperforms the SMOTE, IS and SVM methods used in the experiments.
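As a rough illustration of the oversampling and evaluation ideas named in the abstract, the following is a minimal sketch of plain SMOTE interpolation and a G-mean score, using only the Python standard library. It is not the improved method of the paper: the abstract does not specify how the per-sample oversampling parameters or strategies are chosen, so none of that is reproduced here. The function names `smote` and `g_mean`, the parameter `k`, and the toy data are hypothetical, and the G-mean is assumed to be the common two-class definition, the square root of recall times specificity.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Plain SMOTE (not the paper's improved variant).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Assumed definition: sqrt(recall * specificity) for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall * specificity)


# Toy minority class: four 2-D feature vectors (made-up values)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6)
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
```

In a fuller pipeline the synthetic samples would be appended to the minority class before training a classifier such as an SVM; the GLRLM feature extraction and the classifier itself are omitted from this sketch.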