《电子技术应用》 (Application of Electronic Technique)
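Entity relation extraction method for quality text in industrial manufacturing based on PCNN
Information Technology and Network Security
Zhang Tong1, Song Mingyan1, Wang Jun1,2, Bai Yang1
(1. Beijing Jinghang Research Institute of Computing and Communication, Beijing 100071; 2. School of Economics and Management, Harbin Institute of Technology, Harbin, Heilongjiang 150006)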
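Abstract: Extracting relations from quality reports in industrial manufacturing sectors such as automobiles and machinery is highly important for research on quality knowledge graphs and quality question answering systems in these sectors. Because no public dataset is available for building quality knowledge graphs in the industrial manufacturing domain, quality texts were collected and professionally annotated to build a specialized relation extraction dataset for industrial manufacturing quality knowledge graphs. On this dataset, relation extraction is implemented with a Piecewise Convolutional Neural Network (PCNN); then, based on the characteristics of Chinese, an improved PCNN model (C-PCNN) is proposed to raise relation extraction performance on Chinese corpora. On the dataset built in this paper, the improved model outperforms the PCNN and RNN baselines in accuracy, recall, and F1, which verifies the feasibility and effectiveness of the method. The study has practical value for people working in manufacturing.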
CLC number: TP391
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.03.002
Citation: Zhang Tong, Song Mingyan, Wang Jun, et al. Entity relation extraction method for quality text of industrial manufacturing based on PCNN[J]. Information Technology and Network Security, 2021, 40(3): 8-13.
Entity relation extraction method for quality text of industrial manufacturing based on Piecewise Convolutional Neural Network
Zhang Tong1,Song Mingyan1,Wang Jun1,2,Bai Yang1
(1.Beijing Jinghang Research Institute of Computing and Communication,Beijing 100071,China; 2.School of Management,Harbin Institute of Technology,Harbin 150006,China)
Abstract: Relation extraction from quality reports in industrial manufacturing industries such as automobiles and machinery is of great significance to research on quality knowledge graphs and quality question answering systems for these industries. Because no public dataset is available for relation extraction from quality reports in the industrial manufacturing field, this paper collects quality reports in this field and annotates them professionally to construct a specialized dataset for relation extraction. On this dataset, a Piecewise Convolutional Neural Network (PCNN) is used for relation extraction. Then, based on the characteristics of Chinese, an improved PCNN model (C-PCNN) is proposed to improve relation extraction performance on Chinese corpora. Experimental results on the constructed dataset show that the accuracy, recall, and F1 values of C-PCNN are better than those of PCNN and RNN, indicating the feasibility and effectiveness of the method. This research has practical significance for personnel engaged in the manufacturing industry.
Key words: industrial manufacturing; quality text; relation extraction; Piecewise Convolutional Neural Network

0 Introduction

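Products of industrial manufacturing sectors such as automobiles and machinery are high-precision, high-reliability products that span multiple technical fields, and they are characterized by complex structures, long production cycles, and many production states [1]. As the information age advances, the quality data generated during production and development keeps growing in volume. However, because unified data management is currently lacking, quality information is scattered across business systems as electronic or paper documents. This separately stored quality information contains key details such as the causes of quality problems, the components involved, and the measures taken. Extracting useful information from it to provide data support for industrial manufacturing, and to help staff supervise production effectively and resolve quality problems quickly, has become an urgent need for quality management in industrial manufacturing. Starting from quality texts, this paper uses relation extraction to mine the semantic relations between entities in the text, laying a solid foundation for the later construction of quality knowledge graphs and quality question answering systems.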
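As a rough illustration of the PCNN-based relation extraction this paper builds on, the sketch below shows the piecewise max pooling step that gives PCNN its name: the two entity positions split the sentence into three segments, and each convolution filter is max-pooled within each segment. This is a minimal pure-Python sketch of the generic pooling step, not the authors' C-PCNN. The function name `piecewise_max_pool`, the segment boundary convention, the toy feature map, and the 0.0 fill for an empty segment are all assumptions made for illustration.

```python
from typing import List, Sequence


def piecewise_max_pool(
    feature_map: Sequence[Sequence[float]],
    head_pos: int,
    tail_pos: int,
) -> List[float]:
    """Piecewise max pooling over one sentence (hypothetical helper).

    feature_map[i][k] is the response of convolution filter k at token i.
    The two entity positions split the sentence into three segments, and
    each filter is max-pooled within each segment separately.
    """
    if not feature_map:
        raise ValueError("feature_map must contain at least one token")
    first, second = sorted((head_pos, tail_pos))
    n_tokens = len(feature_map)
    if not (0 <= first < second < n_tokens):
        raise ValueError("entity positions must be distinct and inside the sentence")
    n_filters = len(feature_map[0])
    # Assumed boundary convention: [0, first], (first, second], (second, end)
    segments = [
        feature_map[: first + 1],
        feature_map[first + 1 : second + 1],
        feature_map[second + 1 :],
    ]
    pooled: List[float] = []
    for k in range(n_filters):
        for segment in segments:
            # Assumption: an empty last segment (entity at sentence end) yields 0.0
            pooled.append(max((row[k] for row in segment), default=0.0))
    return pooled


# Toy feature map: 6 tokens, 2 filters (values are made up for illustration)
fm = [
    [0.1, 0.9],
    [0.5, 0.2],
    [0.3, 0.4],
    [0.8, 0.1],
    [0.2, 0.7],
    [0.6, 0.3],
]
pooled = piecewise_max_pool(fm, head_pos=1, tail_pos=3)
print(pooled)  # [0.5, 0.8, 0.6, 0.9, 0.4, 0.7]
```

The output keeps three values per filter instead of one, so the pooled vector retains coarse information about where a feature fired relative to the two entities. In a full model this vector would typically feed a classifier over relation types.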




For the full text of this article, please download: http://m.ihrv.cn/resource/share/2000003422





