
Network Security Situational Awareness Model Based on Genetic Algorithm and LightGBM

English title as published: Network traffic anomaly identification and detection based on genetic algorithm and LightGBM

Cyber Security and Data Governance (网络安全与数据治理)

Hu Rui, Xu Fang, Xiong Yufeng, Xiong Zhouyu, Chen Min

Jiangxi Tobacco Company Ji'an City Company

Abstract: Existing network traffic anomaly detection methods in traditional tobacco industry systems lose the relationships between features and the contextual information in the data. To address these problems, this paper proposes a LightGBM model improved with a genetic algorithm, which helps the model avoid falling into local optima. First, a tree model is built to reduce the dimensionality of the data, mining from the high-dimensional data the key features that matter most to detection performance; the proposed model is then used to analyze these key features. To evaluate the effectiveness and superiority of the model, accuracy and loss are used as evaluation metrics, and the model is compared with other network traffic anomaly detection models: Tabular model, TabNet, LightGBM, and XGBoost. Experiments were conducted on the public dataset CIC.IDS.2018. The results show that, for network security situational awareness in the high-feature setting, the recognition accuracy reaches 99.43% for multi-class classification and 99.87% for binary classification; in the low-feature setting, it reaches 98.73% and 99.39%, respectively. The model thus achieves high accuracy along with good flexibility and robustness.

Key words: anomaly detection; machine learning; genetic algorithm; LightGBM

CLC number: TP393.0; Document code: A; DOI: 10.19358/j.issn.2097-1788.2024.03.003
Citation: Hu Rui, Xu Fang, Xiong Yufeng, et al. Network security situational awareness model based on genetic algorithm and LightGBM [J]. Cyber Security and Data Governance, 2024, 43(3): 14-20.
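The abstract states that a genetic algorithm is used to keep the LightGBM model from getting stuck in a local optimum, but it does not say what the algorithm actually optimizes. The sketch below is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes the genetic algorithm searches over two LightGBM hyperparameters (`learning_rate` and `num_leaves`, with illustrative ranges), and it replaces the expensive "train the model and measure validation loss" step with a hypothetical toy function, `toy_validation_loss`, that has many local minima. All function names here are invented for the example.

```python
import math
import random


def decode(genes):
    """Map genes in [0, 1] to two LightGBM-style hyperparameters (illustrative ranges)."""
    learning_rate = 10 ** (-3 + 2.5 * genes[0])   # roughly 0.001 to 0.316
    num_leaves = int(round(8 + 248 * genes[1]))   # 8 to 256
    return {"learning_rate": learning_rate, "num_leaves": num_leaves}


def toy_validation_loss(genes):
    """Hypothetical stand-in for 'train LightGBM, return validation loss'.

    A bowl centered at (0.7, 0.3) with cosine ripples: many local minima,
    one global minimum of 0.0 at the center.
    """
    dx, dy = genes[0] - 0.7, genes[1] - 0.3
    ripples = 0.05 * (2 - math.cos(10 * math.pi * dx) - math.cos(10 * math.pi * dy))
    return dx * dx + dy * dy + ripples


def tournament(population, losses, rng, k=3):
    """Return the lowest-loss individual among k randomly chosen ones."""
    contenders = rng.sample(range(len(population)), k)
    return population[min(contenders, key=lambda i: losses[i])]


def genetic_search(loss_fn, n_genes=2, pop_size=30, generations=40,
                   mutation_rate=0.2, mutation_scale=0.1, seed=0):
    """Minimize loss_fn over [0, 1]^n_genes with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    history = []  # best loss seen in each generation

    for _ in range(generations):
        losses = [loss_fn(ind) for ind in population]
        best_idx = min(range(pop_size), key=lambda i: losses[i])
        history.append(losses[best_idx])

        # Elitism: carry the current best individual over unchanged.
        next_population = [population[best_idx][:]]
        while len(next_population) < pop_size:
            mother = tournament(population, losses, rng)
            father = tournament(population, losses, rng)
            # Uniform crossover: each gene comes from either parent.
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            # Gaussian mutation, clipped to [0, 1]; this is what lets the
            # search jump out of a local minimum.
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_scale)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            next_population.append(child)
        population = next_population

    losses = [loss_fn(ind) for ind in population]
    best_idx = min(range(pop_size), key=lambda i: losses[i])
    history.append(losses[best_idx])
    return population[best_idx], losses[best_idx], history


best_genes, best_loss, history = genetic_search(toy_validation_loss)
print(decode(best_genes), round(best_loss, 4))
```

In a real experiment, `toy_validation_loss` would be replaced by a function that trains a model with the decoded hyperparameters and returns its loss on held-out data. Because the best individual is always carried over, the best loss per generation never increases, while mutation and crossover keep exploring other regions of the search space.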

Introduction

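Networks have brought convenience to the development of many industries, but the problems they cause have also become increasingly prominent: incidents such as information leakage, online fraud, and network eavesdropping, all stemming from inadequate protection of network information, have occurred one after another [1]. Artificial intelligence is an important means of solving hard problems in network security, and a growing body of research focuses on building network situational awareness models based on artificial intelligence [2]. Research on countering network attacks has become a hot topic [3-4], and researchers are gradually replacing the original passive defense measures with network security situational awareness, which can predict and detect latent network attacks in advance. Early anomalous-traffic detection models typically relied on methods such as statistical analysis [5]. Because these methods defend on the basis of already-known information, their predictive performance is often poor and they fail to guard against new types of network attacks.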

For the full text of this paper, please download:

http://m.ihrv.cn/resource/share/2000005929

Author information:

Hu Rui, Xu Fang, Xiong Yufeng, Xiong Zhouyu, Chen Min

Jiangxi Tobacco Company Ji'an City Company, Ji'an, Jiangxi 343009


Original content statement: This content is original to the AET website and may not be reproduced without authorization.
