
Research on accelerating Ceph storage performance with SmartNICs

Application of Electronic Technique

Liu Baoqin, Luo Xiangzheng, Lin Mao, Wang Qinya, Lan Lisha

Maipu Communication Technology Co., Ltd.

Abstract: This paper addresses the limited multi-core parallel scalability caused by thread lock contention in the Object Storage Device (OSD) architecture of the Ceph storage system. It studies cooperative optimization between the next-generation Crimson-OSD architecture and SmartNICs, and proposes a hierarchical cooperative optimization framework. Related studies show that, with SmartNIC-based cooperative optimization, RDMA network offloading reduces CPU utilization by up to 70%, and hardware-accelerated erasure coding on heterogeneous computing engines speeds up data recovery by up to 4.84 times. The results provide a theoretical basis and key technical references for hardware acceleration in distributed storage systems, and offer guidance for optimizing storage systems in data-intensive scenarios such as high-performance computing and cloud-edge-end integration.

Key words: SmartNIC; Ceph storage system; performance optimization; hardware acceleration; distributed storage system

CLC number: TN915.05 Document code: A DOI: 10.16157/j.issn.0258-7998.256678
Citation: Liu Baoqin, Luo Xiangzheng, Lin Mao, et al. Research on accelerating Ceph storage performance with SmartNICs[J]. Application of Electronic Technique, 2025, 51(12): 14-19.

Introduction

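Data-intensive applications such as AI training, HPC, and edge computing are growing explosively and place unprecedented demands on the performance and elasticity of storage systems. Ceph is widely deployed in cloud data centers thanks to its high availability and scalability, but in multi-core settings its traditional OSD architecture suffers from thread lock contention and cross-core communication overhead. This leads to low CPU utilization and makes it a poor fit for high-performance hardware such as NVMe SSDs. In response, the Ceph community rebuilt the OSD as Crimson-OSD, which improves multi-core scalability through a shared-nothing design and an asynchronous pipeline model. Measurements show that 4K random-read performance reaches 311k IOPS with 8 threads and improves further as the core count grows, confirming the effectiveness of the redesign. Although the Crimson-OSD design has made considerable progress, research on co-optimizing it with the programmable acceleration capabilities of SmartNICs remains scarce.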

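Building on an analysis of the architectural characteristics and performance bottlenecks of Crimson-OSD, this paper proposes a SmartNIC-based hierarchical cooperative optimization framework. The work has two core parts. First, it establishes a performance sensitivity model of key parameters to quantify the multi-core scalability of Crimson-OSD. Second, it designs the hierarchical cooperative optimization framework to overcome the limits that CPU compute capacity places on storage system performance. The paper also offers a preliminary discussion of two frontier directions: integrated storage-compute architectures and AI-enabled dynamic management.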

The full text of this paper can be downloaded from:

http://m.ihrv.cn/resource/share/2000006870

Author information:

Liu Baoqin, Luo Xiangzheng, Lin Mao, Wang Qinya, Lan Lisha

(Maipu Communication Technology Co., Ltd., Chengdu, Sichuan 610094, China)

Original content notice: This content is original to the AET website. Reproduction without authorization is prohibited.
