China's massive volume of shipbuilding-industry data is a core resource for upgrading the country's shipbuilding industry and strengthening its international influence, and how to use that data efficiently is an important problem. Retrieval-augmented generation (RAG) based on large language models, with its strong language understanding and retrieval-and-analysis capabilities, has become the mainstream approach to data analysis and has drawn close attention from the major shipbuilding nations. Under the pressure of massive data, however, it exhibits retrieval degradation, insufficient understanding, storage bottlenecks, and data-security problems, which lead to low data utilization. This paper therefore combines DeepSeek R1, a retrieval-augmented generation model, federated learning, knowledge graphs, and data encryption to propose a multi-level retrieval, matching, and augmented generation system for ship data, characterized by "multi-level retrieval, enhanced understanding, decentralized storage, and secure transmission". The system supports the retrieval and analysis of massive, complex data and the generation of solutions to problems arising in ship design, construction, and operation, and thereby promotes the upgrading of China's shipbuilding industry.
2026, 48(2): 198-205 Received: 2025-03-06
DOI:10.3404/j.issn.1672-7649.2026.02.031
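CLC number: U662.9
Funding: Research on the Top-Level Design of the Digital Transformation of the Shipbuilding Industry (CBG01N23-01-01)
About the author: ZHANG Shuo (b. 1998), male, master's degree, engineer; research interests: ship digitalization and artificial intelligence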