Journal of Engineering, Project, and Production Management, 2026, 16(6), 2025-185

 

A Deep Feature Fusion Framework for Accurate and Efficient Cross-Modal Retrieval in Digital Libraries

 

Xiaoyu Sun1, Wenjie Meng2, and Xuesong Zhang3

1 Associate Researcher, Library of the China University of Petroleum (East China), Qingdao, 266580, China, E-mail: sunxiaoyu0818@126.com (corresponding author)
2 Librarian, Library of the China University of Petroleum (East China), Qingdao, 266580, China
3 Associate Researcher, Library of the China University of Petroleum (East China), Qingdao, 266580, China

 

Project Management

 

Received August 27, 2025; revised December 21, 2025 and June 5, 2026; accepted June 9, 2026

 

Available online June 17, 2026

 

Abstract:  To address the multi-modal resource retrieval needs of digital libraries, archives, and knowledge bases, this study proposes a Feature Fusion Cross-Modal Hashing (FFCMH) model. The approach constructs a specialized dataset through multi-level filtering and cross-modal denoising, employs autoencoders for feature fusion, and uses semantic segmentation to improve image-text feature extraction, thereby supporting efficient retrieval. Experiments on professional datasets show that the proposed model outperforms mainstream models such as Locality-Sensitive Hashing, Semantic Topic Multi-modal Hashing (STMH), and Deep Cross-Modal Hashing (DCMH) on metrics including recall rate and average precision. On the Flickr-25k dataset, for instance, it achieves a maximum recall rate of 95.5% for Image-To-Text Cross-Modal (I2TCM) retrieval and 86.7% for Text-To-Image Cross-Modal (T2ICM) retrieval. On self-made datasets, the model also shows clear advantages in retrieval accuracy and efficiency, with average precision values of 0.948 for I2TCM and 0.938 for T2ICM and significantly less retrieval time than the other models. The model thus provides technical support for libraries seeking efficient and precise cross-modal resource retrieval.
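As a rough illustration of the retrieval step and the two metrics named in the abstract, the following minimal sketch ranks binary hash codes by Hamming distance and computes recall and average precision for one query. It is a generic hashing-retrieval example, not the FFCMH model itself: the function names, the 4-bit codes, and the relevance labels are all hypothetical, and how the paper actually learns its codes is not shown here.

```python
from typing import List, Sequence, Set


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], database: List[Sequence[int]]) -> List[int]:
    """Return database indices sorted by ascending Hamming distance to the query."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


def recall_at_k(ranking: List[int], relevant: Set[int], k: int) -> float:
    """Fraction of relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranking[:k]) & relevant) / len(relevant)


def average_precision(ranking: List[int], relevant: Set[int]) -> float:
    """Mean of the precision values taken at each rank where a relevant item occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(ranking, start=1):
        if idx in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


# Hypothetical toy data: one image query code and four text codes (assumed values).
image_query_code = [1, 0, 1, 1]
text_codes = [
    [1, 0, 1, 1],  # distance 0
    [0, 1, 0, 0],  # distance 4
    [1, 0, 1, 0],  # distance 1
    [0, 0, 0, 0],  # distance 3
]
relevant_texts = {0, 3}  # assumed ground-truth matches for the query

ranking = rank_by_hamming(image_query_code, text_codes)
print(ranking)
print(recall_at_k(ranking, relevant_texts, k=2))
print(average_precision(ranking, relevant_texts))
```

Because comparing short binary codes needs only bit-level operations, this style of lookup is the usual reason hashing methods are reported as fast at retrieval time; the sketch says nothing about the timings reported in the paper.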

 

Keywords:  Library management, cross-modal hashing, feature fusion, semantic segmentation, resource retrieval.

Copyright © Journal of Engineering, Project, and Production Management (EPPM-Journal).

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Requests for reprints and permissions at eppm.journal@gmail.com.

Citation: Sun, X., Meng, W., and Zhang, X. (2026). A Deep Feature Fusion Framework for Accurate and Efficient Cross-Modal Retrieval in Digital Libraries. Journal of Engineering, Project, and Production Management, 16(6), 2025-185.

DOI: 10.32738/JEPPM-2025-185
