Optimized Fusion of Audio-Video Archival Data in Mobile Environments via the DeepSeek Multimodal Algorithm
DOI: https://doi.org/10.3991/ijim.v19i15.57101
Keywords: mobile environment, audio-video archival data fusion, DeepSeek multimodal algorithm, soft-threshold attention module, polarized loss function
Abstract
With the widespread adoption of smart mobile devices and the rapid development of 5G technology, audio-video archival data are increasingly used in mobile scenarios, which calls for more advanced multimodal fusion to support intelligent, mobile archival management. However, the limited computing power of mobile devices, unstable network conditions, and the spatiotemporal asynchrony and semantic heterogeneity of audio and video data leave traditional fusion approaches unable to balance accuracy against low-power, real-time processing on mobile platforms. Existing models based on conventional machine learning or early deep learning architectures have limited ability to handle the fragmentation and asynchrony of multimodal data in mobile environments. This study therefore proposes a DeepSeek-based multimodal algorithm tailored to mobile contexts. The approach centers on three components: a soft-threshold attention module, a polarized loss function, and a lightweight fusion network architecture. Dynamic cross-modal feature weighting is used to optimize the allocation of computational resources, while the modified loss function strengthens semantic coupling between the modalities. The network design is adapted to the constraints of mobile hardware to enable efficient, real-time fusion of multimodal data. The result is a lightweight solution for processing complex audio-video archival data on mobile terminals, with implications for improving archival resource utilization and for integrating archival management with mobile computing technologies.
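The abstract names a soft-threshold attention module but does not define it. As a rough illustration only, the sketch below shows the standard soft-thresholding (shrinkage) operator with a data-dependent threshold. The gating rule (a sigmoid-scaled mean absolute feature value), the function names, and the sample values are all assumptions made for this example, not the paper's design.

```python
import math


def soft_threshold(x, tau):
    """Standard shrinkage: move x toward zero by tau; |x| <= tau maps to 0."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def gated_soft_threshold(features, gate_logit):
    """Soft-threshold a feature vector with a data-dependent threshold.

    Assumption (not from the paper): tau = sigmoid(gate_logit) * mean(|features|),
    where gate_logit stands in for the output of a small attention branch.
    """
    alpha = 1.0 / (1.0 + math.exp(-gate_logit))  # gate in (0, 1)
    tau = alpha * sum(abs(v) for v in features) / len(features)
    return [soft_threshold(v, tau) for v in features]


# Hypothetical audio feature values, chosen only for illustration.
audio_features = [0.9, -0.05, 0.1, -1.2]
shrunk = gated_soft_threshold(audio_features, gate_logit=0.0)
```

With these inputs the threshold is 0.28125, so the two small-magnitude features are zeroed and the two large ones are shrunk toward zero. This is the kind of behavior that could let a fusion network skip low-information features on constrained hardware.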
License
Copyright (c) 2025 Ruhua Bai

This work is licensed under a Creative Commons Attribution 4.0 International License.

