Optimized Fusion of Audio-Video Archival Data in Mobile Environments via the DeepSeek Multimodal Algorithm

Authors

DOI:

https://doi.org/10.3991/ijim.v19i15.57101

Keywords:

mobile environment, audio-video archival data fusion, DeepSeek multimodal algorithm, soft-threshold attention module, polarized loss function

Abstract


With the widespread adoption of mobile intelligent devices and the rapid development of 5G technology, audio-video archival data are increasingly used in mobile scenarios, which calls for more advanced multimodal fusion to support intelligent, mobile archival management. However, the limited computing power of mobile devices, unstable network conditions, and the spatiotemporal asynchrony and semantic heterogeneity of audio and video data leave traditional fusion approaches unable to balance accuracy against low-power, real-time processing on mobile platforms. Existing models based on conventional machine learning or early deep learning architectures have limited ability to handle the fragmentation and asynchrony of multimodal data in mobile environments. This study therefore proposes a DeepSeek-based multimodal algorithm tailored to mobile contexts. The approach centers on three components: a soft-threshold attention module, a polarized loss function, and a lightweight fusion network architecture. Dynamic cross-modal feature weighting is used to optimize the allocation of computational resources, while the modified loss function enhances semantic coupling. The network design is adapted to the constraints of mobile hardware to enable efficient, real-time fusion of multimodal data. The study provides a lightweight solution for processing complex audio-video archival data on mobile terminals, with implications for improving archival resource utilization and integrating archival management with mobile computing technologies.
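The abstract does not define the soft-threshold attention module, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes the standard soft-thresholding (shrinkage) operator, sign(x) * max(|x| - tau, 0), and pairs it with a hypothetical energy-based weighting of the two modalities. The function names `soft_threshold` and `fuse`, the threshold `tau`, and the weighting rule are all illustrative assumptions.

```python
import math


def soft_threshold(x: float, tau: float) -> float:
    """Standard soft-thresholding operator: sign(x) * max(|x| - tau, 0)."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def fuse(audio, video, tau=0.1):
    """Hypothetical cross-modal fusion (an assumption, not the paper's design).

    1. Shrink each feature with soft_threshold, zeroing weak activations.
    2. Weight each modality by its share of the remaining absolute energy.
    3. Return the weighted sum and the (audio, video) weights.
    """
    if len(audio) != len(video):
        raise ValueError("audio and video feature vectors must have equal length")

    a = [soft_threshold(x, tau) for x in audio]
    v = [soft_threshold(x, tau) for x in video]

    energy_a = sum(abs(x) for x in a)
    energy_v = sum(abs(x) for x in v)
    total = energy_a + energy_v

    if total == 0.0:
        # Nothing survived the threshold: fall back to equal weights.
        w_a = w_v = 0.5
    else:
        w_a = energy_a / total
        w_v = energy_v / total

    fused = [w_a * x + w_v * y for x, y in zip(a, v)]
    return fused, (w_a, w_v)


if __name__ == "__main__":
    fused, weights = fuse([1.0, 0.1], [0.1, 0.1], tau=0.25)
    print(fused, weights)
```

In this sketch, features whose magnitude falls below `tau` contribute nothing, so a modality with only weak activations receives a small weight. This is one plausible reading of how thresholded attention could cut computation on constrained hardware. The polarized loss function is not sketched here because the abstract gives no formula for it.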

Published

2025-08-13

How to Cite

Bai, R. (2025). Optimized Fusion of Audio-Video Archival Data in Mobile Environments via the DeepSeek Multimodal Algorithm. International Journal of Interactive Mobile Technologies (iJIM), 19(15), pp. 26–40. https://doi.org/10.3991/ijim.v19i15.57101

Section

Papers