Adjustment step size.
In modern multimodal interaction design, integrating information from diverse modalities (such as speech, vision, and text) presents a significant challenge. These modalities differ in structure, timing, and data volume, often leading to mismatches, low computational efficiency, an...
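| Main Author: | Qingnan Ji |
|---|---|
| Other Authors: | Jinxia Wang, Lixian Wang |
| Published: | 2025 |
| Subjects: | Biophysics; Cancer; Science Policy; Space Science; Biological Sciences not elsewhere classified; Information Systems not elsewhere classified |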
| _version_ | 1852014634840621056 |
|---|---|
| author | Qingnan Ji (22662198) |
| author2 | Jinxia Wang (355966); Lixian Wang (465239) |
| author2_role | author author |
| author_facet | Qingnan Ji (22662198); Jinxia Wang (355966); Lixian Wang (465239) |
| author_role | author |
| dc.creator.none.fl_str_mv | Qingnan Ji (22662198); Jinxia Wang (355966); Lixian Wang (465239) |
| dc.date.none.fl_str_mv | 2025-11-21T18:26:37Z |
| dc.identifier.none.fl_str_mv | 10.1371/journal.pone.0326662.g003 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/figure/Adjustment_step_size_/30676947 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biophysics; Cancer; Science Policy; Space Science; Biological Sciences not elsewhere classified; Information Systems not elsewhere classified; text; suboptimal user experiences; novel optimization strategy; multimodal interaction scenarios; modal correlation matrix; keypoint detection; intelligent interaction design; IEMOCAP; experimental results show; dynamic weighting mechanism; advancements contribute meaningfully; multimodal information fusion; textual data relevant; structured matching models; multimodal information integration; integration efficiency increases; correlation matrix constraint; low computational efficiency; integration process; integrating information; matching process; data volume; computational complexity; temporal alignment; study introduces; study aims; significantly reduced; significant challenge; optimized Kuhn; often leading; Munkres algorithm; modalities differ; improved Kuhn; importance scores; feature extraction; experience ratings; critical contribution; computer collaboration; broad applicability; baseline method; also incorporating; ablation studies |
| dc.title.none.fl_str_mv | Adjustment step size. |
| dc.type.none.fl_str_mv | Image Figure info:eu-repo/semantics/publishedVersion image |
| description | In modern multimodal interaction design, integrating information from diverse modalities (such as speech, vision, and text) presents a significant challenge. These modalities differ in structure, timing, and data volume, often leading to mismatches, low computational efficiency, and suboptimal user experiences during the integration process. This study aims to enhance both the efficiency and accuracy of multimodal information fusion. To achieve this, two publicly available datasets, Carnegie Mellon University Multimodal Opinion Sentiment Intensity (CMU-MOSI) and Interactive Emotional Dyadic Motion Capture (IEMOCAP), are employed to collect speech, visual, and textual data relevant to multimodal interaction scenarios. The data undergo preprocessing steps including noise reduction, feature extraction (e.g., Mel Frequency Cepstral Coefficients and keypoint detection), and temporal alignment. An improved Kuhn-Munkres algorithm is then proposed, extending the traditional bipartite graph matching model to support weighted multimodal matching. The algorithm dynamically adjusts weight coefficients based on the importance scores of each modality, while also incorporating a cross-modal correlation matrix as a constraint to improve the robustness of the matching process. The enhanced algorithm’s performance is validated through information matching efficiency tests and user interaction satisfaction surveys. Experimental results show that it improves multimodal information matching accuracy by 28.2% over the baseline method. Integration efficiency increases by 18.7%, and computational complexity is significantly reduced, with average computation time decreased by 15.4%. User satisfaction also improves, with a 19.5% increase in experience ratings. Ablation studies further confirm the critical contribution of both the dynamic weighting mechanism and the correlation matrix constraint to the overall performance. This study introduces a novel optimization strategy for multimodal information integration, offering substantial theoretical value and broad applicability in intelligent interaction design and human-computer collaboration. These advancements contribute meaningfully to the development of next-generation multimodal interaction systems. |
| eu_rights_str_mv | openAccess |
| id | Manara_ca76b02eba6439bdf8a6c9d685e2e95f |
| identifier_str_mv | 10.1371/journal.pone.0326662.g003 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/30676947 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Adjustment step size. |
| topic | Biophysics; Cancer; Science Policy; Space Science; Biological Sciences not elsewhere classified; Information Systems not elsewhere classified; text; suboptimal user experiences; novel optimization strategy; multimodal interaction scenarios; modal correlation matrix; keypoint detection; intelligent interaction design; IEMOCAP; experimental results show; dynamic weighting mechanism; advancements contribute meaningfully; multimodal information fusion; textual data relevant; structured matching models; multimodal information integration; integration efficiency increases; correlation matrix constraint; low computational efficiency; integration process; integrating information; matching process; data volume; computational complexity; temporal alignment; study introduces; study aims; significantly reduced; significant challenge; optimized Kuhn; often leading; Munkres algorithm; modalities differ; improved Kuhn; importance scores; feature extraction; experience ratings; critical contribution; computer collaboration; broad applicability; baseline method; also incorporating; ablation studies |
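
The description field above mentions a Kuhn-Munkres matching extended with modality importance weights and a cross-modal correlation constraint, but this record does not give the formulation. The following is only a minimal sketch of one way such a scheme could look, not the authors' method. Everything specific in it is an assumption made for illustration: the names `fuse_scores` and `match`, the threshold `tau`, the `penalty` value, normalising importance scores into weights, treating the constraint as a hard mask, and the toy matrices. `hungarian_min` is the classical O(n^3) Kuhn-Munkres procedure for a minimum-cost assignment.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def hungarian_min(cost: Matrix) -> List[Tuple[int, int]]:
    """Classical Kuhn-Munkres: minimum-cost assignment, rows <= columns."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (labels)
    v = [0.0] * (m + 1)   # column potentials (labels)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # delta is the label adjustment step of the classical algorithm
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def fuse_scores(sims: Dict[str, Matrix], importance: Dict[str, float],
                corr: Matrix, tau: float, penalty: float = 1e6) -> Matrix:
    """Weighted sum of per-modality similarities, masked by correlation.

    Assumption: importance scores are normalised into weights, and pairs
    with correlation below tau are effectively forbidden via a penalty.
    """
    total = sum(importance.values())
    weights = {name: score / total for name, score in importance.items()}
    rows, cols = len(corr), len(corr[0])
    fused = [[sum(weights[name] * sims[name][i][j] for name in sims)
              for j in range(cols)] for i in range(rows)]
    return [[fused[i][j] if corr[i][j] >= tau else -penalty
             for j in range(cols)] for i in range(rows)]


def match(sims: Dict[str, Matrix], importance: Dict[str, float],
          corr: Matrix, tau: float = 0.2) -> List[Tuple[int, int]]:
    """Maximise the fused score by minimising its negation."""
    score = fuse_scores(sims, importance, corr, tau)
    cost = [[-s for s in row] for row in score]
    return sorted(hungarian_min(cost))


# Toy data (hypothetical): similarities between three items on each side.
sims = {
    "speech": [[0.9, 0.1, 0.3], [0.2, 0.8, 0.4], [0.3, 0.5, 0.7]],
    "vision": [[0.6, 0.4, 0.2], [0.1, 0.7, 0.6], [0.2, 0.9, 0.3]],
}
importance = {"speech": 2.0, "vision": 1.0}
corr = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 0.0]]

print(match(sims, importance, corr, tau=0.2))
print(match(sims, importance, corr, tau=0.0))
```

With `tau=0.2` the pair (2, 2) is masked out because its correlation is 0.0, so the matching has to route around it; with `tau=0.0` nothing is masked and the highest fused scores decide. A soft constraint (for example, scaling the fused score by the correlation value) would be an equally plausible reading of "constraint" and only changes `fuse_scores`.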