The overview framework of the coarse-to-fine alignment network.

The CFAN consists of a video alignment module that learns the common visual-semantic space and a cross-modal interaction module that explores the fine-grained alignment among frames, proposals, and video. In addition, multi-level coarse-to-fine alignment information flows between modules to make full us...

Bibliographic Details
Main Author:	Lingwen Meng (8968106) (author)
Other Authors:	Fangyuan Liu (1438045) (author), Mingyong Xin (15185747) (author), Siqi Guo (355869) (author), Fu Zou (21370430) (author)
Published:	2025
Subjects:	Marine Biology; Science Policy; Plasma Physics; Infectious Diseases; Biological Sciences not elsewhere classified; sophisticated network architecture; perform sufficient experiments; evaluation results demonstrate; grained alignment information; global contextual information; fine alignment network; video alignment module; fine alignment information; numerous videos according; modal interaction module; level information; video retrieval; video moment; video collections; single video; two subtasks; temporal boundary; task due; original problem; moment retrieval; existing methods directly scale

Similar Items