


MMM 2025, Nara, Japan - Part II
- Ichiro Ide, Ioannis Kompatsiaris, Changsheng Xu, Keiji Yanai, Wei-Ta Chu, Naoko Nitta, Michael Riegler, Toshihiko Yamasaki:
MultiMedia Modeling - 31st International Conference on Multimedia Modeling, MMM 2025, Nara, Japan, January 8-10, 2025, Proceedings, Part II. Lecture Notes in Computer Science 15521, Springer 2025, ISBN 978-981-96-2060-9
Regular Papers
- Yixiao Xu, Yubo Li, Wanzhao Xu, Yicheng Gu, Yun Wang, Jiangyuan Ma, Zhengwei Qi:
gFlow: Distributed Real-Time Reverse Remote Rendering System Model. 3-16
- Jiaxing Chen, Yuxuan Liu, Dehu Li, Xiang An, Weimo Deng, Ziyong Feng, Yongle Zhao, Yin Xie:
Grounding Deliberate Reasoning in Multimodal Large Language Models. 17-30
- Shuijing Zheng, Suxi Yu, Yi Wang, Jing Wen:
GWUNet: A UNet with Gated Attention and Improved Wavelet Transform for Thyroid Nodules Segmentation. 31-44
- Liang-Chia Chen, Wei-Ta Chu:
HCV: Lightweight Hybrid CNN-Vision Transformer for Visual Object Tracking. 45-59
- Alex Falcon, Ali Abdari, Giuseppe Serra:
HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse. 60-73
- Yuyao Ye, Jiayu Yang, Yang Zhao, Mengping Gao, Hongbin Cao, Ronggang Wang:
Hybrid Scalable Video Coding with Neural Compression and Enhancement for Streaming Media. 74-86
- Jingkun Li, Na Qi, Qing Zhu:
Hyper-NeuS: Hypernetworks for Neural SDF Implicit Surface Reconstruction by Volume Rendering. 87-100
- Vu Thi Ngoc Anh, Yoshiyuki Shoji, Yuma Oe, Huu-Long Pham, Hiroaki Ohshima:
Image-Generation AI Model Retrieval by Contrastive Learning-Based Style Distance Calculation. 101-114
- Miguel Perez, Holger Kirchhoff, Peter Grosche, Xavier Serra:
Improving Singing Voice Transcription Generalization with AI Generated Accompaniments. 115-128
- Xiuhong Li, Xinyue Zhu, Boyuan Li, Songlin Li, Luyao Wang, Zhenhong Jia:
Infrared Small Target Detection with Feature Refinement and Context Enhancement. 129-140
- Wolfgang Hürst, Yannick Visser:
Innovative Lifelog Visualization and Exploration in Virtual Reality - A Comparative Study. 141-154
- Xiukang Yang, Jingguo Ge, Hui Li, Liangxiong Li, Bingzhen Wu:
Integrating S1 &S2 Framework for Enhanced Semantic Match in Person Re-identification. 155-168
- Xiang Tian, Yuan Zhang, Chang Mu, Ziyang Zhang:
Intra-class Compact Facial Expression Recognition Based on Amplitude Phase Separation. 169-182
- Fei Wu, Ruixuan Zhou, Yimu Ji, Xiao-Yuan Jing:
Joint Decision Network with Modality-Specific and Dual Interactive Features for Fake News Detection. 183-196
- Kosetsu Tsukuda, Takumi Takahashi, Keisuke Ishida, Masahiro Hamasaki, Masataka Goto:
Kiite World: Socializing Map-Based Music Exploration Through Playlist Sharing and Synchronized Listening. 197-211
- Honghui Yuan, Keiji Yanai:
KuzushijiDiffuser: Japanese Kuzushiji Font Generation with FontDiffuser. 212-225
- Jingyao Zhang, Shijie Hao, Fuming Sun, Yuan Rao:
LIESA: Low-Light Image Enhancement with Semantic Awareness. 226-239
- Jiajie Liu, Zhibin Zhang:
Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network. 240-253
- Ilhwan Kwon, Jun Li, Rajiv Ratn Shah, Mukesh Prasad:
Lightweight Motion-Aware Video Super-Resolution for Compressed Videos. 254-267
- Tatsumi Sunada, Kaede Shiohara, Ling Xiao, Toshihiko Yamasaki:
LITA: LMM-Guided Image-Text Alignment for Art Assessment. 268-281
- Qing Wang, Chong-Wah Ngo, Ee-Peng Lim, Qianru Sun:
LLMs-Based Augmentation for Domain Adaptation in Long-Tailed Food Datasets. 282-295
- Jiahua Si, Youze Wang, Wenbo Hu, Qiang Liu, Richang Hong:
Making Strides Security in Multimodal Fake News Detection Models: A Comprehensive Analysis of Adversarial Attacks. 296-309
- Deli Zhu, Zhao Xu, Yunong Yang:
MambaTalk: Speech-Driven 3D Facial Animation with Mamba. 310-323
- Jingdong Wang, Xu Ding, Fanqi Meng:
MC-YOLO: Multi-scale Transmission Line Defect Target Recognition Network. 324-337
- Hao Yan, Jing Bai:
MDT-Net: A Mask Decoder Tuning Strategy for CLIP-Based Zero-Shot 3D Classification. 338-350
- Zepu Yi, Songfeng Lu, Xueming Tang, Jianxin Zhu, Junjun Wu:
MICAN: Multi-modal Inconsistency-Based Cooperation Attention Network for Fake News Detection. 351-363
- Yaling Hao, Wei Wu:
MineTinyNet-YOLO: An Efficient Small Object Detection Method for Complex Underground Coal Mine Scenarios. 364-378
- Xin Lim, Lai-Kuan Wong, Yuen Peng Loh, Ke Gu, Weisi Lin:
Mix-YOLONet: Deep Image Dehazing for Improving Object Detection. 379-393
- Jiahao Zhang, Xiao Zhao, Guangyu Gao:
MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms. 394-407
- Zeyu Cai, Can Zhang, Yuchong Chen, Xunhao Chen, Jiming Yang, Wubin Shi, Feipeng Da, Chengqian Jin:
MLP-AMDC: A MLP Architecture for Adaptive-Mask-Based Dual-Camera Snapshot Hyperspectral Imaging. 408-423
- Junhao Guo, Chenhan Fu, Guoming Wang, Rongxing Lu, Dong Chen, Siliang Tang:
MM-CARP: Multimodal Model with Cross-Modal Retrieval-Augmented and Visual Region Perception. 424-437
- Guohui Ding, Zhonghua Li, Yongqiang Ren:
Modality-Specific Hashing: Transform Cross-Modal Retrieval Into Single-Modal Retrieval. 438-451
