


21st VISAPP 2026: Marbella, Spain - Volume 2
- Antonino Furnari, Petia Radeva:

Proceedings of the 21st International Conference on Computer Vision Theory and Applications, VISAPP 2026, Marbella, Spain, March 9-11, 2026, Volume 2. SCITEPRESS 2026, ISBN 978-989-758-804-4
Invited Speakers
- Elisa Ricci:

Rethinking Multimodal AI Models: Beyond Accuracy, Towards Trust. VISAPP 2026: 3 - Yuki Asano:

Self-Supervised Learning Is Dead, Long Live Self-Supervised Learning (in the Age of MLLMs). VISAPP 2026: 5 - Cees G. M. Snoek:

Seeing, Speaking, and Reasoning in a Visual World. VISAPP 2026: 7
Recognition & Detection
- Navid Aslankhani Khameneh, Michela Farenzena, Marco Carletti:

Toward Automated Bed Safety: Comparative Study of Vision Algorithms for Bed-Rail State Detection. 13-21 - Thorsten Herd, Philipp Heidenreich, Christoph Stiller:

Simplify Your Fusion: Reducing Complexity for Multimodal Sensor Fusion. 22-30 - Georg Siedel, Rojan Regmi, Abhirami Anand, Weijia Shao, Silvia Vock, Andrey Morozov:

Stylized Synthetic Augmentation Further Improves Corruption Robustness. 31-42 - Victor Bercy, Martyna Poreba, Michal Szczepanski, Samia Bouchafa:

G2TM: Single-Module Graph-Guided Token Merging for Efficient Semantic Segmentation. 43-54 - Harinadh Sivaramakrishna Patamsetti, Kamakshya Prasad Nayak, Kamalakar Vijay Thakare, Debi Prosad Dogra, Heeseung Choi, Haesol Park, Ig-Jae Kim:

Group Emotion Recognition Explained! A Human-Centric Approach Using Multi-Modal VGER Dataset and Framework. 55-66 - Alan Klinger Sousa Alves, Bruno Motta de Carvalho:

UMergeNet: Exploring Lightweight Mechanisms for High-Performance Semantic Segmentation. 67-78 - Tony Montes, Fernando Lozano:

ViQAgent: Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation. 79-90 - Alokeparna Choudhury, Sourav Samanta, Sanjoy Pratihar, Oishila Bandyopadhyay:

Multi-Objective Quantum-Inspired Firefly Algorithm for Medical Image Segmentation. 91-102 - Alex Alonso Dalbøl, Christofer Meinecke, Stefan Jänicke:

Human-in-the-Loop Refinement of Zero-Shot Object Detection for Domain-Specific Artwork Datasets. 103-116 - Serin Varghese, Kevin Ross, Fabian Hüger, Kira Maag:

Spatio-Temporal Attention for Consistent Video Semantic Segmentation in Automated Driving. 117-127 - Henry O. Velesaca, Andrea Mero, Rafael E. Rivadeneira, Guillermo A. Castillo, Ángel D. Sappa:

AVNet: Cross-Spectral Attention-Vision Model for Camouflaged Object Detection in Ecological Conservation. 128-137 - Stuart A. Montes, Edward Cayllahua, Rensso Mora Colque:

A Simple and Lightweight Model for Person Re-Identification. 138-148 - Azuma Miura, Hideaki Uchiyama, Masahiro Yamaguchi, Natsuki Kai, Takahiro Shiroshima, Hideo Saito:

Real-Time 6DoF Pallet Pose Estimation with Monocular Metric Depth. 149-161 - Evangelos G. Sartinas, Dimitrios I. Kosmopoulos, Emmanouil Z. Psarakis, Kostas Blekos, Bikram Kumar De, Vangelis Metsis:

Geometric Skeletal Distance Learning for Self-Supervised Sign Language Recognition. 162-173 - Afshin Dini, Farnaz Delirie, Esa Rahtu:

HyperNut: Hyper Spectral Dataset of Nuts for Unsupervised Defect Detection and Segmentation. 177-184 - Emmanuel Morales-García, Saúl Domínguez-Isidro, José Rafael Rojano-Cáceres, Cecilia Cruz-López:

Trade-Offs of Contrast Enhancement and Denoising for Low-Resolution Images: An Empirical Study on DIV2K. 185-192 - Mariana L. Teixeira, Hugo S. Oliveira, Raquel L. Monteiro, Daniela Ferreira Santos, Tânia Pereira, Raphaël F. Canadas, Hélder P. Oliveira:

Robust Cell Segmentation in Urine Cytology Images for Bladder Cancer Diagnosis. 193-200 - Perry op't Landt, Ulrich Krispel, Torsten Ullrich:

Domain-Specific Synthetic Data Generation for Person Re-Identification in Public Transport. 201-209 - Chenchang Liu, Svetlana Ionova, Patrick Mäder, Marco Seeland:

Order-Level Arthropod Detection Using Deep Learning: Addressing Scale Variability through Synthetic Data. 210-217 - Junyu Xiao, Shohei Yokoyama:

Enhancing Temporal Stability in Small Object Detection: A Post-Processing Approach for YOLOv8. 218-225 - Christian Matos Rivera, Rémi Mégret, David Flores:

A Parametric 3D Bee Model for Scalable Synthetic Data Generation in Animal Behavior Studies. 226-233 - Thibault Geoffroy, Gauthier Gerspacher, Lionel Prevost:

High Semantic Features for the Continual Learning of Complex Emotions: A Lightweight Solution. 234-242 - James Rainey, John Wannan, Douglas MacLachlan, Boguslaw Obara, Deepayan Bhowmik:

INFORM: A Food Monitoring and Tracking System for Sustainable Healthcare Facilities. 243-250 - Ugur Can Karaca, Toygar Akgün:

An Automated Deep Learning Pipeline for Panel Segmentation, Defect Detection, and 3D Reconstruction in Impact Test Analysis. 251-258 - Yuji Makimura, Daichi Hayashi, Teppei Kobiki, Shouki Sakemoto, Masashi Nishiyama:

Estimating Person Positions Using a Camera and Wireless Devices in a Space with Temporary Shielding. 259-266 - Julia Frick, Patrick Laufer, Roman Seidel, Gangolf Hirtz:

Synthetic Data in the Context of Automotive In-Cabin: A Review. 267-275 - Chafic Abou Akar, Hadi Koubeissy, Joe Khalil, Jimmy Tekli, Marc Kamradt, Abdallah Makhoul:

Rendered vs AI-Generated vs Real Data: Training Industrial Object Detection Models with Multi-Source Data. 276-287 - Sarina Penquitt, Jonathan Klees, Rinor Cakaj, Daniel Kondermann, Matthias Rottmann, Lars Schmarje:

From Label Error Detection to Correction: A Modular Framework and Benchmark for Object Detection Datasets. 288-299 - Stefan Geyer, Vicky Kalogeiton, Alina Roitberg:

When Surgery Meets the Unknown: Uncertainty-Aware Open-Set Recognition for Surgery Phase Classification. 300-307 - Leonardo Adriano Vasconcelos de Oliveira, José Daniel de Alencar Santos, Francisco Nélio Costa Freitas, Pedro Pedrosa Rebouças Filho, Hamilton Ferreira Gomes de Abreu, Luis Flávio Gaspar Herculano:

Microstructural Classification of Non-Oriented Electrical Steels: A Machine Learning and Computer Vision-Based Approach. 308-315 - Gabriel Souto Ferrante, Priscila Tiemi Maeda Saito:

ReAL-YOLO: Reinforcement-Driven Active Learning for Training YOLO Object Detectors. 316-323 - Paulo Henrique Bueno Lopes, Leandro A. F. Fernandes:

ASRDNet: A New Image-Segmentation Neural Network Model for Detecting Asian Rust on Soybean Leaves. 324-331 - Brian Luís Coimbra Maia, Lucas Silva Santana, Gabriel Rezende da Silva, Mathews Edwirds Gomes Almeida, Paulo Victor de Magalhães Rozatto, Luiz Maurílio da Silva Maciel, Saulo Moraes Villela, Marcelo Bernardes Vieira, Bruno Campos de Carvalho:

Class-Based Adaptive Training for Facial Expression Recognition Using a Deep Convolutional Network. 332-340 - Simone Bello Kaminski Aires, Maria Eduarda Guedes Pinto Gianisella, Simone Nasser Matos, Marcos Vinicius Santos Passos, Thales Janisch Santos:

Computer Vision-Assisted Literacy: Recognizing Children's Handwritten Words. 341-350 - Amal Ben Amor, Sonia Mosbah, Jihene Hlel:

Palm Trees Meets Deep Learning: A Real-Time Detection of Boufaroua with YOLOv8n and Embedded Optics. 351-356 - Zehua Liu, Tilo Burghardt:

Long-Tailed Species Recognition in the NACTI Wildlife Dataset. 357-365 - Luiz Fernando Merli de Oliveira Sementille, Marcos Cleison Silva Santana, Douglas Rodrigues, Danilo Samuel Jodas, Kelton Augusto Pontara da Costa, Fabio Luiz de Oliveira, João Paulo Papa:

Retail Shelf Monitoring Using Deep Hough Transform and Object Detection. 366-373 - Josep S. Sánchez, Jose Luis Lisani, Antoni Bennasar Garau:

Improving Underwater Fish Detection and Tracking via Temporal Modeling and Deformable Convolutions. 374-381 - Nassim Kaddouri, Afifa Dahmane, Hadja Faiza Khellaf-Haned:

Hepatic Vessel Segmentation: A Narrative Review. 382-389 - Hinako Mitsuoka, Kazuhiro Hotta:

Hazard-Aware Duration Prior and F1-Adaptive Loss for Temporal Action Segmentation. 390-397 - Warley Barbosa, Tiago F. Vieira:

Kinship Verification with Custom Sampling and Hard Contrastive Loss. 398-405 - Alp Eren Gençoglu, Hazim Kemal Ekenel:

Improved MambdaBDA Framework for Robust Building Damage Assessment across Disaster Domains. 406-414 - Taigo Sakai, Takeshi Nakamura, Hiroshi Shimizu, Kazuhiro Hotta:

Visibility-Gated ConvGRU for Robust Multi-View Pedestrian Detection and Tracking in BEV Space. 415-422 - Satoshi Kamiya, Kota Yamashita, Kazuhiro Hotta:

ASO PatchCore: Memory-Efficient and Fast Anomaly Detection via Automatic Sampling Optimization. 423-430 - Mathijs Lens, Toon Goedemé:

Gabor-Guided Tiny Bird Detection for Drone-Based Agricultural Bird Deterrence. 431-438 - Uros Petkovic, Jonas Frenkel, Rebecca Lazarides, Olaf Hellwich:

Modeling Student Engagement in the Wild: Analysis from Classroom Video. 439-446 - Carlos Pizarroso, Zuzana Berger Haladová, Zuzana Cerneková, Viktor Kocur:

Billboard in Focus: Estimating Driver Gaze Duration from a Single Image. 447-453 - Douglas Costa Braga, Albert de Jesus Souza, Samuel Bastos Borges Pinho, Daniel Oliveira Dantas:

Classification of Normal versus Leukemic Cells with Swin Transformer and Balanced Data Augmentation. 454-460 - Alessia Micieli, Giovanni Maria Farinella, Francesco Ragusa:

SignIT: A Comprehensive Dataset and Multimodal Analysis for Italian Sign Language Recognition. 461-468 - Marcin Maciag, Grzegorz Sarwas:

Adversarial Robustness of Proxy-Based Metric Learning Models. 469-476 - Qingyu Wang, Xingzhen Song, Chungsheng Chang, Feiyu Ge, Eisuke Sato, Masato Tsukada:

EventAction: Vision Mamba-Based Event-Driven Action Recognition. 477-484 - Elisa Gonçalves Ribeiro, Rodrigo Moreira, Larissa Ferreira Rodrigues Moreira, André Ricardo Backes:

Generalizable Hyperparameter Optimization for Federated Learning on Non-IID Cancer Images. 485-492 - Lei Lei, Marc Lalonde, Hamed Ghodrati, Azur Handan:

An Ensemble Approach to Climate Misinformation Detection. 493-499 - Ruslan Zaripov, Anastasiya Shpileva, Maksim Koltakov, Georgii Petrov, Viacheslav Shalamov, Valeria Efimova:

Efficient Transformer-Based Spatio-Temporal Action Recognition for Industrial Safety Surveillance. 500-508 - Damian Kmiecik, Adrian Dziembowski:

A Deterministic Edge-AI System for Early Wildfire Smoke Detection: From Lightweight Neural Models to Operationally Reliable Surveillance. 509-517 - Charis Hanna, Kasim Terzic, Mark James, Karen Spencer:

IoMB: Evaluating Object Detectors on Occluded and Imbalanced Seabird Populations. 518-525 - Yuma Kokubu, Tomokazu Ishikawa:

Comparative Evaluation of Vision Transformer Architectures for Video-Based Weather Intensity Recognition. 526-533
Low-Level Vision & Computational Imaging
- Mohamed Awad, Mahmoud Mohamed, Walid Gomaa:

Defense that Attacks: How Robust Models Become Better Attackers. 539-549 - Ryota Maeda, Naoki Arikawa, Yutaka No, Shinsaku Hiura:

End-to-End Optimization of Polarimetric Measurement and Material Classifier. 550-559 - Om Chatterjee, Prachi Das, Amiya Kumar Bhowmik, Avighyan Chakraborty, Jacob Tauro, Sanjoy Pratihar:

Training-Free Plant Leaf Disease Severity Estimation Using Fuzzy C-Means Clustering and Reference Palette Validation. 560-571 - Seeha Lee, Dongyoung Choi, Min H. Kim:

Diffusion-Based HDR Reconstruction from Mosaiced Exposure Images. 572-583 - Him Kafle, Amit Banerjee:

Solving Large Square Jigsaw Puzzles with Quasi-Linear Candidate Selection Using DC-L1 Norm. 584-594 - José Ribamar Durand Rodrigues Júnior, Paulo Ivson Netto Santos, Aristófanes Corrêa Silva, Francisco de Assis Silva Neto, Carlos Rodriguez Suarez, Deane Roehl:

Horizontally-Aware Deep Network for Seismic Multiple Attenuation. 595-605 - Yuma Nakai, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi:

Adaptive IG-ODAM: Efficient Attribution Maps for Object Detection via Spatial-Guided Sampling. 606-614 - Ricardo A. Acuña-Villogas, Germain García-Zanabria, Cristian Lopez, Rensso Mora Colque:

Efficient Multi-Temporal Building Change Detection with Reduced-Rank Linear Attention. 615-624 - Martin Rydlo, Max Verbers, Carlos Vega, Raquel León, Himar Fabelo, Gustav Burström, Alfonso Lagares, Jesús Morera, Gustavo M. Callicó, Francesca Manni, Svitlana Zinger:

Quality of Brain Tumour Detection in Hyperspectral Imaging Based on Ground Truth Representation. 627-634 - Wenjuan Zhou, Haibin Cai, Baihua Li, Qinggang Meng:

GazeUnconstrained: A Multimodal Dataset for Visual Attention and Gaze Estimation in Natural Video Viewing. 635-643 - Cristian Valero-Abundio, Emilio Sansano-Sansano, Raúl Montoliu, Marina Martínez-García:

Computing a Characteristic Orientation for Rotation-Independent Image Analysis. 644-651 - Tarek Zenati:

Supervised Deep Feature-Based Industrial Defect Detection in Optical Lenses with Minimal Data. 652-660 - Simone Melcarne, Jean-Luc Dugelay:

Fusing Thermal and Event Data for Visible Spectrum Image Reconstruction. 661-669 - Atashi Saha, Sanjoy Pratihar:

Centrality-Driven Prufer-Code Based Graph Encoding to Capture Structural Relationships. 670-677 - Paulina Morillo, Christopher Morales, Christian Castro, Diego Vallejo-Huanga:

Similarity Analysis of AI-Generated Images from Spanish and English Prompts via Feature Extraction. 678-686 - Sibani Panigrahi, Debi Prosad Dogra, Jothi Ramalingam:

Improving Video Object Detection Performance in Rainy Conditions Using Edge Preserving Image Deraining. 687-694 - Ryosuke Komatsu, Masaya Oishi, Takafumi Iwaguchi, Hiroshi Kawasaki, Hiroyuki Kubo:

Depth Estimation in Scattering Media due to Synchronization Delay in Projector Camera Systems. 695-702 - Pin-Yuan Yang, Shih-Chieh Chang:

Event Camera Deraining Using Transformer Networks with Time-Space Attention. 703-714 - Ben Hamscher, Arnold Brosch, Nicolas Binninger, Maksymilian Jan Dejna, Kira Maag:

Dance Style Classification Using Laban-Inspired and Frequency-Domain Motion Features. 715-723 - Artur Santos Nascimento, Daniel Oliveira Dantas, Valter Guilherme Silva de Souza, Beatriz Trinchão Andrade:

CP HDR: A Feature Point Detection and Description Library for LDR and HDR Images. 724-734 - Yoichi Furukawa, Takahiro Maruyama, Kazuhiro Hotta:

Self-Supervised Real-World Image Denoising with Noise-Level-Aware Dynamic Receptive Fields. 735-742 - Nicolai Skutsch, Olaf Hellwich, Frank Fuchs-Kittowski:

No Mountain, no Building, no Cue? Synthetic Data Generation of Digital Surface Models and Their Application to Visual Geo-Localization. 743-751 - Sofia Dorogova, Amir Shamsutdinov, Aleksei Khalin, Egor I. Ershov:

Video Denoising Still Needs Quality Benchmarks. 752-759 - Naoki Hashimoto, Ian Gomasaki, Ryosuke Saga:

Edge Bundling with Divergence and Convergence. 760-767 - Kelly Abreu, Aclecio Costa, Dibio Borges:

Few-Shot Data Sampling Strategies for Insect Pest Maturity Classification with YOLO Family Models. 768-775 - Ryohei Ohmori, Michitaka Yoshida, Ryo Kawahara, Takahiro Okabe:

Compensating Light Source Defocus for Per-Pixel Surface Roughness Measurement. 776-785 - T. Ashlesha, Vignesh, S. Varun, Vats Shubhangi, Narayan Surabhi:

Unveiling AI-Manipulated Medical Images: Detecting and Localizing Tampered Areas. 786-793 - Guillermo A. Castillo, Xavier Soria, Ángel D. Sappa:

SynShapes: A Synthetic Image-Annotation Dataset for Edge Detection. 794-802 - Lorenzo C. Maia, Priscila T. M. Saito, Pedro Henrique Bugatti:

Pixel-Driven Image Representation through Rule-Based Cellular Automata Dynamics. 803-810















