29th MMM 2023: Bergen, Norway - Part I
- Duc-Tien Dang-Nguyen, Cathal Gurrin, Martha A. Larson, Alan F. Smeaton, Stevan Rudinac, Minh-Son Dao, Christoph Trattner, Phoebe Chen:
MultiMedia Modeling - 29th International Conference, MMM 2023, Bergen, Norway, January 9-12, 2023, Proceedings, Part I. Lecture Notes in Computer Science 13833, Springer 2023, ISBN 978-3-031-27076-5
Detection, Recognition and Identification
- Ziyan Liao, Dening Di, Jingsong Hao, Jiang Zhang, Shulei Zhu, Jun Yin:
MMM-GCN: Multi-Level Multi-Modal Graph Convolution Network for Video-Based Person Identification. 3-15
- Chong-Jian Zhang, Song-Lu Chen, Qi Liu, Zhi-Yong Huang, Feng Chen, Xu-Cheng Yin:
Feature Enhancement and Reconstruction for Small Object Detection. 16-27
- Zhiyong Zhou, Yuanning Liu, Xiaodong Zhu, Shuai Liu, Shaoqiang Zhang, Zhen Liu:
Toward More Accurate Heterogeneous Iris Recognition with Transformers and Capsules. 28-40
- Xiaotian Wang, Letian Zhao, Wei Wu, Xi Jin:
MCANet: Multiscale Cross-Modality Attention Network for Multispectral Pedestrian Detection. 41-53
Human Action Understanding
- Yibo Hu, Chenyu Cao, Fangtao Li, Chenghao Yan, Jinsheng Qi, Bin Wu:
Overall-Distinctive GCN for Social Relation Recognition on Videos. 57-68
- Haoran Ren, Hao Ren, Hong Lu, Cheng Jin:
Weakly-Supervised Temporal Action Localization with Regional Similarity Consistency. 69-81
- Yanrui Niu, Jingyao Yang, Chao Liang, Baojin Huang, Zhongyuan Wang:
A Spatio-Temporal Identity Verification Method for Person-Action Instance Search in Movies. 82-94
- Hongfeng Han, Zhiwu Lu, Ji-Rong Wen:
Binary Neural Network for Video Action Recognition. 95-106
Image Quality Assessment and Enhancement
- Bowen Wan, Daming Shi, Yukun Liu:
STN: Stochastic Triplet Neighboring Approach to Self-supervised Denoising from Limited Noisy Images. 109-120
- Haodian Wang, Yang Wang, Yang Cao, Zheng-Jun Zha:
Fusion-Based Low-Light Image Enhancement. 121-133
- Ailin Li, Lei Zhao, Zhiwen Zuo, Zhizhong Wang, Wei Xing, Dongming Lu:
Towards Interactive Facial Image Inpainting by Text or Exemplar Image. 134-148
- Yihua Chen, Zhiyuan Chen, Mengzhu Yu, Zhenjun Tang:
Dual-Feature Aggregation Network for No-Reference Image Quality Assessment. 149-161
Multimedia Analytics Application
- Jiaying Lan, Lianglun Cheng, Guoheng Huang, Chi-Man Pun, Xiaochen Yuan, Shangyu Lai, Hongrui Liu, Wing-Kuen Ling:
Single Cross-domain Semantic Guidance Network for Multimodal Unsupervised Image Translation. 165-177
- Itthisak Phueaksri, Marc A. Kastner, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide:
Towards Captioning an Image Collection from a Combined Scene Graph Representation Approach. 178-190
- Jianghai Wang, Menghao Hu, Yaguang Song, Xiaoshan Yang:
Health-Oriented Multimodal Food Question Answering. 191-203
- Golsa Tahmasebzadeh, Eric Müller-Budack, Sherzod Hakimov, Ralph Ewerth:
MM-Locate-News: Multimodal Focus Location Estimation in News. 204-216
Multimedia Content Generation
- Jiyun Li, Yuan Gao, Chen Qian, Jiachen Lu, Zhongqin Chen:
C-GZS: Controllable Person Image Synthesis Based on Group-Supervised Zero-Shot Learning. 219-230
- Fan Zhang, Naye Ji, Fuxing Gao, Yongping Li:
DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model. 231-242
- Dongjin Huang, Yue Zhang, Zhenyan Li, Jinhua Liu:
TG-Dance: TransGAN-Based Intelligent Dance Generation with Music. 243-254
- Zi Chai, Xiaojun Wan, Soyeon Caren Han, Josiah Poon:
Visual Question Generation Under Multi-granularity Cross-Modal Interaction. 255-266
Multimodal and Multidimensional Imaging Application
- Haoyi Xiu, Xin Liu, Weimin Wang, Kyoung-Sook Kim, Takayuki Shinohara, Qiong Chang, Masashi Matsuoka:
Optimizing Local Feature Representations of 3D Point Clouds with Anisotropic Edge Modeling. 269-281
- Tao Wen, Chao Liang, You-Ming Fu, Chun-Xia Xiao, Hai-Ming Xiang:
Floor Plan Analysis and Vectorization with Multimodal Information. 282-293
- Pengwei Tang, Huayi Tang, Wei Wang, Yong Liu:
Safe Contrastive Clustering. 294-305
- Shufan Dai, Yangjie Cao, Pengsong Duan, Xianfu Chen:
SRes-NeRF: Improved Neural Radiance Fields for Realism and Accuracy of Specular Reflections. 306-317
Real-Time and Interactive Application
- Zhi-Yong Huang, Song-Lu Chen, Qi Liu, Chong-Jian Zhang, Feng Chen, Xu-Cheng Yin:
LiteHandNet: A Lightweight Hand Pose Estimation Network via Structural Feature Enhancement. 321-333
- Nikhil Kumar Tomar, Debesh Jha, Ulas Bagci:
DilatedSegNet: A Deep Dilated Segmentation Network for Polyp Segmentation. 334-344
- Hsin-Hung Chen, Alexander Lerch:
Music Instrument Classification Reprogrammed. 345-357
- Mingqi Chen, Shaodong Li, Feng Shuang, Kai Luo:
Cascading CNNs with S-DQN: A Parameter-Parsimonious Strategy for 3D Hand Pose Estimation. 358-369
ICDAR: Intelligent Cross-Data Analysis and Retrieval
- Yuzhe Hao, Kuniaki Uto, Asako Kanezaki, Ikuro Sato, Rei Kawakami, Koichi Shinoda:
EvIs-Kitchen: Egocentric Human Activities Recognition with Video and Inertial Sensor Data. 373-384
- Longlong Zhou, Xiaojun Wu, Tianyang Xu:
COMIM-GAN: Improved Text-to-Image Generation via Condition Optimization and Mutual Information Maximization. 385-396
- Jakub Lokoc, Ladislav Peska:
A Study of a Cross-modal Interactive Search Tool Using CLIP and Temporal Fusion. 397-408
- Dinh-Duy Pham, Minh-Son Dao, Thanh-Binh Nguyen:
A Cross-modal Attention Model for Fine-Grained Incident Retrieval from Dashcam Videos. 409-420
- Mingliang Liang, Zhuoran Liu, Martha A. Larson:
Textual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching. 421-433
- Alireza Hossein Zadeh Nik, Michael A. Riegler, Pål Halvorsen, Andrea M. Storås:
Generation of Synthetic Tabular Healthcare Data Using Generative Adversarial Networks. 434-446
- Quoc-Cuong Le, Minh-Quan Le, Mai-Khiem Tran, Ngoc-Quyen Le, Minh-Triet Tran:
FL-Former: Flood Level Estimation with Vision Transformer for Images from Cameras in Urban Areas. 447-459
MDRE: Multimedia Datasets for Repeatable Experimentation
- Tsung-Han Ho, Chen-Yin Yu, Tsai-Yen Ko, Wei-Ta Chu:
The NCKU-VTF Dataset and a Multi-scale Thermal-to-Visible Face Synthesis System. 463-475
- Viktor Lakic, Luca Rossetto, Abraham Bernstein:
Link-Rot in Web-Sourced Multimedia Datasets. 476-488
- Werner Bailer, Hannes Fassold:
People@Places and ToDY: Two Datasets for Scene Classification in Media Production and Archiving. 489-501
- Michael A. Riegler, Vajira Thambawita, Ayan Chatterjee, Binh T. Nguyen, Steven Alexander Hicks, Vibeke Telle-Hansen, Svein Arne Pettersen, Dag Johansen, Ramesh C. Jain, Pål Halvorsen:
ScopeSense: An 8.5-Month Sport, Nutrition, and Lifestyle Lifelogging Dataset. 502-514
- Yuan Lin, Zhaoqi Chu, Jari Korhonen, Jiayi Xu, Xiangrong Liu, Juan Liu, Min Liu, Lvping Fang, Weidi Yang, Debasish Ghose, Junyong You:
Fast Accurate Fish Recognition with Deep Learning Based on a Domain-Specific Large-Scale Fish Dataset. 515-526
- Maarten Sukel, Stevan Rudinac, Marcel Worring:
GIGO, Garbage In, Garbage Out: An Urban Garbage Classification Dataset. 527-538
- Quang-Trung Truong, Tuan-Anh Vu, Tan-Sang Ha, Jakub Lokoc, Yue Him Tim Wong, Ajay Joneja, Sai-Kit Yeung:
Marine Video Kit: A New Marine Video Dataset for Content-Based Analysis and Retrieval. 539-550
SNL: Sport and Nutrition Lifelogging
- Tor-Arne S. Nordmo, Michael A. Riegler, Håvard D. Johansen, Dag Johansen:
Arctic HARE: A Machine Learning-Based System for Performance Analysis of Cross-Country Skiers. 553-564
- Matthias Boeker, Cise Midoglu:
Soccer Athlete Data Visualization and Analysis with an Interactive Dashboard. 565-576
- Bjørn Aslak Juliussen, Jon Petter Rui, Dag Johansen:
Sport and Nutrition Digital Analysis: A Legal Assessment. 577-588
- Nitish Nagesh, Iman Azimi, Tom Andriola, Amir M. Rahmani, Ramesh C. Jain:
Towards Deep Personal Lifestyle Models Using Multimodal N-of-1 Data. 589-600
- Aakash Sharma, Katja Pauline Czerwinska, Dag Johansen, Håvard D. Johansen:
Capturing Nutrition Data for Sports: Challenges and Ethical Issues. 601-612
VBS: Video Browser Showdown
- Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo:
VISIONE at Video Browser Showdown 2023. 615-621
- Florian Spiess, Silvan Heller, Luca Rossetto, Loris Sauter, Philipp Weber, Heiko Schuldt:
Traceable Asynchronous Workflows in Video Retrieval with vitrivr-VR. 622-627
- Jakub Lokoc, Zuzana Vopálková, Patrik Dokoupil, Ladislav Peska:
Video Search with CLIP and Interactive Text Query Reformulation. 628-633
- Sebastian Lubos, Massimiliano Rubino, Christian Tautschnig, Markus Tautschnig, Boda Wen, Klaus Schoeffmann, Alexander Felfernig:
Perfect Match in Video Retrieval. 634-639
- Weixi Song, Jiangshan He, Xinghan Li, Shiwei Feng, Chao Liang:
QIVISE: A Quantum-Inspired Interactive Video Search Engine in VBS2023. 640-645
- Loris Sauter, Ralph Gasser, Silvan Heller, Luca Rossetto, Colin Saladin, Florian Spiess, Heiko Schuldt:
Exploring Effective Interactive Text-Based Video Search in vitrivr. 646-651
- Nhat Hoang-Xuan, E-Ro Nguyen, Thang-Long Nguyen-Ho, Minh-Khoi Pham, Quang-Thuc Nguyen, Hoang-Phuc Trang-Trung, Van-Tu Ninh, Tu-Khiem Le, Cathal Gurrin, Minh-Triet Tran:
V-FIRST 2.0: Video Event Retrieval with Flexible Textual-Visual Intermediary for VBS 2023. 652-657
- Nick Pantelidis, Stelios Andreadis, Maria Pegia, Anastasia Moumtzidou, Damianos Galanopoulos, Konstantinos Apostolidis, Despoina Touska, Konstantinos Gkountakos, Ilias Gialampoukidis, Stefanos Vrochidis, Vasileios Mezaris, Ioannis Kompatsiaris:
VERGE in VBS 2023. 658-664
- Konstantin Schall, Nico Hezel, Klaus Jung, Kai Uwe Barthel:
Vibro: Video Browsing with Semantic and Visual Image Embeddings. 665-670
- Thao-Nhu Nguyen, Bunyarit Puangthamawathanakun, Annalina Caputo, Graham Healy, Binh T. Nguyen, Chonlameth Arpnikanondt, Cathal Gurrin:
VideoCLIP: An Interactive CLIP-based Video Retrieval System at VBS2023. 671-677
- Rahel Arnold, Loris Sauter, Heiko Schuldt:
Free-Form Multi-Modal Multimedia Retrieval (4MR). 678-683
- Klaus Schoeffmann, Daniela Stefanics, Andreas Leibetseder:
diveXplore at the Video Browser Showdown 2023. 684-689
- Zhixin Ma, Jiaxin Wu, Weixiong Loo, Chong-Wah Ngo:
Reinforcement Learning Enhanced PicHunter for Interactive Search. 690-696