PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep ...
Semantics-Guided Intra-Category Knowledge Transfer for Generalized Zero-Shot Learning
Zero-shot learning (ZSL) requires one to associate visual and semantic information observed from data of seen classes, so that test data of unseen classes can be recognized based on the described semantic representation. Aiming at synthesizing ...
SMG: A Micro-gesture Dataset Towards Spontaneous Body Gestures for Emotional Stress State Analysis
We explore using body gestures for hidden emotional state analysis. As an important non-verbal communicative fashion, human body gestures are capable of conveying emotional information during social communication. In previous works, efforts have ...
Context-Driven Detection of Invertebrate Species in Deep-Sea Video
Each year, underwater remotely operated vehicles (ROVs) collect thousands of hours of video of unexplored ocean habitats revealing a plethora of information regarding biodiversity on Earth. However, fully utilizing this information remains a ...
Improved 3D Markerless Mouse Pose Estimation Using Temporal Semi-supervision
Three-dimensional markerless pose estimation from multi-view video is emerging as an exciting method for quantifying the behavior of freely moving animals. Nevertheless, scientifically precise 3D animal pose estimation remains challenging, ...
Semi-supervised Visual Tracking of Marine Animals Using Autonomous Underwater Vehicles
In-situ visual observations of marine organisms are crucial to developing behavioural understandings and their relations to their surrounding ecosystem. Typically, these observations are collected via divers, tags, and remotely-operated or human-...
A Minimal Solution for Image-Based Sphere Estimation
We propose a novel minimal solver for sphere fitting via its 2D central projection, i.e., a special ellipse. The input of the presented algorithm consists of contour points detected in a camera image. General ellipse fitting problems require five ...
Refractive Pose Refinement: Generalising the Geometric Relation between Camera and Refractive Interface
In this paper, we investigate absolute and relative pose estimation under refraction, which are essential problems for refractive structure from motion. To cope with refraction effects, we first formulate geometric constraints for establishing ...
Deep Memory-Augmented Proximal Unrolling Network for Compressive Sensing
Mapping a truncated optimization method into a deep neural network, deep proximal unrolling network has attracted attention in compressive sensing due to its good interpretability and high performance. Each stage in such networks corresponds to ...
Through Hawks’ Eyes: Synthetically Reconstructing the Visual Field of a Bird in Flight
Birds of prey rely on vision to execute flight manoeuvres that are key to their survival, such as intercepting fast-moving targets or navigating through clutter. A better understanding of the role played by vision during these manoeuvres is not ...
Multi-view Tracking, Re-ID, and Social Network Analysis of a Flock of Visually Similar Birds in an Outdoor Aviary
The ability to capture detailed interactions among individuals in a social group is foundational to our study of animal behavior and neuroscience. Recent advances in deep learning and computer vision are driving rapid progress in methods that can ...
Learning Enriched Hop-Aware Correlation for Robust 3D Human Pose Estimation
Graph convolution network (GCN)-based methods for 3D human pose estimation usually aggregate immediate features of single-hop nodes, which are unaware of the correlation of multi-hop nodes and therefore neglect long-range dependency for ...
RELAX: Representation Learning Explainability
- Kristoffer K. Wickstrøm
- Daniel J. Trosten
- Sigurd Løkse
- Ahcène Boubekki
- Karl Øyvind Mikalsen
- Michael C. Kampffmeyer
- Robert Jenssen
Despite the significant improvements that self-supervised representation learning has led to when learning from unlabeled data, no methods have been developed that explain what influences the learned representation. We address this need through ...