- research-article, January 2024
Deep Boosting Learning: A Brand-New Cooperative Approach for Image-Text Matching
IEEE Transactions on Image Processing (TIP), Volume 33, Pages 3341–3352, https://doi.org/10.1109/TIP.2024.3396063
Image-text matching remains a challenging task due to heterogeneous semantic diversity across modalities and insufficient distance separability within triplets. Different from previous approaches focusing on enhancing multi-modal representations or ...
- research-article, July 2023
High-Performance Transformer Tracking
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 7, Pages 8507–8523, https://doi.org/10.1109/TPAMI.2022.3232535
Correlation has a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple fusion method that considers the similarity between the template and the search region. However, the ...
- research-article, April 2023
Transformer vision-language tracking via proxy token guided cross-modal fusion
Pattern Recognition Letters (PTRL), Volume 168, Issue C, Pages 10–16, https://doi.org/10.1016/j.patrec.2023.02.023
Highlights:
  - Proxy token guided transformer-based baseline for vision-language tracking.
  - Dense annotated long-term vision-language tracking dataset.
  - Extensive experiments on a new long-term vision-language tracking dataset.
Tracking by vision-language is an emergent topic. Previous researchers mainly adopt CNN and sequential models for video and language encoding, however, their methods are limited by poor generalization performance. To address this problem, this ...
- research-article, January 2023
Plug-and-Play Regulators for Image-Text Matching
IEEE Transactions on Image Processing (TIP), Volume 32, Pages 2322–2334, https://doi.org/10.1109/TIP.2023.3266887
Exploiting fine-grained correspondence and visual-semantic alignments has shown great potential in image-text matching. Generally, recent approaches first employ a cross-modal attention unit to capture latent region-word interactions, and then integrate ...
- research-article, October 2021
Weakly-Supervised Temporal Action Localization via Cross-Stream Collaborative Learning
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, Pages 853–861, https://doi.org/10.1145/3474085.3475261
Weakly supervised temporal action localization (WTAL) is a challenging task as only video-level category labels are available during training stage. Without precise temporal annotations, most approaches rely on complementary RGB and optical flow ...
- Article, August 2020
CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss
This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps. In order to find an appropriate embedding space, we ...
- research-article, September 2017
Ranking Saliency
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 39, Issue 9, Pages 1892–1904, https://doi.org/10.1109/TPAMI.2016.2609426
Most existing bottom-up algorithms measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead ...
- research-article, April 2017
Salient Object Detection via Multiple Instance Learning
IEEE Transactions on Image Processing (TIP), Volume 26, Issue 4, Pages 1911–1922, https://doi.org/10.1109/TIP.2017.2669878
Object proposals are a series of candidate segments containing the objects of interest, which are taken as preprocessing and widely applied in various vision tasks. However, most of existing saliency approaches only utilizes the proposals to compute a ...
- research-article, January 2017
Co-Bootstrapping Saliency
IEEE Transactions on Image Processing (TIP), Volume 26, Issue 1, Pages 414–425, https://doi.org/10.1109/TIP.2016.2627804
In this paper, we propose a visual saliency detection algorithm to explore the fusion of various saliency models in a manner of bootstrap learning. First, an original bootstrapping model, which combines both weak and strong saliency models, is ...
- research-article, November 2016
Video anomaly detection based on locality sensitive hashing filters
Pattern Recognition (PATT), Volume 59, Issue C, Pages 302–311, https://doi.org/10.1016/j.patcog.2015.11.018
In this paper, we propose a novel anomaly detection approach based on Locality Sensitive Hashing Filters (LSHF), which hashes normal activities into multiple feature buckets with Locality Sensitive Hashing (LSH) functions to filter out abnormal ...
- research-article, October 2016
Query Adaptive Instance Search using Object Sketches
MM '16: Proceedings of the 24th ACM international conference on Multimedia, Pages 1306–1315, https://doi.org/10.1145/2964284.2964317
Sketch-based object search is a challenging problem mainly due to two difficulties: (1) how to match the binary sketch query with the colorful image, and (2) how to locate the small object in a big image with the sketch query. To address the above ...
- research-article, March 2016
Combining motion and appearance cues for anomaly detection
Pattern Recognition (PATT), Volume 51, Issue C, Pages 443–452, https://doi.org/10.1016/j.patcog.2015.09.005
In this paper, we present a novel anomaly detection framework which integrates motion and appearance cues to detect abnormal objects and behaviors in video. For motion anomaly detection, we employ statistical histograms to model the normal motion ...
- research-article, February 2016
Sketch retrieval via local dense stroke features
Image and Vision Computing (IAVC), Volume 46, Issue C, Pages 64–73, https://doi.org/10.1016/j.imavis.2015.11.007
Sketch retrieval aims at retrieving the most similar sketches from a large database based on one hand-drawn query. Successful retrieval hinges on an effective representation of sketch images and an efficient search method. In this paper, we propose a ...
- research-article, October 2015
Salient object detection via global and local cues
Pattern Recognition (PATT), Volume 48, Issue 10, Pages 3258–3267, https://doi.org/10.1016/j.patcog.2014.12.005
Previous saliency detection algorithms used to focus on low level features directly or utilize a bunch of sample images and manually labeled ground truth to train a high level learning model. In this paper, we propose a novel coding-based saliency ...
- Article, December 2013
Saliency Detection via Dense and Sparse Reconstruction
ICCV '13: Proceedings of the 2013 IEEE International Conference on Computer Vision, Pages 2976–2983, https://doi.org/10.1109/ICCV.2013.370
In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via super pixels as likely cues for background templates, from which dense and sparse appearance ...
- Article, November 2013
Living without Menu Bar: A Shape Retrieval Based Word Editor
ACPR '13: Proceedings of the 2013 2nd IAPR Asian Conference on Pattern Recognition, Page 746, https://doi.org/10.1109/ACPR.2013.194
Shape descriptor plays very important role in shape retrieval system especially in the case of input shapes are drawn by hand. A good descriptor should be not only deformation tolerant but also compact and less memory consuming. With this in mind, we ...
- Article, June 2013
Saliency Detection via Graph-Based Manifold Ranking
CVPR '13: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Pages 3166–3173, https://doi.org/10.1109/CVPR.2013.407
Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead ...
- article, November 2012
Human body segmentation based on deformable models and two-scale superpixel
Pattern Analysis & Applications (PAAS), Volume 15, Issue 4, Pages 399–413, https://doi.org/10.1007/s10044-011-0220-3
In this paper, we propose a novel method to segment human body in static images by graph cuts based on two deformable models at two-scale superpixel. In our study, body segmentation is decomposed into torso detection and lower body recovery. Based on ...
- research-article, March 2012
Multilinear Supervised Neighborhood Embedding of a Local Descriptor Tensor for Scene/Object Recognition
IEEE Transactions on Image Processing (TIP), Volume 21, Issue 3, Pages 1314–1326, https://doi.org/10.1109/TIP.2011.2168417
In this paper, we propose to represent an image as a local descriptor tensor and use a multilinear supervised neighborhood embedding (MSNE) for discriminant feature extraction, which is able to be used for subject or scene recognition. The contributions ...