Author: Ruan, Xiang : Search

26 Results for: Author: Ruan, Xiang

Searched The ACM Guide to Computing Literature (3,855,380 records)

Showing 1 - 20 of 26 Results

research-article
January 2024
Deep Boosting Learning: A Brand-New Cooperative Approach for Image-Text Matching
IEEE Transactions on Image Processing (TIP), Volume 33, Pages 3341–3352, https://doi.org/10.1109/TIP.2024.3396063
Image-text matching remains a challenging task due to heterogeneous semantic diversity across modalities and insufficient distance separability within triplets. Different from previous approaches focusing on enhancing multi-modal representations or ...
Total Citations: 0
research-article
July 2023
High-Performance Transformer Tracking
- Xin Chen,
- Bin Yan,
- Jiawen Zhu,
- Huchuan Lu,
- Xiang Ruan,
- Dong Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 7, Pages 8507–8523, https://doi.org/10.1109/TPAMI.2022.3232535
Correlation has a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple fusion method that considers the similarity between the template and the search region. However, the ...
Total Citations: 2
research-article
April 2023
Transformer vision-language tracking via proxy token guided cross-modal fusion
Pattern Recognition Letters (PTRL), Volume 168, Issue C, Pages 10–16, https://doi.org/10.1016/j.patrec.2023.02.023
Highlights

- Proxy token guided transformer-based baseline for vision-language tracking.
- Dense annotated long-term vision-language tracking dataset.
- Extensive experiments on a new long-term vision-language tracking dataset.

Abstract
Tracking by vision-language is an emergent topic. Previous researchers mainly adopt CNN and sequential models for video and language encoding, however, their methods are limited by poor generalization performance. To address this problem, this ...
Total Citations: 6
research-article
January 2023
Plug-and-Play Regulators for Image-Text Matching
IEEE Transactions on Image Processing (TIP), Volume 32, Pages 2322–2334, https://doi.org/10.1109/TIP.2023.3266887
Exploiting fine-grained correspondence and visual-semantic alignments has shown great potential in image-text matching. Generally, recent approaches first employ a cross-modal attention unit to capture latent region-word interactions, and then integrate ...
Total Citations: 4
research-article
October 2021
Weakly-Supervised Temporal Action Localization via Cross-Stream Collaborative Learning
- Yuan Ji,
- Xu Jia,
- Huchuan Lu,
- Xiang Ruan
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, Pages 853–861, https://doi.org/10.1145/3474085.3475261

Weakly supervised temporal action localization (WTAL) is a challenging task as only video-level category labels are available during training stage. Without precise temporal annotations, most approaches rely on complementary RGB and optical flow ...
Total Citations: 19
Total Downloads: 447
Last 12 Months: 27
Last 6 weeks: 3
Article
August 2020
CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss
Computer Vision – ECCV 2020, Pages 316–331, https://doi.org/10.1007/978-3-030-58558-7_19
Abstract
This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps. In order to find an appropriate embedding space, we ...
Total Citations: 4
research-article
September 2017
Ranking Saliency
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 39, Issue 9, Pages 1892–1904, https://doi.org/10.1109/TPAMI.2016.2609426
Most existing bottom-up algorithms measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead ...
Total Citations: 31
research-article
April 2017
Salient Object Detection via Multiple Instance Learning
IEEE Transactions on Image Processing (TIP), Volume 26, Issue 4, Pages 1911–1922, https://doi.org/10.1109/TIP.2017.2669878

Object proposals are a series of candidate segments containing the objects of interest, which are taken as preprocessing and widely applied in various vision tasks. However, most of existing saliency approaches only utilizes the proposals to compute a ...
Total Citations: 27
research-article
January 2017
Co-Bootstrapping Saliency
IEEE Transactions on Image Processing (TIP), Volume 26, Issue 1, Pages 414–425, https://doi.org/10.1109/TIP.2016.2627804

In this paper, we propose a visual saliency detection algorithm to explore the fusion of various saliency models in a manner of bootstrap learning. First, an original bootstrapping model, which combines both weak and strong saliency models, is ...
Total Citations: 5
research-article
November 2016
Video anomaly detection based on locality sensitive hashing filters
Pattern Recognition (PATT), Volume 59, Issue C, Pages 302–311, https://doi.org/10.1016/j.patcog.2015.11.018

In this paper, we propose a novel anomaly detection approach based on Locality Sensitive Hashing Filters (LSHF), which hashes normal activities into multiple feature buckets with Locality Sensitive Hashing (LSH) functions to filter out abnormal ...
Total Citations: 22
research-article
October 2016
Query Adaptive Instance Search using Object Sketches
MM '16: Proceedings of the 24th ACM international conference on Multimedia, Pages 1306–1315, https://doi.org/10.1145/2964284.2964317

Sketch-based object search is a challenging problem mainly due to two difficulties: (1) how to match the binary sketch query with the colorful image, and (2) how to locate the small object in a big image with the sketch query. To address the above ...
Total Citations: 20
Total Downloads: 259
Last 12 Months: 8
Last 6 weeks: 1
research-article
April 2016
Dense and Sparse Reconstruction Error Based Saliency Descriptor
IEEE Transactions on Image Processing (TIP), Volume 25, Issue 4, Pages 1592–1603, https://doi.org/10.1109/TIP.2016.2524198

In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction error. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models ...
Total Citations: 25
research-article
March 2016
Combining motion and appearance cues for anomaly detection
Pattern Recognition (PATT), Volume 51, Issue C, Pages 443–452, https://doi.org/10.1016/j.patcog.2015.09.005

In this paper, we present a novel anomaly detection framework which integrates motion and appearance cues to detect abnormal objects and behaviors in video. For motion anomaly detection, we employ statistical histograms to model the normal motion ...
Total Citations: 41
research-article
February 2016
Sketch retrieval via local dense stroke features
Image and Vision Computing (IAVC), Volume 46, Issue C, Pages 64–73, https://doi.org/10.1016/j.imavis.2015.11.007

Sketch retrieval aims at retrieving the most similar sketches from a large database based on one hand-drawn query. Successful retrieval hinges on an effective representation of sketch images and an efficient search method. In this paper, we propose a ...
Total Citations: 4
research-article
October 2015
Salient object detection via global and local cues
Pattern Recognition (PATT), Volume 48, Issue 10, Pages 3258–3267, https://doi.org/10.1016/j.patcog.2014.12.005

Previous saliency detection algorithms used to focus on low level features directly or utilize a bunch of sample images and manually labeled ground truth to train a high level learning model. In this paper, we propose a novel coding-based saliency ...
Total Citations: 32
Article
December 2013
Saliency Detection via Dense and Sparse Reconstruction
ICCV '13: Proceedings of the 2013 IEEE International Conference on Computer Vision, Pages 2976–2983, https://doi.org/10.1109/ICCV.2013.370

In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via super pixels as likely cues for background templates, from which dense and sparse appearance ...
Total Citations: 96
Article
November 2013
Living without Menu Bar: A Shape Retrieval Based Word Editor
ACPR '13: Proceedings of the 2013 2nd IAPR Asian Conference on Pattern Recognition, Page 746, https://doi.org/10.1109/ACPR.2013.194

Shape descriptor plays very important role in shape retrieval system especially in the case of input shapes are drawn by hand. A good descriptor should be not only deformation tolerant but also compact and less memory consuming. With this in mind, we ...
Total Citations: 0
Article
June 2013
Saliency Detection via Graph-Based Manifold Ranking
CVPR '13: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Pages 3166–3173, https://doi.org/10.1109/CVPR.2013.407

Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead ...
Total Citations: 260
article
November 2012
Human body segmentation based on deformable models and two-scale superpixel
Pattern Analysis & Applications (PAAS), Volume 15, Issue 4, Pages 399–413, https://doi.org/10.1007/s10044-011-0220-3

In this paper, we propose a novel method to segment human body in static images by graph cuts based on two deformable models at two-scale superpixel. In our study, body segmentation is decomposed into torso detection and lower body recovery. Based on ...
Total Citations: 2
research-article
March 2012
Multilinear Supervised Neighborhood Embedding of a Local Descriptor Tensor for Scene/Object Recognition
IEEE Transactions on Image Processing (TIP), Volume 21, Issue 3, Pages 1314–1326, https://doi.org/10.1109/TIP.2011.2168417

In this paper, we propose to represent an image as a local descriptor tensor and use a multilinear supervised neighborhood embedding (MSNE) for discriminant feature extraction, which is able to be used for subject or scene recognition. The contributions ...
Total Citations: 3
