Applied computing

Searched The ACM Guide to Computing Literature (3,815,653 records) | Limit your search to The ACM Full-Text Collection (772,220 records)

Showing 1 - 20 of 189 Results

research-article
May 2024
PersonMAE: Person Re-Identification Pre-Training With Masked AutoEncoders
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 10029–10040, https://doi.org/10.1109/TMM.2024.3405649
Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID). We argue that a high-quality ReID representation should have three properties, namely, multi-level awareness, occlusion ...
Total Citations: 0
research-article
May 2024
MuJo-SF: Multimodal Joint Slot Filling for Attribute Value Prediction of E-Commerce Commodities
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 10354–10366, https://doi.org/10.1109/TMM.2024.3407667
Supplementing product attribute information is a critical step for E-commerce platforms, which further benefits various downstream tasks, including product recommendation, product search, and product knowledge graph construction. Intuitively, the visual ...
Total Citations: 0
research-article
May 2024
DanceComposer: Dance-to-Music Generation Using a Progressive Conditional Music Generator
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 10237–10250, https://doi.org/10.1109/TMM.2024.3405734
A wonderful piece of music is the essence and soul of dance, which motivates the study of automatic music generation for dance. To create appropriate music from dance, cross-modal correlations between dance and music such as rhythm and style, should be ...
Total Citations: 0
research-article
May 2024
Difference-Aware Distillation for Semantic Segmentation
- Jianping Gou,
- Xiabin Zhou,
- Lan Du,
- Yibing Zhan,
- Wu Chen,
- Zhang Yi
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 10069–10080, https://doi.org/10.1109/TMM.2024.3405619
In recent years, various distillation methods for semantic segmentation have been proposed. However, these methods typically train the student model to imitate the intermediate features or logits of the teacher model directly, thereby overlooking the high-...
Total Citations: 0
research-article
May 2024
SGDM: An Adaptive Style-Guided Diffusion Model for Personalized Text to Image Generation
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9804–9813, https://doi.org/10.1109/TMM.2024.3399075
The existing personalized text-to-image generation models face issues such as repeated training and insufficient generalization capabilities. We present an adaptive Style-Guided Diffusion Model (SGDM). When provided with a set of stylistically consistent ...
Total Citations: 1
research-article
May 2024
UniDCP: Unifying Multiple Medical Vision-Language Tasks via Dynamic Cross-Modal Learnable Prompts
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9736–9748, https://doi.org/10.1109/TMM.2024.3397191
Medical vision-language pre-training (Med-VLP) models have recently accelerated the fast-growing medical diagnostics application. However, most Med-VLP models learn task-specific representations independently from scratch, thereby leading to great ...
Total Citations: 1
research-article
May 2024
DSIS-DPR: Structured Instance Segmentation and Diffusion Prior Refinement for Dental Anatomy Learning
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9464–9476, https://doi.org/10.1109/TMM.2024.3394777
Instance segmentation in medical imaging plays a crucial role in clinical diagnostic tasks, and have shown promising performance in practical applications. In this article, we discuss a more fine-grained instance segmentation task: dental structured ...
Total Citations: 0
research-article
May 2024
A Category-Aware Curriculum Learning for Data-Free Knowledge Distillation
- Xiufang Li,
- Licheng Jiao,
- Qigong Sun,
- Fang Liu,
- Xu Liu,
- Lingling Li,
- Puhua Chen,
- Shuyuan Yang
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9603–9618, https://doi.org/10.1109/TMM.2024.3395844
Constructing effective proxy data is one of the core challenges in data-free knowledge distillation. The existing models ignore the influence of the category entanglement of the generated data on the distillation. To alleviate this issue, imitating the ...
Total Citations: 0
research-article
April 2024
Music-Driven Choreography Based on Music Feature Clusters and Dynamic Programming
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9330–9341, https://doi.org/10.1109/TMM.2024.3390232
Generating choreography from music poses a significant challenge. Conventional dance generation methods are limited by only being able to match specific dance movements to music with corresponding rhythms, restricting the utilization of existing dance ...
Total Citations: 0
research-article
April 2024
PGCN: Pyramidal Graph Convolutional Network for EEG Emotion Recognition
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9070–9082, https://doi.org/10.1109/TMM.2024.3385676
Emotion recognition is essential in the diagnosis and rehabilitation of various mental diseases. In the last decade, electroencephalogram (EEG)-based emotion recognition has been intensively investigated due to its prominative accuracy and reliability, ...
Total Citations: 0
research-article
April 2024
Deepfake Detection Fighting Against Noisy Label Attack
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9047–9059, https://doi.org/10.1109/TMM.2024.3385286
The face manipulation technique such as Deepfake has been widely used to create realistic faces, which raises growing concerns in the community. Based on the correct labeled data, the current Deepfake detectors are mostly trained on the clean dataset, ...
Total Citations: 1
research-article
March 2024
Cross-Domain Low-Dose CT Image Denoising With Semantic Preservation and Noise Alignment
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 8771–8782, https://doi.org/10.1109/TMM.2024.3382509
Deep learning (DL)-based Low-dose CT (LDCT) image denoising methods may face domain shift problem, where data from different domains (i.e., hospitals) may have similar anatomical regions but exhibit different intrinsic noise characteristics. Therefore, we ...
Total Citations: 0
research-article
May 2024
Self-Similarity Prior Distillation for Unsupervised Remote Physiological Measurement
- Xinyu Zhang,
- Weiyu Sun,
- Hao Lu,
- Ying Chen,
- Yun Ge,
- Xiaolin Huang,
- Jie Yuan,
- Yingcong Chen
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 10290–10305, https://doi.org/10.1109/TMM.2024.3405720
Remote photoplethysmography (rPPG) is a non-invasive technique that aims to capture subtle variations in facial pixels caused by changes in blood volume resulting from cardiac activities. Most existing unsupervised methods for rPPG tasks focus on the ...
Total Citations: 0
research-article
May 2024
TextAdapter: Self-Supervised Domain Adaptation for Cross-Domain Text Recognition
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 9854–9865, https://doi.org/10.1109/TMM.2024.3400669
Text recognition remains challenging, primarily due to the scarcity of annotated real data or the hard labor to annotate large-scale real data. Most existing solutions rely on synthetic training data, where the synthetic-to-real domain gaps limit the ...
Total Citations: 0
research-article
April 2024
Context-Guided Black-Box Attack for Visual Tracking
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 8824–8835, https://doi.org/10.1109/TMM.2024.3382473
With the recent advancement of deep neural networks, visual tracking has achieved substantial progress in tracking accuracy. However, the robustness and security of tracking methods developed based on current deep models have not been thoroughly explored, ...
Total Citations: 0
research-article
March 2024
TTS: Hilbert Transform-Based Generative Adversarial Network for Tattoo and Scene Text Spotting
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 8226–8241, https://doi.org/10.1109/TMM.2024.3378458
Text spotting in natural scenes is of increasing interest and significance due to its critical role in several applications, such as visual question answering, named entity recognition and event rumor detection on social media. One of the newly emerging ...
Total Citations: 0
research-article
October 2023
Multi-Task Paired Masking With Alignment Modeling for Medical Vision-Language Pre-Training
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 4706–4721, https://doi.org/10.1109/TMM.2023.3325965
In recent years, the growing demand for medical imaging diagnosis has placed a significant burden on radiologists. As a solution, Medical Vision-Language Pre-training (Med-VLP) methods have been proposed to learn universal representations from medical ...
Total Citations: 1
research-article
October 2023
The Beauty of Repetition: An Algorithmic Composition Model With Motif-Level Repetition Generator and Outline-to-Music Generator in Symbolic Music Generation
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 4320–4333, https://doi.org/10.1109/TMM.2023.3321495
Most musical compositions utilize repetition as a fundamental element to create captivating aesthetic experiences. However, the potential of repetition in machine-learning-based algorithmic composition has not been thoroughly investigated. This article ...
Total Citations: 1
research-article
September 2023
Reversible Data Hiding-Based Contrast Enhancement With Multi-Group Stretching for ROI of Medical Image
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 3909–3923, https://doi.org/10.1109/TMM.2023.3318048
Reversible data hiding-based contrast enhancement (RDHCE) can be used in contrast enhancement for medical images, and it has been a popular research topic in recent years. However, the existing RDHCE methods suffer from the problem of inaccurate ...
Total Citations: 0
research-article
September 2023
Distortion-Aware Self-Supervised Indoor 360° Depth Estimation via Hybrid Projection Fusion and Structural Regularities
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 3998–4011, https://doi.org/10.1109/TMM.2023.3318470
Owing to the rapid development of emerging 360° panoramic imaging techniques, indoor 360° ...
Total Citations: 0
