Zijia Zhao
2020 – today
2024
- [c7] Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu:
  Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions. ACL (Findings) 2024: 762-776
- [c6] Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu:
  SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models. CVPR 2024: 13073-13083
- [c5] Shichen Lu, Longteng Guo, Wenxuan Wang, Zijia Zhao, Tongtian Yue, Jing Liu, Si Liu:
  Collaborative Training of Tiny-Large Vision Language Models. ACM Multimedia 2024: 4928-4937
- [i13] Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu:
  Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions. CoRR abs/2402.11265 (2024)
- [i12] Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu:
  SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models. CoRR abs/2403.13263 (2024)
- [i11] Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu:
  VL-Mamba: Exploring State Space Models for Multimodal Learning. CoRR abs/2403.13600 (2024)
- [i10] Zijia Zhao, Haoyu Lu, Yuqi Huo, Yifan Du, Tongtian Yue, Longteng Guo, Bingning Wang, Weipeng Chen, Jing Liu:
  Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs. CoRR abs/2406.09367 (2024)
- [i9] Yifan Du, Kun Zhou, Yuqi Huo, Yifan Li, Wayne Xin Zhao, Haoyu Lu, Zijia Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen:
  Towards Event-oriented Long Video Understanding. CoRR abs/2406.14129 (2024)
- [i8] Erdong Hu, Longteng Guo, Tongtian Yue, Zijia Zhao, Shuning Xue, Jing Liu:
  OneDiff: A Generalist Model for Image Difference Captioning. CoRR abs/2407.05645 (2024)
- [i7] Yifan Du, Yuqi Huo, Kun Zhou, Zijia Zhao, Haoyu Lu, Han Huang, Wayne Xin Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen:
  Exploring the Design Space of Visual Context Representation in Video MLLMs. CoRR abs/2410.13694 (2024)
- [i6] Han Huang, Yuqi Huo, Zijia Zhao, Haoyu Lu, Shu Wu, Bingning Wang, Qiang Liu, Weipeng Chen, Liang Wang:
  Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining. CoRR abs/2410.16166 (2024)
- [i5] Zijia Zhao, Longteng Guo, Tongtian Yue, Erdong Hu, Shuai Shao, Zehuan Yuan, Hua Huang, Jing Liu:
  ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval. CoRR abs/2410.18715 (2024)

2023
- [j1] Liang Zhao, Zijia Zhao, Enchao Zhang, Ammar Hawbani, Ahmed Yassin Al-Dubai, Zhiyuan Tan, Amir Hussain:
  A Digital Twin-Assisted Intelligent Partial Offloading Approach for Vehicular Edge Computing. IEEE J. Sel. Areas Commun. 41(11): 3386-3400 (2023)
- [c4] Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu:
  VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset. NeurIPS 2023
- [c3] Wenbo Jia, Zijia Zhao, Wenzhuo Huang, Yangyang Li, Jie Ling, Bai Chen, Yayi Shen:
  Snake-inspired Swarm Robot Design for Distributed Underwater Search and Rescue. ROBIO 2023: 1-6
- [c2] Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu:
  MAMO: Fine-Grained Vision-Language Representations Learning with Masked Multimodal Modeling. SIGIR 2023: 1528-1538
- [i4] Zijia Zhao, Longteng Guo, Tongtian Yue, Sihan Chen, Shuai Shao, Xinxin Zhu, Zehuan Yuan, Jing Liu:
  ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst. CoRR abs/2305.16103 (2023)
- [i3] Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu:
  VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset. CoRR abs/2305.18500 (2023)

2022
- [i2] Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu:
  MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning. CoRR abs/2210.04183 (2022)

2021
- [c1] Sihan Chen, Xinxin Zhu, Dongze Hao, Wei Liu, Jiawei Liu, Zijia Zhao, Longteng Guo, Jing Liu:
  MM21 Pre-training for Video Understanding Challenge: Video Captioning with Pretraining Techniques. ACM Multimedia 2021: 4853-4857
- [i1] Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang:
  OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation. CoRR abs/2107.00249 (2021)
last updated on 2024-11-28 20:24 CET by the dblp team
all metadata released as open data under CC0 1.0 license