Abstract
A question that has long troubled the field of artificial intelligence is whether AI can be creative, or, put differently, whether an algorithm's reasoning process can itself be creative. This paper examines the problem of AI creativity from the perspective of thinking science. We first survey related research on reasoning with visual (imaginal) thinking; we then focus on a particular form of visual knowledge representation, the visual scene graph; finally, we discuss in detail how visual scene graphs are constructed and what their potential applications are. All the evidence suggests that visual knowledge and visual thinking can not only improve the performance of current AI tasks but also be put into practice for machine creativity.
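For readers unfamiliar with the representation, the sketch below illustrates the common formulation of a visual scene graph (as in Visual Genome): nodes are detected objects and directed, labeled edges are subject-predicate-object relations. This is a minimal illustration, not the authors' implementation; all names here (ObjectNode, SceneGraph, add_relation, triples) are hypothetical.

```python
# A minimal sketch of a visual scene graph: objects as nodes,
# subject-predicate-object relations as directed labeled edges.
# All class and function names are illustrative, not from the paper.
from dataclasses import dataclass, field


@dataclass
class ObjectNode:
    """A detected object: category name plus bounding box (x, y, w, h)."""
    name: str
    bbox: tuple[float, float, float, float]


@dataclass
class SceneGraph:
    """A set of objects and directed, labeled relations between them."""
    objects: list[ObjectNode] = field(default_factory=list)
    relations: list[tuple[int, str, int]] = field(default_factory=list)

    def add_relation(self, subj: int, predicate: str, obj: int) -> None:
        # subj and obj are indices into self.objects.
        self.relations.append((subj, predicate, obj))

    def triples(self) -> list[tuple[str, str, str]]:
        """Render relations as human-readable (subject, predicate, object) triples."""
        return [(self.objects[s].name, p, self.objects[o].name)
                for s, p, o in self.relations]


if __name__ == "__main__":
    g = SceneGraph()
    g.objects = [ObjectNode("person", (10, 20, 50, 120)),
                 ObjectNode("horse", (70, 40, 90, 80))]
    g.add_relation(0, "riding", 1)
    print(g.triples())  # [('person', 'riding', 'horse')]
```

Downstream tasks mentioned in the abstract, such as scene-graph-to-image generation or visual question answering, would consume exactly this kind of structure, typically with learned embeddings attached to each node and edge.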
Contributions
Yueting ZHUANG provided the main idea and outlined the manuscript. Siliang TANG drafted the manuscript. Yueting ZHUANG and Siliang TANG revised and finalized the paper.
Ethics declarations
Yueting ZHUANG and Siliang TANG declare that they have no conflict of interest.