Research article | Open access

DeepFormableTag: end-to-end generation and recognition of deformable fiducial markers

Published: 19 July 2021

Abstract

Fiducial markers have been broadly used to identify objects or embed messages that can be detected by a camera. Existing detection methods primarily assume that markers are printed on ideally planar surfaces. The size of a message or identification code is limited by the spatial resolution of the binary patterns in a marker, and markers often fail to be recognized due to imaging artifacts such as optical/perspective distortion and motion blur. To overcome these limitations, we propose a novel deformable fiducial marker system that consists of three main parts. First, a fiducial marker generator creates a set of free-form color patterns that encode a significantly larger amount of information in unique visual codes. Second, a differentiable image simulator renders a training dataset of photorealistic scene images containing the deformed markers; the rendered images include realistic shading with specular reflection, optical distortion, defocus and motion blur, color alteration, imaging noise, and shape deformation of the markers. Lastly, a trained marker detector seeks regions of interest and recognizes multiple marker patterns simultaneously via an inverse deformation transformation. The marker generator and detector networks are jointly optimized through the differentiable photorealistic renderer in an end-to-end manner, allowing us to robustly recognize a wide range of deformable markers with high accuracy. Our deformable marker system successfully decodes 36-bit messages at ~29 fps under severe shape deformation, and results validate that it significantly outperforms traditional and data-driven marker methods.
Our learning-based marker system opens up interesting new applications of fiducial markers, including cost-effective motion capture of the human body, active 3D scanning using arrays of our fiducial markers as structured light patterns, and robust augmented reality rendering of virtual objects on dynamic surfaces.
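At a high level, the abstract describes a three-stage pipeline (marker generator, differentiable imaging simulator, detector/decoder) trained end-to-end. The toy sketch below illustrates only that structure, not the paper's method: the block-pattern encoding, the gain-plus-noise imaging model, and all function names are our illustrative assumptions, and nothing here is learned or differentiable.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_marker(bits, size=6):
    """Map a 36-bit message to a color-free toy pattern.
    (The paper instead learns free-form color patterns.)"""
    pattern = np.repeat(bits, size * size // len(bits)).astype(float)
    return pattern.reshape(size, size)

def simulate_imaging(marker, noise=0.05):
    """Stand-in for the differentiable renderer: apply a random
    shading-like gain and sensor noise (no deformation or blur)."""
    gain = 0.8 + 0.4 * rng.random()
    return gain * marker + noise * rng.standard_normal(marker.shape)

def decode(image, n_bits=36):
    """Stand-in detector: threshold per-block averages back to bits."""
    blocks = image.flatten().reshape(n_bits, -1).mean(axis=1)
    return (blocks > blocks.mean()).astype(int)

bits = rng.integers(0, 2, 36)
recovered = decode(simulate_imaging(generate_marker(bits)))
accuracy = (bits == recovered).mean()
print(accuracy)
```

In the actual system the encode/render/decode steps are all differentiable, so the decoding loss backpropagates through the simulator into the generator; this toy version only shows where those three stages sit relative to each other.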

Supplementary Material

  • VTT File: 3450626.3459762.vtt
  • ZIP File: a67-yaldiz.zip
  • MP4 File: a67-yaldiz.mp4
  • MP4 File: 3450626.3459762.mp4 (presentation)



Published In

ACM Transactions on Graphics, Volume 40, Issue 4 (August 2021), 2170 pages.
ISSN: 0730-0301 | EISSN: 1557-7368 | DOI: 10.1145/3450626
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. fiducial marker system
  3. object detection
  4. tracking


Funding Sources

  • Samsung Research Funding Center of Samsung Electronics
  • Korea NRF grant
  • MSRA
  • MSIT/IITP of Korea

Article Metrics

  • Downloads (last 12 months): 171
  • Downloads (last 6 weeks): 27
Reflects downloads up to 01 Mar 2025


Cited By

  • CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants. IEEE Transactions on Visualization and Computer Graphics 30(12), 7486-7499 (Dec 2024). DOI: 10.1109/TVCG.2024.3350901
  • YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers. 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 311-316 (Aug 2024). DOI: 10.1109/RO-MAN60168.2024.10731319
  • Uncovering the Metaverse within Everyday Environments: A Coarse-to-Fine Approach. 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC), 499-509 (Jul 2024). DOI: 10.1109/COMPSAC61105.2024.00074
  • Fiducial Objects: Custom Design and Evaluation. Sensors 23(24), 9649 (Dec 2023). DOI: 10.3390/s23249649
  • Soft Tissue Monitoring of the Surgical Field: Detection and Tracking of Breast Surface Deformations. IEEE Transactions on Biomedical Engineering 70(7), 2002-2012 (Jul 2023). DOI: 10.1109/TBME.2022.3233909
  • Neural Lens Modeling. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 8435-8445 (Jun 2023). DOI: 10.1109/CVPR52729.2023.00815
  • Multiple Projector Camera Calibration by Fiducial Marker Detection. IEEE Access 11, 78945-78955 (2023). DOI: 10.1109/ACCESS.2023.3299857
  • NeuralMarker. ACM Transactions on Graphics 41(6), 1-10 (Nov 2022). DOI: 10.1145/3550454.3555468
  • InfraredTags: Embedding Invisible AR Markers and Barcodes Using Low-Cost, Infrared-Based 3D Printing and Imaging Tools. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 1-12 (Apr 2022). DOI: 10.1145/3491102.3501951
  • Connecting Everyday Objects with the Metaverse: A Unified Recognition Framework. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 401-406 (Jun 2022). DOI: 10.1109/COMPSAC54236.2022.00063
