Abstract
This article addresses the problem of creating interactive mixed reality applications in which virtual objects interact with images of real-world scenarios. This is relevant for games and for architectural or space-planning applications that interact with visual elements in the images, such as walls, floors and empty spaces. The scenarios are intended to be captured by users with regular cameras or taken from existing photographs. Introducing virtual objects into photographs presents several challenges, such as pose estimation and the creation of a visually correct interaction between virtual objects and the boundaries of the scene. The article addresses two main research questions: whether it is feasible to create interactive augmented reality (AR) applications in which virtual objects interact with a real-world scenario using high-level features detected in the image, and whether untrained users are capable of, and motivated enough to perform, the AR initialization steps. The proposed system detects the scene automatically from an image, with additional features obtained from basic user annotations. This operation is kept simple to accommodate the needs of non-expert users. The system analyzes one or more photos captured by the user and detects high-level features such as vanishing points, floor and scene orientation. Using these features, it is possible to create mixed and augmented reality applications in which the user interactively introduces virtual objects that blend with the picture in real time and respond to the physical environment. To validate the solution, several system tests are described and compared using available external image datasets.
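As a rough illustration of one high-level feature mentioned above, the sketch below estimates a vanishing point as the intersection of two image line segments, using homogeneous coordinates. It is a minimal example under stated assumptions, not the detection method used in the article: the function names (`line_through`, `vanishing_point`) and the two-segment input are hypothetical, and a real system would typically combine many detected segments with a robust estimator.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(seg: Segment) -> Vec3:
    """Homogeneous line (a, b, c), with ax + by + c = 0, through a segment's endpoints."""
    (x1, y1), (x2, y2) = seg
    return cross((x1, y1, 1.0), (x2, y2, 1.0))


def vanishing_point(s1: Segment, s2: Segment, eps: float = 1e-9) -> Optional[Point]:
    """Intersect the lines supporting two segments (hypothetical helper).

    Returns None when the lines are parallel in the image, which means
    the vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(s1), line_through(s2))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Two floor edges that converge towards the right of the image.
left_edge = ((0.0, 0.0), (2.0, 1.0))
right_edge = ((0.0, 4.0), (2.0, 3.0))
vp = vanishing_point(left_edge, right_edge)
```

With these two example segments the supporting lines are y = x/2 and y = 4 - x/2, so they meet at (4, 2). Segments that are parallel in the image yield `None`.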
Notes
York Urban Line Segment Database, http://www.elderlab.yorku.ca/YorkUrbanDB/.
Flickr, Photo sharing website, http://www.flickr.com.
Eye Pet, Sony PlayStation camera game, http://www.eyepet.com/.
Atelier Pfister, smartphone application for furniture design, http://www.atelierpfister.ch/app.
Layar, GPS augmented reality application, http://www.layar.com/.
Google Project Tango, http://www.google.com/atap/projecttango/.
Ikea 2014 catalogue, http://www.ikea.com/ms/en_AA/customer_service/catalogue/catalogue_2014.html.
Acknowledgments
The authors would like to thank everyone at IMG and CITI for their support. This work was funded by the Portuguese Science and Technology Foundation, FCT/MEC, through grants SFRH/BD/47511/2008 and PEst-OE/EEI/UI0527/2011 (CITI/FCT/UNL, now NOVA-LINCS), and through the MAT Project. The Media Arts and Technologies project (MAT), NORTE-07-0124-FEDER-000061, is financed by the North Portugal Regional Operational Programme (ON.2 O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, Fundação para a Ciência e a Tecnologia (FCT). The authors also thank the Project I-City for Future Mobility (NORTE-07-0124-FEDER-000064) and the European Project FP7 Future Cities (FP7-REGPOT-2012-2013-1).
Cite this article
Nóbrega, R., Correia, N. Interactive 3D content insertion in images for multimedia applications. Multimed Tools Appl 76, 163–197 (2017). https://doi.org/10.1007/s11042-015-3031-5
DOI: https://doi.org/10.1007/s11042-015-3031-5