

Detailed Real-Time Urban 3D Reconstruction from Video

International Journal of Computer Vision, Volume 78, Issue 2-3

Pages 143 - 167

https://doi.org/10.1007/s11263-007-0086-4

Published: 01 July 2008

Abstract

The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
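
As an illustration of the gain compensation idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a purely multiplicative global gain between two frames, estimates it by least squares from the intensities of tracked features, and divides it out before computing a sum-of-absolute-differences (SAD) matching cost. The function names `estimate_global_gain` and `sad_cost`, the sample intensities, and the choice of SAD as the cost are all assumptions made for this example.

```python
def estimate_global_gain(prev_intensities, curr_intensities):
    """Least-squares estimate of one gain ratio beta with curr ~= beta * prev.

    Hypothetical helper: the inputs are intensities of the same tracked
    features sampled in the previous and the current frame.
    """
    if len(prev_intensities) != len(curr_intensities):
        raise ValueError("intensity lists must have the same length")
    numerator = sum(p * c for p, c in zip(prev_intensities, curr_intensities))
    denominator = sum(p * p for p in prev_intensities)
    if denominator == 0:
        raise ValueError("previous intensities are all zero")
    return numerator / denominator


def sad_cost(ref_patch, other_patch, gain=1.0):
    """SAD between two patches after dividing the other patch by its gain."""
    return sum(abs(r - o / gain) for r, o in zip(ref_patch, other_patch))


# Made-up intensities: the current frame is uniformly brighter than the previous one.
prev = [10.0, 20.0, 40.0, 80.0]
curr = [12.0, 24.0, 48.0, 96.0]

beta = estimate_global_gain(prev, curr)
raw_cost = sad_cost(prev, curr)                     # gain change ignored
compensated_cost = sad_cost(prev, curr, gain=beta)  # gain change divided out
print(beta, raw_cost, compensated_cost)
```

In this toy case the uncompensated cost is dominated by the brightness change, while the compensated cost drops to nearly zero. Because the gain is a single scalar per frame, the correction adds only one division per pixel to the matching cost, which is consistent with the abstract's claim that compensation does not affect real-time performance.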




Information

Published In

International Journal of Computer Vision, Volume 78, Issue 2-3

July 2008

160 pages

ISSN: 0920-5691

Copyright © 2008 Springer Science+Business Media, LLC.

Publisher

Kluwer Academic Publishers

United States


Bibliometrics

Total Citations: 111

Total Downloads: 0

Downloads (Last 12 months): 0

Downloads (Last 6 weeks): 0

Reflects downloads up to 12 Jan 2025

Cited By

Kim, G., Youwang, K., Oh, T., Wooldridge, M., Dy, J., & Natarajan, S. (2024). FPRF. In Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence (pp. 2750-2758). Online publication date: 20-Feb-2024. https://dl.acm.org/doi/10.1609/aaai.v38i3.28054

Wu, Y., Shi, L., Liu, H., Liao, H., Qiu, L., Yuan, W., Gu, X., Dong, Z., Cui, S., & Han, X. (2024). MVImgNet2.0: A Larger-scale Dataset of Multi-view Images. ACM Transactions on Graphics, 43(6), 1-16. Online publication date: 19-Dec-2024. https://dl.acm.org/doi/10.1145/3687973

Fu, H., Yu, X., Li, L., & Zhang, L. (2024). CBARF: Cascaded Bundle-Adjusting Neural Radiance Fields From Imperfect Camera Poses. IEEE Transactions on Multimedia, 26, 9304-9315. Online publication date: 15-Apr-2024. https://dl.acm.org/doi/10.1109/TMM.2024.3388929

Cho, S., Huang, J., Nam, J., An, H., Kim, S., & Lee, J. (2024). Local All-Pair Correspondence for Point Tracking. In Computer Vision – ECCV 2024 (pp. 306-325). Online publication date: 29-Sep-2024. https://dl.acm.org/doi/10.1007/978-3-031-72684-2_18

Zhang, K., Bi, S., Tan, H., Xiangli, Y., Zhao, N., Sunkavalli, K., & Xu, Z. (2024). GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. In Computer Vision – ECCV 2024 (pp. 1-19). Online publication date: 29-Sep-2024. https://dl.acm.org/doi/10.1007/978-3-031-72670-5_1

Xu, Q., Kong, W., Tao, W., & Pollefeys, M. (2023). Multi-Scale Geometric Consistency Guided and Planar Prior Assisted Multi-View Stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4), 4945-4963. Online publication date: 1-Apr-2023. https://dl.acm.org/doi/10.1109/TPAMI.2022.3200074

Budde, L., Bulatov, D., Strauss, E., Qiu, K., & Iwaszczuk, D. (2023). Characterization of Out-of-distribution Samples from Uncertainty Maps Using Supervised Machine Learning. In Pattern Recognition (pp. 260-274). Online publication date: 19-Sep-2023. https://dl.acm.org/doi/10.1007/978-3-031-54605-1_17

Tanner, M., Piniés, P., Paz, L., Săftescu, Ş., Bewley, A., Jonasson, E., & Newman, P. (2022). Large-scale outdoor scene reconstruction and correction with vision. International Journal of Robotics Research, 41(6), 637-663. Online publication date: 1-May-2022. https://dl.acm.org/doi/10.1177/0278364920937052

Liu, Y., Cui, R., Xie, K., Gong, M., & Huang, H. (2021). Aerial path planning for online real-time exploration and offline high-quality reconstruction of large-scale urban scenes. ACM Transactions on Graphics, 40(6), 1-16. Online publication date: 10-Dec-2021. https://dl.acm.org/doi/10.1145/3478513.3480491

Zhang, J., Zhu, C., Zheng, L., & Xu, K. (2021). ROSEFusion. ACM Transactions on Graphics, 40(4), 1-17. Online publication date: 19-Jul-2021. https://dl.acm.org/doi/10.1145/3450626.3459676
