Detailed Real-Time Urban 3D Reconstruction from Video

Published: 01 July 2008

Abstract

This paper presents a system for automatic, geo-registered, real-time 3D reconstruction of urban scenes from video. The system collects video streams together with GPS and inertial measurements so that the reconstructed models can be placed in geo-registered coordinates. Every processing step is built on current state-of-the-art real-time modules, and the system reaches real-time performance on commodity graphics hardware and standard CPUs. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to provide the robustness and tolerance to scene variability needed to operate outside the lab. To account for the large dynamic range of outdoor video, the pipeline estimates global camera gain changes in the feature tracking stage and compensates for them efficiently during stereo estimation, without affecting real-time performance. The accuracy required by many applications is achieved with a two-step stereo reconstruction process that exploits the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
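The gain handling mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the estimator or the matching cost, so everything below is an assumption made for illustration: a purely multiplicative gain model, a least-squares estimate of the gain from the intensities of tracked features in two consecutive frames, and a sum-of-absolute-differences (SAD) cost that scales the reference patch by that gain. The function names `estimate_global_gain` and `sad_cost` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def estimate_global_gain(prev_vals: Sequence[float],
                         curr_vals: Sequence[float]) -> float:
    """Least-squares gain beta minimizing sum((curr - beta * prev) ** 2).

    Assumes a multiplicative gain model (an illustrative assumption):
    intensities of the same tracked features in the current frame are
    roughly beta times their intensities in the previous frame.
    """
    num = sum(p * c for p, c in zip(prev_vals, curr_vals))
    den = sum(p * p for p in prev_vals)
    if den == 0.0:
        raise ValueError("previous-frame intensities are all zero")
    return num / den


def sad_cost(ref_patch: Sequence[float],
             other_patch: Sequence[float],
             gain: float = 1.0) -> float:
    """SAD matching cost after scaling the reference patch by `gain`."""
    return sum(abs(gain * r - o) for r, o in zip(ref_patch, other_patch))


# Intensities of the same four tracked features in two consecutive frames;
# the second frame is uniformly 20% brighter.
prev = [50.0, 100.0, 150.0, 200.0]
curr = [60.0, 120.0, 180.0, 240.0]

beta = estimate_global_gain(prev, curr)
uncompensated = sad_cost(prev, curr)           # penalized by the gain change
compensated = sad_cost(prev, curr, gain=beta)  # gain change removed
```

In this toy case the estimated gain is close to 1.2, and the compensated cost drops to approximately zero while the uncompensated cost stays large. Because the gain is a single scalar per frame pair, applying it adds only one multiplication per pixel to the matching cost, which is consistent with the abstract's claim that compensation need not hurt real-time performance.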

Published In

International Journal of Computer Vision, Volume 78, Issue 2-3
July 2008
160 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. 3D reconstruction
  2. Depth map fusion
  3. Large scale modeling
  4. Plane sweeping
  5. Stereo vision
  6. Structure from motion
  7. Urban reconstruction
